Archive for February, 2008

Film Reviews

Friday, February 29th, 2008

It’s film binge time again, so here are some brief reviews.

The Darjeeling Limited

Excellent. If you like Wes Anderson – and he basically just makes different versions of the same film – you’ll love it. I’m not an automatic fan – I love some of his films, but others never appealed. But I like this one a lot. The cinematography is just fantastic, and if Owen Wilson isn’t the best looking guy in the world I don’t know who is. 9/10

Rambo 4

Fucking awful in every way. Unwatchably bad. Ludicrous plot made somehow even more annoying by its attempt at being heartwarming. Even the action sucks. Just fucked, don’t waste your time. 2/10

There Will Be Blood

Great. More fantastic cinematography, excellent acting, especially from the lead, and a hypnotising soundtrack. Not the typical type of plot – more like an account, a record answering the question “how did this lead to this” – but a brilliantly executed one. Definitely worth your time if you can handle it. 8/10

No End In Sight

A decent documentary on the incompetence, hubris and arrogance leading to the USA’s spectacular balls-up of the Iraq war. Not bad but I was uncomfortable watching it – despite my wholehearted agreement with pretty much everything in the film, I didn’t appreciate the air of cherry picking and foregone conclusions the film has. I’d like to see Rumsfeld charged with war crimes as much as anyone, but deliberately interspersing B-roll footage of him laughing at jokes with harrowing footage of death and destruction is a clumsy technique and will only reduce the film’s credibility in the eyes of those who need to see it most.

Furthermore, the film is already out of date – America’s “surge”, otherwise known as “bringing the troop levels up closer to what they should have been from day one”, has actually been very successful, and the whole Iraq adventure has slowly been dragging itself out of its initial disgrace.

And one wonders whether it’s too little, too late – who the audience is, and what the film is trying to say. The US military has learned its lesson by now. Bush is going in a few months no matter what. McCain, the new Republican candidate, was strongly critical of the war’s management.

Worth watching, but. 6/10

Taxi To The Dark Side

Another war documentary, this time on the topic of “enemy combatant” detentions. Very disturbing, although if you follow the issue you know it all already. Much more worrying than No End In Sight. 7/10

mongrel vs. thin

Thursday, February 28th, 2008

So, we’ve got a promising new web server in the Ruby world – thin. Fantastic news, and it’s using some excellent libraries – the brilliant eventmachine, and the ragel HTTP parser from mongrel (i.e. the only good thing about mongrel) – both of which I am using in other projects. Very promising – the code looks well designed, maintainable and clean. Unlike mongrel.

So, there’s only one thing we care about with web servers. What’s the performance?

Here are some stats on an actual serious Rails app doing actual work with an actual database and everything. Production mode, and I gave each server a couple of “warm-up” runs before the run copied below. I’m just going to include the whole output.

$ ab -n 1000 -c 50 http://0.0.0.0:3000/
 
Server Software:        Mongrel
Server Hostname:        0.0.0.0
Server Port:            3000
 
Document Path:          /
Document Length:        1418 bytes
 
Concurrency Level:      50
Time taken for tests:   18.633091 seconds
Complete requests:      1000
Failed requests:        0
Write errors:           0
Total transferred:      1779000 bytes
HTML transferred:       1418000 bytes
Requests per second:    53.67 [#/sec] (mean)
Time per request:       931.655 [ms] (mean)
Time per request:       18.633 [ms] (mean, across all concurrent requests)
Transfer rate:          93.22 [Kbytes/sec] received
 
Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.9      0      10
Processing:    23  909 130.8    927    1763
Waiting:       19  908 131.1    927    1763
Total:         23  909 130.1    927    1763
 
Percentage of the requests served within a certain time (ms)
  50%    927
  66%    936
  75%    944
  80%    949
  90%    973
  95%   1061
  98%   1067
  99%   1069
 100%   1763 (longest request)
 
Server Software:        thin
Server Hostname:        0.0.0.0
Server Port:            3000
 
Document Path:          /
Document Length:        1418 bytes
 
Concurrency Level:      50
Time taken for tests:   18.120868 seconds
Complete requests:      1000
Failed requests:        0
Write errors:           0
Total transferred:      1746000 bytes
HTML transferred:       1418000 bytes
Requests per second:    55.18 [#/sec] (mean)
Time per request:       906.043 [ms] (mean)
Time per request:       18.121 [ms] (mean, across all concurrent requests)
Transfer rate:          94.09 [Kbytes/sec] received
 
Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    1   2.7      1      31
Processing:   683  893 123.6    881    1710
Waiting:      605  721 120.3    718    1473
Total:        683  894 123.7    882    1711
 
Percentage of the requests served within a certain time (ms)
  50%    882
  66%    888
  75%    902
  80%    908
  90%    961
  95%   1002
  98%   1474
  99%   1710
 100%   1711 (longest request)

Hm, nothing much in that – an insignificant improvement. Basically, it’s Rails slowness we’re measuring here. Let’s cut most of that out of the picture and go straight for a “pulse” controller, which returns nothing but the two-byte string “OK” to show the app is actually running. I implemented it so monit can monitor the app without hitting the front page every few seconds.
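For the record, a controller like that is about as minimal as Rails gets – something along these lines (an illustrative sketch only; the names here aren’t necessarily exactly what I use):

# app/controllers/api_controller.rb -- illustrative sketch
class ApiController < ApplicationController
  # GET /api/pulse -- no DB work, no templates, just proof of life
  def pulse
    render :text => "OK"
  end
end

# config/routes.rb
map.connect 'api/pulse', :controller => 'api', :action => 'pulse'

Let’s take a look: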

Server Software:        Mongrel
Server Hostname:        0.0.0.0
Server Port:            3000
 
Document Path:          /api/pulse
Document Length:        2 bytes
 
Concurrency Level:      50
Time taken for tests:   8.405170 seconds
Complete requests:      1000
Failed requests:        0
Write errors:           0
Total transferred:      269000 bytes
HTML transferred:       2000 bytes
Requests per second:    118.97 [#/sec] (mean)
Time per request:       420.259 [ms] (mean)
Time per request:       8.405 [ms] (mean, across all concurrent requests)
Transfer rate:          31.17 [Kbytes/sec] received
 
Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.6      0       4
Processing:    14  412  72.4    399     810
Waiting:       13  411  72.3    398     808
Total:         14  412  72.0    399     810
 
Percentage of the requests served within a certain time (ms)
  50%    399
  66%    470
  75%    477
  80%    480
  90%    486
  95%    489
  98%    491
  99%    493
 100%    810 (longest request)
 
 
Server Software:        thin
Server Hostname:        0.0.0.0
Server Port:            3000
 
Document Path:          /api/pulse
Document Length:        2 bytes
 
Concurrency Level:      50
Time taken for tests:   6.65994 seconds
Complete requests:      1000
Failed requests:        0
Write errors:           0
Total transferred:      236000 bytes
HTML transferred:       2000 bytes
Requests per second:    164.85 [#/sec] (mean)
Time per request:       303.300 [ms] (mean)
Time per request:       6.066 [ms] (mean, across all concurrent requests)
Transfer rate:          37.92 [Kbytes/sec] received
 
Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   1.2      1      10
Processing:   200  298  49.2    278     460
Waiting:       54  239  47.5    224     404
Total:        202  299  49.0    278     460
 
Percentage of the requests served within a certain time (ms)
  50%    278
  66%    294
  75%    354
  80%    356
  90%    362
  95%    372
  98%    457
  99%    460
 100%    460 (longest request)

That’s much more of an improvement.

I’ve tested it at some length now and have encountered no stability problems – not that it takes much to beat The CrashMaster™ mongrel. In fact, I have no love for mongrel at all (see my recent posts on hacking it so it doesn’t refuse to launch upon encountering its own PID files from previous crashes), so I’m switching to thin, effective immediately. I’ll let you know how it goes!

UPDATE: Yet another potential competitor has emerged – ebb, which appears to be even faster than Thin in (artificial) benchmarks. However, it’s definitely not ready for prime time – it ran an order of magnitude slower than Thin in my initial testing, and I eventually had to force quit it after it became unresponsive. Still, those benchmarks paint a pretty promising picture of what a C implementation can do. Here are my (bad) results anyway.

$ ebb_rails start -e production
$ ab -n 100 -c 5 http://0.0.0.0:3000/api/pulse # i reduced the numbers
 
Server Software:        
Server Hostname:        0.0.0.0
Server Port:            3000
 
Document Path:          /api/pulse
Document Length:        2 bytes
 
Concurrency Level:      5
Time taken for tests:   49.116267 seconds
Complete requests:      100
Failed requests:        0
Write errors:           0
Total transferred:      18800 bytes
HTML transferred:       200 bytes
Requests per second:    2.04 [#/sec] (mean)
Time per request:       2455.813 [ms] (mean)
Time per request:       491.163 [ms] (mean, across all concurrent requests)
Transfer rate:          0.37 [Kbytes/sec] received
 
Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.8      0       8
Processing:   442 2394 552.0   2379    3699
Waiting:      442 2394 552.1   2378    3699
Total:        442 2394 552.1   2379    3699
 
Percentage of the requests served within a certain time (ms)
  50%   2379
  66%   2661
  75%   2795
  80%   2835
  90%   2997
  95%   3481
  98%   3665
  99%   3699
 100%   3699 (longest request)

Obviously something is horribly wrong with that result.

Song of the Moment

Tuesday, February 26th, 2008

Sonne, by Rammstein. One needs a heavy metal soundtrack to one’s detailed fantasies about walking through the Macintosh Business Unit and shooting everyone in there to pieces, and what better heavy metal is there than Rammstein?

Another old rip, by the way, continuing the recent trend.

Yet another comment on an MS blog

Tuesday, February 26th, 2008

Microsoft always silently deletes my stream-of-pure-hate comments on their blogs, so I typically copy them and paste them here.

Dear assholes,

OK, OK – I get it. Microsoft’s strategy is to release crashy, buggy Mac software years late with half the capabilities of the Windows version. I understand you want to make the Mac look bad, I really do. I hate the practice but I know why you do it – cold hard cash. I guess I’d have difficulty saying no to all those billions of dollars, too.

But geeze, I think you’re overdoing it a little these days. I installed the student edition for my friend. She uses it to start editing a document. Being a girl, and using a Mac, she doesn’t save obsessively like me. An hour into it, she finally saves. Office 2008 crashes and loses her work. I search for some kind of autosave file. There’s nothing. An hour of effort, and a lot of peace of mind, gone.

This shameful garbage should never have been released, and you’re the worst programmers in the world. I know I know – it’s (probably) deliberate. But I wish you’d understand that you hurt normal people, waste their time, lose their work, in your pursuit of that one more Windows sale – and boy, I hate you – all of you.

Right, I know this isn’t going to get past moderation, but I just wanted to tell one of you that. Seriously – I loathe you. You’re scum. Rich scum maybe, but scum all the same.

Either you’re unbelievably incompetent or just malicious. Either way, the world would be better off without you.

love,

Sho

I wonder if these fantasies of mass murder I always seem to have while contemplating Microsoft’s sins against me through the years are normal?

Hm, no actual death threats in this comment – am I losing my touch?

UPDATE: More MS-bashing:

Office 2008 is the product of four years of our lives

4 years of your lives? Maybe for the next 4 years you could enrol in a college course and learn how to program fucking computers properly.

It’s surprisingly cathartic, leaving nasty comments on blogs!

Australian Politics

Friday, February 22nd, 2008

Man, I love Australian politics. I don’t know any other parliament in the world where the opposition would do something like this:

Kevin Rudd Cardboard Edition

That’s a life-size cardboard cut-out of the current PM of Australia. Lol.

Comic Life

Thursday, February 21st, 2008

After playing with the excellent comic layout program Comic Life for 10 minutes, I felt the artwork I had created was too precious to be denied to this haunted, aching world, so here it is:

Behold my masterwork, peon, and weep

Farewell, XServe RAID, I never bought thee

Wednesday, February 20th, 2008

Damn! Apple’s canned the XServe RAID. I’d always wanted one of them, but could never quite afford it. It’s been looking a little long in the tooth lately, but I’d hoped they would ship an updated one with iSCSI and ZFS support to replace it… Guess not.

Sayonara, XServe RAID :’(

XServe RAID

Such a long time

Monday, February 18th, 2008

Sigh, looking at creation dates of files is making me feel old:

Can't Stop Fallin' in Love

SoundJam, for those who don’t know, was the software Apple bought and turned into iTunes later in 2001.

More than seven fucking years ago! And that’s not the earliest by a long shot – I have mp3 files going back to 1996, and earlier file formats before that.

Sigh!

StrokeDB

Tuesday, February 12th, 2008

Another competitor to the exciting CouchDB project has emerged – and this time it’s in pure Ruby, not god damn Erlang, so it’s very interesting to me. Check it out here.

By the way, another project I’ve talked about before, ThingFish, has been through countless revisions and code growth – there’s not a day that goes by when I’m not treated to 50 or so changed files in response to my svn up. But the focus of the project seems to have changed from being a document-centric database to being some kind of network file store. None of it works, as far as I can tell, and I have no idea what they are doing… ThingFish developers: what on earth is your project for?

Anyway, exciting times for this class of database, which I strongly believe is the future of large-scale web apps.

hacking native UUID support into Schema:Dump

Monday, February 11th, 2008

Want to use PostgreSQL’s native UUID datatype but AR won’t let you use it with migrations?

/Library/Ruby/Gems/1.8/gems/activerecord-2.0.2/lib/active_record/connection_adapters/postgresql_adapter.rb:

# insert a new branch into the case statement inside simplified_type:
def simplified_type(field_type)
  ...
  # UUID type
  when /^uuid$/
    :uuid
  ...

# and insert a new entry into the hash returned by native_database_types:
def native_database_types
  ...
  :uuid      => { :name => "uuid" },
  ...

Well, that’ll get your data OUT of the database, but AR will throw a fit when you try to load it back in unless you also add uuid into the range of column types TableDefinition will accept:

in /Library/Ruby/Gems/1.8/gems/activerecord-2.0.2/lib/active_record/connection_adapters/abstract/schema_definition.rb:

# add uuid to the %w() list of column types (it’s what generates the
# t.string / t.integer / etc. helper methods, and lives near def column):
def column(name, type, options = {})

%w( string text integer float decimal datetime timestamp time date binary boolean uuid ).each do |column_type|

Now you can do this:

    t.uuid     "uuid",   :null => false

It’s about the nastiest possible hack you can do, but it works for both dumping and loading. Here’s a patch if you don’t want to do it yourself, but no guarantees.

UPDATE:

And don’t forget to write your migrations like this to stop AR from inserting its “helpful” id columns with autoincrementing serials which your DB doesn’t need and can’t use:

  def self.up
    create_table :transactions, :id => false do |t|
      t.uuid     "id",  :null => false
      t.timestamps
    end
  end

UPDATE 2:

I now do not recommend doing this. It’s more trouble than it’s worth. There is very little you gain in forcing native UUID type in Postgres, and the complexity, hacks, loss of cross-platform compatibility and general annoyance you face are just not worth it.

Just use a string column for any UUIDs. Of course, the final hint on this page – the :id => false switch for migrations – is still useful and you should use that.
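In other words, something like this is all you need – a sketch only, with an illustrative table name (compare it with the native-uuid version above):

  def self.up
    create_table :transactions, :id => false do |t|
      t.string   "id", :limit => 36, :null => false   # UUID stored as a plain string
      t.timestamps
    end
    add_index :transactions, :id, :unique => true     # still want fast, unique lookups
  end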

Strict databases are great for discipline

Friday, February 8th, 2008

Oh, boy. Nothing exposes one’s shitty programming habits like using a new, stricter database like PostgreSQL. All over the place I’m discovering, and fixing, code in which I’d demonstrated a lax attitude to data integrity – from trying to insert invalid data into columns I’d previously specified should not be null (datetimes, mostly) to declaring booleans with 1 or 0 instead of true and false.

It’s annoying, sure, but it’s also quite satisfying. A database that won’t take any shit is great for your programming discipline and must lead to better, more reliable code. I really can’t believe how much MySQL allows you to get away with, and I’m very glad my eyes were opened to the problems with the data I was storing before I had to manually repair million-line tables in production.

I have discovered a few annoyances with PostgreSQL, though – just to be fair. Its sequence system is pretty silly – after importing data you then have to go and explicitly set new sequence values for any autoincrement fields (PgSQL calls them “serials”). A useful facility, but I think the default should just be to advance the sequence on import. I have of course written a script to automate this, but still.
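If you want to roll your own, the whole job is just a setval() per serial once the data is in – a rough sketch only (it assumes the default tablename_id_seq naming, and simply skips tables without a serial id):

# bump every id sequence up to MAX(id) after a bulk import
conn = ActiveRecord::Base.connection
conn.tables.each do |table|
  begin
    conn.execute("SELECT setval('#{table}_id_seq', (SELECT COALESCE(MAX(id), 1) FROM #{table}))")
  rescue ActiveRecord::StatementInvalid
    # no id serial on this table -- nothing to reset
  end
end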

Another complaint regards the security, which if anything seems *too* strict. When you use schemas to share logical databases between multiple users, any user but the owner of the “foreign” schema must have privileges granted explicitly not only on every table they plan to use in that schema, but on the aforementioned sequences too! I can understand the first, kind of, although there should be an option to grant privileges at the schema level, but the second is just silly – if you have rights to add new rows to a table, it should be implied that you also have rights to increment its sequence. A needless bit of complexity.

That said, I’m overall delighted with the migration and everything was up and running fine. It isn’t right now, though, since I decided to completely excise my nasty multi-database hacks and simplify the data structures, removing all potential conflicts and separating tables into their logical homes. I’m about half way through doing that. And again, I’m really happy that I’m doing this now – what may take a day or two with development-stage databases might take weeks with live production servers – not to mention all the code that would have been built on top of the original suboptimal data structure. I’d actually been just about to write a whole lot more of that – something I’d been putting off because I knew what a mess the databases were, and was reluctant to dig myself any deeper a hole – but now I’m really looking forward to flying through what should be a much simpler, more intuitive job.

Switching to PostgreSQL

Thursday, February 7th, 2008

I have decided to move my development efforts from MySQL to PostgreSQL. Why? There’s a number of reasons, but there’s one main reason:

Schemas.

The concept of the schema is pretty unknown in the MySQL world. I admit I’d pretty much forgotten they existed, even though I’ve learnt about them in the past setting up other databases (MS SQL Server – actually a pretty good product). Anyway, in MySQL a schema is nothing but the structure of your database. In PostgreSQL, a schema is a powerful feature for creating multiple “views” into the same database, but with the ability to share between them.

Here’s an example. Say you have two applications, which you want to share a Users table but still have their own tables for “local” settings. Here are your options on MySQL:

  1. Put both applications into the same database, mixing the tables in with each other, perhaps with different prefixes for the tables, and overriding in the case of Users. Make Users a giant catch-all table with preferences for both apps, with a namespace for those fields inside the table. Pros: easy, can join into the shared table. Cons: Security is poor (I want to grant on a per-database level, not per-table), ugly as hell.
  2. Put each application inside its own database and make a third database for shared tables. Set your app to normally look inside its own database, and connect to the shared database when it needs to access the Users table. Pros: Better security compartmentalisation. Better looking, more intuitively named tables. Possibility of easier scaling since you can host the DBs on different machines. Cons: Loss of ability to join into the shared tables without nasty hacks. Constrains the kind of lookups you can do without severe performance penalties. More complex, loss of a single authoritative logfile.
  3. Like number 2 but replicating the shared tables into and out of both apps by any of a number of means. Pros: solves the problem nicely. Cons: Complex, nasty solution which seems to be asking for trouble.

For the record, I’ve tried all three. I’d settled on number 2 as the least of the three evils.

Here’s what you would do on PostgreSQL:

Create a single database with three users and three schemas. Name the users App1, App2 and Shared, and the schemas likewise, granting access to the matching users. Create the shared tables in the Shared schema, and the App1 and App2 tables in their own schemas. Note that as far as the schemas are concerned, they are in their own little world – no namespace conflicts.

Now set App1 and App2’s search paths to App1,Shared and App2,Shared respectively. There you go – as far as App1 and App2 are concerned, the shared Users table is right there – no complexity required. Set your app to use the appropriate schema and you’re done. It’s like editing your PATH in unix.
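Stripped down to the essentials, the setup is only a few statements – here’s a rough sketch, driven from Ruby for convenience, with illustrative role and schema names (the roles are assumed to exist already, and you still need the per-table grants on the shared tables on top of this):

# one-off setup sketch -- app1 / app2 / shared are illustrative names
conn = ActiveRecord::Base.connection

%w( app1 app2 shared ).each do |name|
  conn.execute("CREATE SCHEMA #{name} AUTHORIZATION #{name}")
end

# each app sees its own schema first, then falls through to shared
conn.execute("ALTER ROLE app1 SET search_path = app1, shared")
conn.execute("ALTER ROLE app2 SET search_path = app2, shared")
conn.execute("GRANT USAGE ON SCHEMA shared TO app1, app2")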

This might seem like overkill for such a small issue – but actually I’ve got a number of shared tables and more apps than that. The ability to use Schemas to solve all my problems here is a godsend, one that I wish I’d thought of earlier.

PostgreSQL has some other nice features as well, such as tablespaces, which allow easy distribution of storage onto different disks on a per-table basis: you might want to put your ultra-high-activity Users table on the fast but expensive and small SCSI disk, for example, and the much larger but lower-volume CatPictures table on a big, cheap SATA drive. There’s support for sub-second timestamps – MySQL, unbelievably, doesn’t go beyond 1-second accuracy. I’ve mentioned the much more strict SQL syntax requirements below – it’s taken me hours to clean up a lot of the junk MySQL happily allowed me to store (although I’m not going to claim it wasn’t my own fault; it was).

And the new native UUID data type makes me very happy, since I’ve come to believe that basically everything in a database of any importance should have a UUID (synchronising two databases on different continents keyed on an integer primary key = nightmare, keyed on a UUID = doable). The backup facilities are far improved too – it can easily write out full transaction logs while live, allowing full recoverability – something I’d been pretty worried about with MySQL. And its user rights system seems much more intuitive than MySQL’s.

I’d resisted PgSQL for quite some time, but one by one those reasons have disappeared. For one, it always had a reputation for being slow – now pretty thoroughly disproved. It seemed quite alien and unfamiliar, and I had trouble even getting it running the last time I tried it. Well, either I’ve become more knowledgeable or it’s easier to install, because I had no problems at all this time. And I worried that I didn’t know how to configure it properly – I discarded this reason upon realising I don’t really know jack shit about configuring MySQL properly either, and MySQL has hundreds of opaque options I know next to nothing about. In fact, I’ve had more trouble with MySQL! Even now, I can’t seem to force the MySQL on my local machine here to reliably use UTF-8 as its default character set in all situations.

I definitely won’t be deleting MySQL or anything like that. MySQL works fine for this blog and a few others. My MediaWiki is installed on it, plus a number of other apps use it. I’m of the “if it ain’t broke, don’t fix it” school when it comes to things like this so I’m just going to run them concurrently for the time being. I have nothing against MySQL, it’s served me extremely well for years – but the Schemas feature was the straw that broke the camel’s back.

I still don’t know for sure if I’ll stick with it – a horrible problem may well emerge, but one thing is for sure: I’ll keep all my data in a portable format from now on. MySQL is extremely permissive (or, dare I say, “lax”) with its enforcement of SQL syntax requirements, and 90% of the time the migration has taken has gone into ad hoc repairs to tables and data to get them to conform. Now that’s done, I’m going to keep it done, and it’ll be easy to move back to MySQL at any time should the need arise. A bit of subtle vendor lock-in by MySQL, or simply making it “easier” for developers? Well, my thoughts on violating standards to make things “easier” are pretty well known (see: any previous rant about Internet Explorer) so I’ll stick with the standards every time.

In conclusion: if you have the time, need and inclination I’d recommend giving PgSQL 8.3 a try.

5 cables at once?

Thursday, February 7th, 2008

I don’t think I’m the conspiracy type, but for five cables in the Middle East area to go offline simultaneously is pretty suspicious.

Here’s my suspicion-ometer:

1 cable offline: Bad luck, but happens all the time.
2 cables offline: Really bad luck, but it happens.
3 cables offline: Wow. Terrible luck. Almost getting suspicious!
4 cables offline: Right, now this is suspicious
5 cables offline: Unprecedented, and really suspicious.

That’s an awfully big coincidence. Especially when the Iranian Oil Bourse is supposed to be launching in 4 days’ time. If nothing else, it’s provoking a lot of interesting discussion on what could be a new front in “firm power” (well, firmware is between hardware and software, right?).

I guess we’ll know for sure when the sixth one starts experiencing “difficulties” too! Give it a couple of days, submarines aren’t fast you know.

Sky fixed

Thursday, February 7th, 2008

SYDNEY, AUSTRALIA: Wet sources report that the sky over Sydney, especially its popular rain module, which had suffered several days’ downtime due to overload in its much-loved “dump billions of litres of water on everyone with about 30 seconds notice” function, has been repaired and is back to full operation.

The sky, which did nothing but fucking rain for a week on end, finally broke down on Tuesday. After a restart a brief period of rain was observed yesterday but evidently the problem was not fixed as the rain only lasted a short while, and lacked the characteristic window-shaking cracks of thunder and cat-drowning downpour to which all Sydneysiders had become so fondly accustomed.

Sources inside Insane Weather Corporation, the company tasked with providing the majority of Sydney’s weather, report that the problem had finally been located and fixed in the early hours of Thursday morning. To compensate for the downtime, they promise a “fucking apocalypse” of bad weather for Sydney, starting about 20 minutes ago and continuing for the foreseeable future. Although their Q1 2008 goal of “washing the whole fucking city away” remains unfulfilled at this time, IWC officials expressed confidence in their infrastructure and reasserted their intention to flood the entire fucking world by March.

The Sun, which has previously ruined many enjoyable days of rain-watching with its beams of happiness-inducing light, is reportedly fully obscured, with daylight illumination dropping back down to the twilight gloom favoured by all.

More reports as they arrive.

TrueCrypt 5 for OSX

Thursday, February 7th, 2008

Hot on the heels of the OSXCrypt port, TrueCrypt finally comes out with their 5.0 release – complete with Mac GUI. After a long, long time with nothing, suddenly the mac has an embarrassment of riches when it comes to encryption.

Anyone concerned with security and privacy should never let their data leave their house unencrypted, and suddenly we have two great options native on the mac. Check out TC5’s screenshots, and download here. Note that if you do download it, you’ll need to rename the resulting file from .dmg.bz2 to just .dmg due to a misconfigured web server on their end – a common problem, unfortunately, but forgivable since this is their first mac release.

My hope now is that the OSXCrypt team don’t give up with their project – their goal of creating a free general “platform” for encryption on OSX is very interesting and I’d hate to see it cut off just like that. Furthermore, their approach (native kernel module) promises more flexibility and performance than a MacFUSE implementation like TC’s can deliver. For example, it seems that an EFI plugin – allowing full-disk encryption – would be easier with a proper kernel module.

Anyway, the last month has seen a great leap in privacy and security on the mac. Let’s hope it continues!

UPDATE: In case there’s anyone who doesn’t understand why anyone would want to maintain a plausible-deniability encryption regime for their sensitive data, just read this current Slashdot thread: U.S. Confiscating Data at the Border.

Rails: Dump and reload data, unicode safe

Wednesday, February 6th, 2008

Behold my rake tasks to dump, and then reload, the contents of your database – all in highly compatible schema.rb and YAML formats. A mere rake dump_utf will create two files in /db/dump/: first, an independent schema dump (it doesn’t touch your proper one), and second, a YAML file which is essentially a giant serialised hash of your DB. Running rake load_utf will import that schema dump and then all your data. And unlike every other script of this type I’ve seen around the net, it actually works, and is unicode safe.

Note that load_utf is extremely destructive and will write straight over your DB without asking further permission. However, if you haven’t run dump_utf it won’t find its files anyway, so not to worry.

Thanks to Tobias Luetke, whose blog post was the starting point for this script, although there’s nothing left of it but the SQL query now.

Needless to say, a great use of this tool is if you’re changing databases. Simply run dump_utf, modify database.yml to point to your new DB, then run load_utf – done.

Oh, and I wouldn’t run it if your DB is too big, since it stores everything in memory. I may change that. And it doesn’t handle multiple databases either; I want to change that too…

require 'Ya2YAML'
 
task :dump_utf => :environment do
  sql  = "SELECT * FROM %s"
  skip_tables = ["schema_info"]
  dir = RAILS_ROOT + '/db/dump'
  FileUtils.mkdir_p(dir)
  FileUtils.chdir(dir)
 
  ActiveRecord::Base.establish_connection
 
  puts "Dumping Schema..."
 
  File.open("structure.rb", "w+") do |file|
    ActiveRecord::SchemaDumper.dump(ActiveRecord::Base.connection, file)
  end
 
  giant_hash = {} # we're gonna put EVERYTHING in here!
 
  (ActiveRecord::Base.connection.tables - skip_tables).each do |table_name|
    giant_hash[table_name] = ActiveRecord::Base.connection.select_all(sql % table_name) 
    puts "Reading #{table_name}..."
  end
  puts "Writing file..."
  File.open("backup.yml", 'w+') do |file|
    file.write giant_hash.ya2yaml
 end
 puts "Finished!"  
end
 
task :load_utf => :environment do
  dir = RAILS_ROOT + '/db/dump/'
  FileUtils.chdir(dir)
 
  puts "loading schema..."
 
  file = "structure.rb"
  load(file)
  puts "done! now loading data ..."
 
  content_file = YAML.load_file(dir + "backup.yml")
 
  content_file.keys.each do |table_name|
    print "loading #{table_name}"
    content_file[table_name].each do |record|
      columns = record.keys.join(",")
      values  = record.values.collect { |value| ActiveRecord::Base.connection.quote(value) }.join(",")
      ActiveRecord::Base.connection.execute("INSERT INTO #{table_name} (#{columns}) VALUES (#{values})", 'Insert Record')
      print "."
    end
    puts
  end
  puts "Finished!"  
end

Reserved words in PostgreSQL

Wednesday, February 6th, 2008

Trying out PostgreSQL? You might hit some trouble importing your MySQL datasets. MySQL is far more lenient about reserved words; you might find you’ve inadvertently named your columns in a way that’ll make PgSQL scream in pain.

Here’s the hard-to-find list – some obvious (SELECT, WHERE), some not (DESC – this one got me, ORDER – same, CURRENT_USER, etc.):

CREATE CURRENT_DATE CURRENT_ROLE CURRENT_TIME CURRENT_TIMESTAMP
CURRENT_TRANSFORM_GROUP_FOR_TYPE CURRENT_USER DATE DEFAULT
DEFERRABLE DESC DISTINCT DO ELSE END EXCEPT FALSE FOR FOREIGN
FROM GRANT GROUP HAVING IN INITIALLY INTERSECT INTO IS ISNULL
JOIN LEADING LEFT LIKE LIMIT LOCALTIME LOCALTIMESTAMP NEW NOT
NOTNULL NULL OFF OFFSET OLD ON ONLY OR ORDER OUTER OVERLAPS
PLACING PRIMARY REFERENCES RETURNING RIGHT SELECT
SESSION_USER SIMILAR SOME SYMMETRIC TABLE THEN TO TRAILING
TRUE UNION UNIQUE USER USING VERBOSE WHEN WHERE WITH
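If one of these has already snuck into a column name (desc was my sin), the least painful fix is usually a rename before you migrate – a quick sketch, with made-up table and column names:

class RenameReservedColumns < ActiveRecord::Migration
  def self.up
    rename_column :products, :desc,  :description   # DESC is reserved
    rename_column :products, :order, :position      # so is ORDER
  end

  def self.down
    rename_column :products, :description, :desc
    rename_column :products, :position,    :order
  end
end

The alternative is double-quoting the offending identifiers in every bit of SQL PostgreSQL ever sees, which gets old fast.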

Ruby web server in 7 lines

Monday, February 4th, 2008

OK OK it’s not exactly full featured. It listens on port 8080 and spits out exactly one line of invalid HTML. But it’s still pretty cool IMO : )

require 'socket'
server = TCPServer.new("127.0.0.1", 8080)
while socket = server.accept
  socket.puts "The current time is: " + Time.now.to_s
  socket.close
end

Add HTML to taste. Suggestions on how to make it shorter welcome!

UPDATE: Lol, I made it 6 lines.

Force mongrel to delete stale pid files upon launch

Monday, February 4th, 2008

Mongrel, currently the mainstream Rails server, has a nasty habit of not deleting its PID files when it crashes and burns. This then occasionally stops automatic restarts because mongrel sees its old PID files, assumes it’s running (without a check) and refuses to start.

All that needs to be done is a startup check that the process mentioned in the PID file is actually running. This has been implemented in a patch here which I’ve integrated and successfully tested. If you’d like to use the file I generated, I’ve attached it below. Simply replace /path/to/gems/mongrel-1.1.3/bin/mongrel_rails with the attached file and run as normal. You’ll want to make a backup copy first, of course!
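The idea behind the check is dead simple – conceptually it’s just this (a sketch of the logic, not the patch itself; pid_file stands in for whatever you pass to -P):

# if a PID file exists, see whether the process it names is actually alive
if File.exist?(pid_file)
  old_pid = File.read(pid_file).to_i
  begin
    Process.kill(0, old_pid)   # signal 0: existence check only, nothing is sent
    raise "Mongrel already running with PID #{old_pid}, refusing to start"
  rescue Errno::ESRCH
    # no such process -- the file is stale, so clear it and carry on
    puts "** PID file #{pid_file} is stale, deleting it"
    File.delete(pid_file)
  end
end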

Also on the “things to do while hacking mongrel” list: change the fucking version constant so the version echoed in your console log is actually up to date. The file is lib/mongrel/const.rb, line 68:

MONGREL_VERSION="1.1.3".freeze

which the maintainers obviously missed.

Mongrel is an aptly named bit of software, poorly maintained and with many long-standing patches which should have been included long ago. This particular patch fixes an annoying bug which shouldn’t exist any more – and harks back to March 2007, 10 months ago. That’s a pretty unforgivably long time not to include something as basic as this. And as far as I can tell the mongrel_rails program hasn’t changed significantly since last time – there’s no excuse for this, and other, patches to have not made it in by now. This is the third time I’ve manually patched it like this, and was only reminded to do so in the latest version (1.1.3) when I noticed – surprise! – it was again refusing to start with a stale PID.

I’d be more worried if there weren’t several promising alternative servers on the rise. Hopefully the Rails community’s dalliance with the very doglike mongrel won’t last too much longer…

mongrel_rails – full file

mongrel_stale_pid_file.patch – same thing in patch form. Run

$ cd /path/to/gems/mongrel-1.1.3/
$ patch -p0 < /path/to/mongrel_stale_pid_file.patch

Both from http://textsnippets.com/posts/show/931.

Testing (on MacOSX):

$ cd ~/Desktop
$ /usr/bin/ruby /usr/bin/mongrel_rails start -d -e development -p 3000 -a 127.0.0.1 -P /Users/sho/Desktop/mongrel.pid -c /rails/myapp
# mongrel starts ...
$ cp mongrel.pid another.pid
$ /usr/bin/ruby /usr/bin/mongrel_rails stop -P /Users/sho/Desktop/mongrel.pid
Sending TERM to Mongrel at PID 1605...Done.
$ cp another.pid mongrel.pid
$ /usr/bin/ruby /usr/bin/mongrel_rails start -d -e development -p 3000 -a 127.0.0.1 -P /Users/sho/Desktop/mongrel.pid -c /rails/myapp
** !!! PID file /Users/sho/Desktop/mongrel.pid exists, but is stale, and will be deleted so that this mongrel can run.
# mongrel starts as hoped .. success!

Server Upgrades

Monday, February 4th, 2008

As all server admins know, any upgrade that ends with a successful boot is a good upgrade. Well, here we are again, after a RAM upgrade and reboot which exposed a couple of failing-to-start services – failures which made me glad I’d stayed up til 4am to oversee the upgrade (albeit from the other side of the world) and make sure everything was OK. I’ll also be reviewing my configuration to ensure it doesn’t happen again.

Anyway, we’re now running with a full 4GB of RAM – the limit for this box. The expansion is in preparation for a number of web sites hosted on this box which will go live shortly, all in Rails, which is a ginormous RAM hog. I’ll now be able to safely boost the number of Rails instances I can simultaneously serve without running out of memory – something I’ve done before in testing, with horrendous performance problems as a result.

The upgrade will also enable continued experimentation with virtualisation on this RHEL4 box. I’d been playing around with running RHEL5 on top of it, but 2GB of memory wasn’t a large enough play area – the extra headroom will let me trial a number of things I’ve had in mind.

Anyway, big sighs of relief that the HD didn’t fail (which I basically expect whenever a server which has run continuously for a year is suddenly powered off) and I won’t have to restore from backup, which would have been a very stressful 24 hours.

Onwards and upwards, then!