Archive for the ‘rails’ Category

Porn porn porn

Saturday, January 10th, 2009

Apparently this blog is now blocked on a common content filter for its “porn”:

according to the barracuda web filter here at work, i am now denied access to your website because it falls into the category “porn”.

Well, shit. I hate to disappoint. All those surfers coming here expecting porn .. and I don’t deliver. Maybe I should start!

girls in forests

prescription: assisted hydrotherapy

sailor venus in a bikini

miu nakamura

UUIDs in Rails redux

Tuesday, April 15th, 2008

I have covered forcing ActiveRecord to respect UUID data types in Migrations before. That helps us create our database – now what about in use? We need to create the UUIDs and store them in the database.

These examples all rely on the uuidtools gem, so install that if you haven’t already (and require it somewhere in environment.rb).

1. Setting a UUID using ActiveRecord callbacks

If you don’t need the UUID in the object upon creation but only want to ensure it’s there upon save, do this. Suggestion initially from this page, changes are mine.

We will use the before_create callback to ask AR to add a UUID of our choosing before the record is saved.

Add this to your lib directory:

# lib/uuid_helper.rb
require 'uuidtools'
 
module UUIDHelper
  def before_create
    self.id = UUID.random_create.to_s
  end
end

And now include this in your models:

class Airframe < ActiveRecord::Base
  include UUIDHelper
 
  #my stuff
 
end
>> Airframe.new
=> #<Airframe id: nil, maker_id: nil>
>> Airframe.create!
=> #<Airframe id: "1a82a408-32e6-480e-941d-073a7e793299", maker_id: nil>
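One caveat: defining before_create directly like this will clobber any before_create the model itself declares. A variant using a named callback should avoid that – a sketch, untested, along these lines:

# lib/uuid_helper.rb - named-callback variant
require 'uuidtools'

module UUIDHelper
  def self.included(base)
    base.before_create :assign_uuid
  end

  def assign_uuid
    self.id = UUID.random_create.to_s
  end
end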

2. Initialising a model with a UUID

If you want the UUID in the model before save, i.e. upon initialisation, we have to get a little more fancy:

# lib/uuid_init.rb
require 'uuidtools'
 
module UUIDInit
  def initialize(attrs = {}, &block)
    super
    self['id'] = UUID.random_create.to_s
  end
end

Now include this in your models:

class Flightpath  < ActiveRecord::Base
 
  include UUIDInit
 
  # my stuff
 
end
>> Flightpath.new
=> #<Flightpath created_at: nil, id: "5e5bcd63-070d-4252-8556-2876ddd83b54">

Be aware that it will conflict with any other initialisation you do in there, so you might want to simply copy in the whole method if you need other fields upon initialisation:

class User < ActiveRecord::Base
 
  def initialize(attrs = {}, &block)
    super
    self['balance'] = 0.0
    self['id'] = UUID.random_create.to_s
  end
 
end
>> User.new
=> #<User balance: 0.0, id: "...">
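If overriding initialize makes you nervous, ActiveRecord's after_initialize hook is another option – a sketch, with the caveat that Rails calls it on find as well as new, so guard against overwriting values loaded from the database:

class User < ActiveRecord::Base
  def after_initialize
    self['balance'] ||= 0.0
    self['id'] ||= UUID.random_create.to_s
  end
end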

3. Sessions

All this is very well for your own models, but what about Rails’ inbuilt sessions? By default, they want an autoincrementing integer primary key.

The good news is it’s easy to override. Your migration should look like this:

create_table "sessions", :id => false, :force => true do |t|
  t.string   "session_id"
  t.text     "data"
  t.datetime "updated_at"
  t.datetime "created_at"
end

Now add this to your environment.rb file:

# config/environment.rb
CGI::Session::ActiveRecordStore::Session.primary_key = 'session_id'

And this to your Application Controller:

# app/controllers/application.rb
class ApplicationController < ActionController::Base

  before_filter :config_session # at the top, if possible

  def config_session
    session.model.id = session.session_id
  end

end

And voila, your session store is using the session_id as its primary key. I don’t see any point in using a UUID for your sessions’ PK, but if you want to you’ll find an example override class in:

actionpack/lib/action_controller/session/active_record_store.rb.

Remember to drop any preexisting sessions table in your database, or it will likely complain of null ids when you switch to session_id as your primary key.
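Since every lookup now goes through session_id, it's presumably worth indexing it too – something like this in the same migration (my assumption, not from the Rails docs):

add_index "sessions", "session_id", :unique => true
add_index "sessions", "updated_at" # handy when sweeping stale sessions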

DataMapper – one ORM to rule them all?

Tuesday, April 8th, 2008

I’ve just watched the DataMapper presentation from Mountain West 2008, and it’s very interesting. I’ve been thinking we need a high-level super-abstraction library for Ruby for some time, and DataMapper (DM) looks like it might grow into fitting that bill.

What’s wrong with what we’re using now? Let me count the ways – or rather, let me count how many different sources to and from which I am reading and writing data in a Rails app of mine today:

  1. An RDBMS (Postgres), using ActiveRecord
  2. A number of resources, using ActiveResource
  3. A document-based database, using a custom i/o library
  4. YAML config files, using a constant read from a file on server load
  5. Cookies, via ActionController
  6. (not really a full point) data structure information in Model files

That’s too many. And I see no sign of this trend slowing down – why, I just found out about a new project today, yet another library for talking to document-based databases. We have at least three gems to read and write to CouchDB. Thingfish uses Sequel. If you use Amazon S3 that’s yet another. Enough!

It’s more and more obvious that these new developments are being written at the wrong level of abstraction. I don’t know what we can do about Cookies but all the others – RDBMS, RESTful resources, YAML files, etc – are the same types of data and should be accessible by a common API within Rails or Merb or plain Ruby or anywhere. So why can’t we access them all via a common method?

The correct way to do this is to have a single ORM at the highest possible level of abstraction. Storage adapters can be added to it as drivers. Then, if you need to split storage types for whatever reason, you can configure that on a case by case basis.

DataMapper enables this, and provides a plug-in system – if you can implement 10 or so basic actions in your storage system, and support 5 or so basic data types, you should be able to use it transparently. To me, this is a very appealing step forward. There are many types of data storage, all with their strengths and weaknesses. If we can flexibly and transparently include pretty much any useful type of storage into our programs using a single consistent API and with just a little config, once, in one place, that’s a huge win.
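To make that concrete, here’s roughly what a model looks like in the 0.9-era API (a sketch from the presentation and docs – the property list and connection URL are mine, and the details may well change):

require 'rubygems'
require 'dm-core'

# one line of config; swap the adapter URL and nothing else changes
DataMapper.setup(:default, 'postgres://localhost/myapp')

class Airframe
  include DataMapper::Resource

  property :id,   String, :key => true
  property :name, String
end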

Why not just improve ActiveRecord? I think it’s too late for that. AR is a huge, tangled mess which many believe should just be scrapped and re-written from scratch, me included. Well, the good news is that DM has basically done that, and it’s smaller, faster, more modular, cleaner and – best of all – mostly compatible.

UPDATE: Whilst looking at the weblog of Rails Core member Rick Olson, aka Technoweenie, for info on ActiveDocument I came across this wonderfully candid comment on AR associations:

I can point to a few other more complex areas of the associations code (the crazy eager joins code is why I personally haven’t used :include in any of my apps in over a year).

Straight from the horse’s mouth! Couldn’t agree more.

Reducing LOC with ActiveRecord

Sunday, March 9th, 2008

Heh, as you can probably tell I’m going through old code with a chainsaw right now.

If there’s one thing I’ve learnt about maintainability, it’s that lines of code are the enemy. The longer a function, the less you can see on screen at once, the harder it is to understand. So with that in mind, let’s go over a few tricks you can use to shorten your code – this time in Controllers.

Have you ever done anything like this?

def create_message
  message = Message.new(params[:message])
  message.sender_id = current_user.id # we can't trust form input
  message.language_id = current_user.language_id
  message.something_else = Something.else
  if message.save
    redirect_to some_url
  else
    error!
  end
end

OK, this is a pretty bad example from a design perspective. Since the current user is presumably a model, whose details are thus available to the Message model, it would probably be a better idea to set things like that inside the model. But let’s say we have to do it from the controller for some reason.

One limiting property of ActiveRecord is that you can’t pass more than one attributes hash when you create a new object, so newbies (like me, when I wrote code like the above) would create the new object and then manually append endless other properties to it before committing. And we want to keep the params hash for convenience – that’s why we’re using Rails, after all!

But that params hash is just a hash, and that means we can merge with other hashes, and since we’ve got it all in one place let’s drop the .new and go straight for .create:

def create_message
  props = {:sender_id => current_user.id, :language_id => current_user.language_id, :something_else => Something.else}
  params[:message].merge!(props)
  if Message.create(params[:message])
    redirect_to some_url
  else
    error!
  end
end

Yeah, this is pretty basic stuff, not the 1337 tip you might have been expecting. But it took me a long time to get over my fear of messing with the black box of the params hash, and looking back at my old code made me remember. Anyway we dropped the LOC to get this record into the database from 5 to 3 – and it would stay at 3 no matter how much crap you added to the setup hash.

One tip won’t save you that much space, but every bit helps!

UPDATE: I had it back to front. Forgot hash merge order. >_<
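For the record, the hash you pass as the argument to merge wins on any duplicate keys – which is exactly why the trusted props must be the argument, not the receiver:

>> {:sender_id => "from the form"}.merge(:sender_id => "from the server")
=> {:sender_id=>"from the server"}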

Reducing code duplication with Rails ActionMailer

Sunday, March 9th, 2008

ActionMailer is a weird, fussy black box whose intricacies produce a lot of code duplication, especially for dealing with multi-language email, where you need to use a different .erb file for each language. Here’s an example of how you can reduce some of that duplication.

Code speaks louder than words so I’ll just cut and paste – you can see pretty easily what I’ve done. Works, although it’s completely undocumented and is probably not very good practise.

Previous, highly redundant code. I’ve snipped this a lot – you don’t have to use much imagination to see why I don’t like this:

def tokenmail_eng(user, subject)
  subject       subject
  body          :token => user.password_hash,
                :password => user.password,
                :user_id => user.id,
                :name => user.nickname
  recipients    user.email_address_with_name
  from          ''
  sent_on       Time.now
  headers       "Reply-to" => ""
end
 
def tokenmail_jpn(user, subject)
  subject       subject
  body          :token => user.password_hash,
                :password => user.password,
                :user_id => user.id,
                :name => user.nickname
  recipients    user.email_address_with_name
  from          ''
  sent_on       Time.now
  headers       "Reply-to" => ""
end
 
def tokenmail_zhs(user, subject)
  subject       subject
  body          :token => user.password_hash,
                :password => user.password,
                :user_id => user.id,
                :name => user.nickname
  recipients    user.email_address_with_name
  from          ''
  sent_on       Time.now
  headers       "Reply-to" => ""
end

Woah. That is *awful*. When I finally decided to clean up my mail code I found several problems even in my cut and pasted code – when you do that a few times, and then go in to make a change, you are almost guaranteed to leave something out. That is precisely why cut and pasting code is something you should never, ever do.

So how do we clean this up and merge into a “generic” mail, while fooling ActionMailer into using the correct file?

Turns out we can do this:

def tokenmail_gen(user, subject)
  subject       subject
  body          :token => user.password_hash,
                :password => user.password,
                :user_id => user.id,
                :name => user.nickname
  recipients    user.email_address_with_name
  from          ''
  sent_on       Time.now
  headers       "Reply-to" => ""
end
 
def tokenmail_eng(user, subject)
 tokenmail_gen(user, subject)
end
 
def tokenmail_jpn(user, subject)
 tokenmail_gen(user, subject)
end
 
def tokenmail_zhs(user, subject)
 tokenmail_gen(user, subject)
end
 
def tokenmail_zht(user, subject)
 tokenmail_gen(user, subject)
end

That’s so much better it brings a tear to my eye. Needless to say you will need to handle translation of the subject line in your Controller.
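You could arguably go a step further and generate the wrappers with define_method, so adding a language is a one-word change – a sketch (my extrapolation, not something I’ve battle-tested):

# inside the mailer class body
%w(eng jpn zhs zht).each do |lang|
  define_method("tokenmail_#{lang}") do |user, subject|
    tokenmail_gen(user, subject)
  end
end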

SERIOUS NOTE: This is undocumented and while it seems to work, I haven’t heard of anyone else doing it and it might not work in future. Use at your own risk and make sure your tests cover it!

Eval, friend and enemy

Sunday, March 9th, 2008

The prevailing wisdom in the Ruby world is that using eval is bad. It’s brittle, hackish and inflexible – if you find yourself using it, it’s usually a sign you’re doing something wrong.

Well, I agree, but the keyword is “usually”. Sometimes using eval is a lot better than any alternatives, and it can help work around problems in, say, other people’s code, where the design decisions they made stop you doing what you want. I avoided using it for a long time, maybe based on my acceptance of the consensus opinion – but recently I’ve had cases where it’s the “lesser of two evils”, and I wanted to share one with you.

The example is Rails’ horrible mailing system. It’s one of the worst parts of Rails – but let me temper that by saying it’s a hard problem and difficult to see how else they could have done it. A Mailer in Rails is kind of a pseudo-model black box that waits to be sent a list of parameters, then inserts some of them into a specific text file, then sends that text using the remainder of the parameters.

All very well and good so far. But the problem with this is when you want to have mail in multiple languages. So you don’t just have

Notifier.deliver_welcome_mail(params)

you have

Notifier.deliver_welcome_mail_eng(params)
Notifier.deliver_welcome_mail_jpn(params)
Notifier.deliver_welcome_mail_spa(params)

etc. All with corresponding entries in the model, and matching text files.

Now, I don’t know any way to get around the need for those text files and entries in the model that doesn’t involve writing some huge text generator (bad) or hacking Rails itself (even worse). But we can at least improve on the statements used to call these deliveries.

Here’s an example of what I used to have:

if @user.language_iso == 'zht'
  Notifier.deliver_invite_zht(@user, @invite, subject)
elsif @user.language_iso == 'zhs'
  Notifier.deliver_invite_zhs(@user, @invite, subject)
elsif @user.language_iso == 'jpn'
  Notifier.deliver_invite_jpn(@user, @invite, subject)
else
  Notifier.deliver_invite_eng(@user, @invite, subject)
end

That’s a shortened version – these if or case statements can get much longer (this is pretty old code..). But you can see this gets very nasty, very quickly.

Using eval we can change that to:

eval("Notifier.deliver_invite_#{.language_iso}(, , subject)")

Yes, it’s still nasty. But it’s one line of nastiness as opposed to 10 (or many more). I think this is one example of a case where using eval is definitely better than the alternative.
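One refinement worth considering (my addition – @user and @invite stand in for whatever you’re actually passing): never interpolate anything that could contain user input into an eval string, so whitelist the ISO code first:

lang = %w(eng jpn spa zhs zht).include?(@user.language_iso) ? @user.language_iso : 'eng'
eval("Notifier.deliver_invite_#{lang}(@user, @invite, subject)")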

Next, I’m experimenting with trying to reduce the horrendous duplication in the Mailer pseudomodel itself.

mongrel vs. thin

Thursday, February 28th, 2008

So, we’ve got a promising new web server in the ruby world – thin. Fantastic news, and it’s using some excellent libraries – the brilliant eventmachine, and the ragel HTTP parser from mongrel (ie, the only good thing about mongrel) – both of which I am using in other projects. Very promising, looks well designed, maintainable and clean code. Unlike mongrel.

So, there’s only one thing we care about with web servers. What’s the performance?

Here are some stats on an actual serious Rails app doing actual work with an actual database and everything. Production mode, and I gave each server a couple of “warm-up” runs before the runs copied below. I’m just going to include the whole output.

$ ab -n 1000 -c 50 http://0.0.0.0:3000/
 
Server Software:        Mongrel
Server Hostname:        0.0.0.0
Server Port:            3000
 
Document Path:          /
Document Length:        1418 bytes
 
Concurrency Level:      50
Time taken for tests:   18.633091 seconds
Complete requests:      1000
Failed requests:        0
Write errors:           0
Total transferred:      1779000 bytes
HTML transferred:       1418000 bytes
Requests per second:    53.67 [#/sec] (mean)
Time per request:       931.655 [ms] (mean)
Time per request:       18.633 [ms] (mean, across all concurrent requests)
Transfer rate:          93.22 [Kbytes/sec] received
 
Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.9      0      10
Processing:    23  909 130.8    927    1763
Waiting:       19  908 131.1    927    1763
Total:         23  909 130.1    927    1763
 
Percentage of the requests served within a certain time (ms)
  50%    927
  66%    936
  75%    944
  80%    949
  90%    973
  95%   1061
  98%   1067
  99%   1069
 100%   1763 (longest request)
 
Server Software:        thin
Server Hostname:        0.0.0.0
Server Port:            3000
 
Document Path:          /
Document Length:        1418 bytes
 
Concurrency Level:      50
Time taken for tests:   18.120868 seconds
Complete requests:      1000
Failed requests:        0
Write errors:           0
Total transferred:      1746000 bytes
HTML transferred:       1418000 bytes
Requests per second:    55.18 [#/sec] (mean)
Time per request:       906.043 [ms] (mean)
Time per request:       18.121 [ms] (mean, across all concurrent requests)
Transfer rate:          94.09 [Kbytes/sec] received
 
Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    1   2.7      1      31
Processing:   683  893 123.6    881    1710
Waiting:      605  721 120.3    718    1473
Total:        683  894 123.7    882    1711
 
Percentage of the requests served within a certain time (ms)
  50%    882
  66%    888
  75%    902
  80%    908
  90%    961
  95%   1002
  98%   1474
  99%   1710
 100%   1711 (longest request)

Hm, nothing much in that – an insignificant improvement. Basically, that’s Rails slowness we’re measuring here. Let’s cut most of that out of the picture and go straight for a “pulse” controller – it returns absolutely nothing but the two-byte string “OK”, meaning the app is actually up. I implemented that so monit could monitor the app without hitting the front page every few seconds. Let’s take a look:

Server Software:        Mongrel
Server Hostname:        0.0.0.0
Server Port:            3000
 
Document Path:          /api/pulse
Document Length:        2 bytes
 
Concurrency Level:      50
Time taken for tests:   8.405170 seconds
Complete requests:      1000
Failed requests:        0
Write errors:           0
Total transferred:      269000 bytes
HTML transferred:       2000 bytes
Requests per second:    118.97 [#/sec] (mean)
Time per request:       420.259 [ms] (mean)
Time per request:       8.405 [ms] (mean, across all concurrent requests)
Transfer rate:          31.17 [Kbytes/sec] received
 
Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.6      0       4
Processing:    14  412  72.4    399     810
Waiting:       13  411  72.3    398     808
Total:         14  412  72.0    399     810
 
Percentage of the requests served within a certain time (ms)
  50%    399
  66%    470
  75%    477
  80%    480
  90%    486
  95%    489
  98%    491
  99%    493
 100%    810 (longest request)
 
 
Server Software:        thin
Server Hostname:        0.0.0.0
Server Port:            3000
 
Document Path:          /api/pulse
Document Length:        2 bytes
 
Concurrency Level:      50
Time taken for tests:   6.65994 seconds
Complete requests:      1000
Failed requests:        0
Write errors:           0
Total transferred:      236000 bytes
HTML transferred:       2000 bytes
Requests per second:    164.85 [#/sec] (mean)
Time per request:       303.300 [ms] (mean)
Time per request:       6.066 [ms] (mean, across all concurrent requests)
Transfer rate:          37.92 [Kbytes/sec] received
 
Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   1.2      1      10
Processing:   200  298  49.2    278     460
Waiting:       54  239  47.5    224     404
Total:        202  299  49.0    278     460
 
Percentage of the requests served within a certain time (ms)
  50%    278
  66%    294
  75%    354
  80%    356
  90%    362
  95%    372
  98%    457
  99%    460
 100%    460 (longest request)

That’s much more of an improvement.
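For reference, the whole pulse action is just something like this (a sketch – the controller and route names are inferred from the benchmark URL above, and the session :off line is my assumption about how you’d want it configured):

# app/controllers/api_controller.rb
class ApiController < ApplicationController
  session :off, :only => :pulse # no session setup; keep this as cheap as possible

  def pulse
    render :text => "OK"
  end
end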

I’ve tested it at some length now and have encountered no stability problems – not that it takes much to beat The CrashMaster™ mongrel. In fact, I have no love for mongrel at all (please see my recent posts on hacking it so it doesn’t refuse to launch upon encountering its own PID files from previous crashes), so I’m switching to thin, effective now. I’ll let you know how it goes!

UPDATE: Yet another potential competitor has emerged – ebb, which appears to be even faster than Thin in (artificial) benchmarks. However, it’s definitely not ready for prime time – it ran an order of magnitude slower than Thin in my initial testing, and I eventually force quit it after it became unresponsive. Still, those benchmarks paint a pretty promising picture of what a C implementation can do. Here are my bad results anyway.

$ ebb_rails start -e production
$ ab -n 100 -c 5 http://0.0.0.0:3000/api/pulse # i reduced the numbers
 
Server Software:        
Server Hostname:        0.0.0.0
Server Port:            3000
 
Document Path:          /api/pulse
Document Length:        2 bytes
 
Concurrency Level:      5
Time taken for tests:   49.116267 seconds
Complete requests:      100
Failed requests:        0
Write errors:           0
Total transferred:      18800 bytes
HTML transferred:       200 bytes
Requests per second:    2.04 [#/sec] (mean)
Time per request:       2455.813 [ms] (mean)
Time per request:       491.163 [ms] (mean, across all concurrent requests)
Transfer rate:          0.37 [Kbytes/sec] received
 
Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.8      0       8
Processing:   442 2394 552.0   2379    3699
Waiting:      442 2394 552.1   2378    3699
Total:        442 2394 552.1   2379    3699
 
Percentage of the requests served within a certain time (ms)
  50%   2379
  66%   2661
  75%   2795
  80%   2835
  90%   2997
  95%   3481
  98%   3665
  99%   3699
 100%   3699 (longest request)

Obviously something is horribly wrong with that result.

hacking native UUID support into Schema:Dump

Monday, February 11th, 2008

Want to use PostgreSQL’s native UUID datatype but AR won’t let you use it with migrations?

/Library/Ruby/Gems/1.8/gems/activerecord-2.0.2/lib/active_record/connection_adapters/postgresql_adapter.rb:

# insert into the case statement in simplified_type:
      # UUID type
      when /^uuid$/
        :uuid

# insert into the hash returned by native_database_types:
      :uuid      => { :name => "uuid" },

Well, that’ll get your data OUT of the database, but AR will throw a fit when you try to load it back in unless you also add uuid into the range of column types TableDefinition will accept:

in /Library/Ruby/Gems/1.8/gems/activerecord-2.0.2
/lib/active_record/connection_adapters/abstract/schema_definition.rb:

# in the same file as def column(name, type, options = {}), add uuid to the
# list of generated column-type helpers (this is what gives you t.uuid):
%w( string text integer float decimal datetime timestamp time date binary boolean uuid ).each do |column_type|

Now you can do this:

    t.uuid     "uuid",   :null => false

It’s about the nastiest possible hack you can do, but it works both in and out. Here’s a patch if you don’t want to do it yourself, but no guarantees.

UPDATE:

And don’t forget to write your migrations like this to stop AR from inserting its “helpful” id columns with autoincrementing serials which your DB doesn’t need and can’t use:

  def self.up
    create_table :transactions, :id => false do |t|
      t.uuid     "id",  :null => false
      t.timestamps
    end
  end

UPDATE 2:

I now do not recommend doing this. It’s more trouble than it’s worth. There is very little you gain in forcing native UUID type in Postgres, and the complexity, hacks, loss of cross-platform compatibility and general annoyance you face are just not worth it.

Just use a string column for any UUIDs. Of course, the final hint on this page – the no-id switch for migrations – is still useful and you should use that.
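In other words, something like this (the :limit is my habit, not a requirement – 36 characters fits the standard hex-with-dashes form):

  def self.up
    create_table :transactions, :id => false do |t|
      t.string   "id", :limit => 36, :null => false
      t.timestamps
    end
  end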

Switching to PostgreSQL

Thursday, February 7th, 2008

I have decided to move my development efforts from MySQL to PostgreSQL. Why? There’s a number of reasons, but there’s one main reason:

Schemas.

The concept of the schema is pretty unknown in the MySQL world. I admit I’d pretty much forgotten they existed, even though I’ve learnt about them in the past setting up other databases (MS SQL Server – actually a pretty good product). Anyway, in MySQL a schema is nothing but the structure of your database. In PostgreSQL, a schema is a powerful feature for creating multiple “views” into the same database, but with the ability to share between them.

Here’s an example. Say you have two applications, which you want to share a Users table but still have their own tables for “local” settings. Here are your options on MySQL:

  1. Put both applications into the same database, mixing the tables in with each other, perhaps with different prefixes for the tables, and overriding in the case of Users. Make Users a giant catch-all table with preferences for both apps, with a namespace for those fields inside the table. Pros: easy, can join into the shared table. Cons: Security is poor (I want to grant on a per-database level, not per-table), ugly as hell.
  2. Put each application inside its own database and make a third database for shared tables. Set your app to normally look inside its own database, and connect to the shared database when it needs to access the Users table. Pros: Better security compartmentalisation. Better looking, more intuitively named tables. Possibility of easier scaling since you can host the DBs on different machines. Cons: Loss of ability to join into the shared tables without nasty hacks. Constrains the kind of lookups you can do without severe performance penalties. More complex, loss of a single authoritative logfile.
  3. Like number 2 but replicating the shared tables into and out of both apps by any of a number of means. Pros: solves the problem nicely. Cons: Complex, nasty solution which seems to be asking for trouble.

For the record, I’ve tried all three. I’d settled on number 2 as the least of the three evils.

Here’s what you would do on PostgreSQL:

Create a single database with three users and three schemas. Name the users App1, App2 and Shared, and the schemas likewise, granting access to the matching users. Create the shared tables in the Shared schema, and the App1 and App2 tables in their own schemas. Note that as far as the schemas are concerned, they are in their own little world – no namespace conflicts.

Now set App1 and App2’s search paths to read App1,Shared and App2,Shared respectively. There you go – as far as App1 and App2 are concerned, the table is right there – no complexity required. Set your app to use the appropriate schema and you’re done. It’s like editing your path in unix.
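In psql it comes down to something like this (names are from the example above, obviously – adjust to taste):

$ psql -U postgres mydb
mydb=# CREATE ROLE app1 LOGIN; CREATE ROLE app2 LOGIN; CREATE ROLE shared LOGIN;
mydb=# CREATE SCHEMA app1 AUTHORIZATION app1;
mydb=# CREATE SCHEMA app2 AUTHORIZATION app2;
mydb=# CREATE SCHEMA shared AUTHORIZATION shared;
mydb=# ALTER ROLE app1 SET search_path TO app1, shared;
mydb=# ALTER ROLE app2 SET search_path TO app2, shared;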

This might seem like overkill for such a small issue – but actually I’ve got a number of shared tables and more apps than that. The ability to use Schemas to solve all my problems here is a godsend, one that I wish I’d thought of earlier.

PostgreSQL has some other nice features as well, such as TableSpaces, which allow easy distribution of its storage by table onto different disks: you might want to put your ultra-high-activity Users table on the fast but expensive and small SCSI disk, for example, and the much larger but lower-volume CatPictures table on a big, cheap SATA drive. There’s support for sub-second timestamps – MySQL, unbelievably, doesn’t go beyond 1-second accuracy. I’ve mentioned the much more strict SQL syntax requirements below – it’s taken me hours to clean up a lot of the junk MySQL happily allowed me to store (although I’m not going to claim it wasn’t my own fault; it was). And the new native data type of UUID makes me very happy, since I’ve come to believe that basically everything in a database of any importance should have a UUID (synchronising two databases on different continents keyed on an integer = nightmare, keyed on a UUID = doable). And the backup facilities are far improved – it can easily write out full transaction logs while live, allowing full recoverability – something I’d been pretty worried about with MySQL. And its user rights system seems much more intuitive than MySQL’s.

I’d resisted PgSQL for quite some time, but one by one those reasons have disappeared. For one, it always had a reputation for being slow – now pretty thoroughly disproved. It seemed quite alien and unfamiliar, and I had trouble even getting it running the last time I tried it. Well, either I’ve become more knowledgeable or it’s easier to install, because I had no problems at all this time. And I worried that I didn’t know how to configure it properly – I discarded this reason upon realising I don’t really know jack shit about configuring MySQL properly either, and MySQL has hundreds of opaque options I know next to nothing about. In fact, I’ve had more trouble with MySQL! Even now, I can’t seem to force the MySQL on my local machine here to reliably use UTF8 as its default language in all situations.

I definitely won’t be deleting MySQL or anything like that. MySQL works fine for this blog and a few others. My MediaWiki is installed on it, plus a number of other apps use it. I’m of the “if it ain’t broke, don’t fix it” school when it comes to things like this so I’m just going to run them concurrently for the time being. I have nothing against MySQL, it’s served me extremely well for years – but the Schemas feature was the straw that broke the camel’s back.

I still don’t know for sure if I’ll stick with it – a horrible problem may well emerge, but one thing is for sure: I’ll keep all my data in a portable format from now on. MySQL is extremely permissive (or, dare I say, “lax”) with its enforcement of SQL syntax requirements and 90% of the time it’s taken to migrate has been in ad hoc repairs to tables and data to get them to conform. Now that’s done, I’m going to keep it done, and it’s easy to move back to MySQL at any time should the need arise. A bit of subtle vendor lock-in by MySQL, or simply making it “easier” for developers? Well, my thoughts on violating the standards to make it “easier” are pretty well known (see: any previous rant about Internet Explorer) so I’ll stick with the standards every time.

In conclusion: if you have the time, need and inclination I’d recommend giving PgSQL 8.3 a try.

Rails: Dump and reload data, unicode safe

Wednesday, February 6th, 2008

Behold my rake tasks to dump, and then reload, the contents of your database – all in highly compatible schema.rb and YAML formats. A mere rake dump_utf will create two files in /db/dump/ : firstly, an independent schema dump (doesn’t touch your proper one) and secondly a YAML file which is essentially a giant serialised hash of your DB. Running rake load_utf will import schema.rb and then all your data. And unlike every other script of this type I’ve seen around the net, it actually works, and is unicode safe.

Note that load_utf is extremely destructive and will write straight over your DB without asking further permission. However, if you haven’t run dump_utf it won’t find its files anyway, so not to worry.

Thanks to Tobias Luetke whose blog post was the starting point for this script, although there’s nothing left of it but the SQL Query now.

Needless to say, a great use of this tool is if you’re changing databases. Simply run dump_utf, modify database.yml to point to your new DB, then run load_utf – done.

Oh and I wouldn’t run it if your DB is too big, since it stores it all in memory. I may change that. And it doesn’t handle multiple databases either, I want to change that too ..

require 'ya2yaml'
 
task :dump_utf => :environment do
  sql  = "SELECT * FROM %s"
  skip_tables = ["schema_info"]
  dir = RAILS_ROOT + '/db/dump'
  FileUtils.mkdir_p(dir)
  FileUtils.chdir(dir)
 
  ActiveRecord::Base.establish_connection
 
  puts "Dumping Schema..."
 
  File.open("structure.rb", "w+") do |file|
    ActiveRecord::SchemaDumper.dump(ActiveRecord::Base.connection, file)
  end
 
  giant_hash = {} # we're gonna put EVERYTHING in here!
 
  (ActiveRecord::Base.connection.tables - skip_tables).each do |table_name|
    giant_hash[table_name] = ActiveRecord::Base.connection.select_all(sql % table_name) 
    puts "Reading #{table_name}..."
  end
  puts "Writing file..."
  File.open("backup.yml", 'w+') do |file|
    file.write giant_hash.ya2yaml
 end
 puts "Finished!"  
end
 
task :load_utf => :environment do
  dir = RAILS_ROOT + '/db/dump/'
  FileUtils.chdir(dir)
 
  puts "loading schema..."
 
  file = "structure.rb"
  load(file)
  puts "done! now loading data ..."
 
  content_file = YAML.load_file(dir + "backup.yml")
 
  content_file.keys.each do |table_name|
    print "loading #{table_name}"
    content_file[table_name].each do |record|
      ActiveRecord::Base.connection.execute "INSERT INTO #{table_name} (#{record.keys.join(",")}) VALUES (#{record.values.collect { |value| ActiveRecord::Base.connection.quote(value) }.join(",")})", 'Insert Record'
      print "."
    end
    puts
  end
  puts "Finished!"  
end

Force mongrel to delete stale pid files upon launch

Monday, February 4th, 2008

Mongrel, currently the mainstream Rails server, has a nasty habit of not deleting its PID files when it crashes and burns. This then occasionally stops automatic restarts because mongrel sees its old PID files, assumes it’s running (without a check) and refuses to start.

All that needs to be done is a startup check that the process mentioned in the PID file is actually running. This has been implemented in a patch here which I’ve integrated and successfully tested. If you’d like to use the file I generated, I’ve attached it below. Simply replace /path/to/gems/mongrel-1.1.3/bin/mongrel_rails with the attached file and run as normal. You’ll want to make a backup copy first, of course!
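The logic of the check is simple enough – roughly this (my paraphrase, not the actual code from the patch):

# is the process named in the PID file actually alive?
def stale_pid_file?(pid_file)
  return false unless File.exist?(pid_file)
  pid = File.read(pid_file).to_i
  Process.kill(0, pid) # signal 0 checks existence without touching the process
  false                # no exception raised: it's really running
rescue Errno::ESRCH
  true                 # no such process: the file is stale and safe to delete
end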

Also on the “things to do while hacking mongrel” list: change the fucking version string so the one echoed in your console log is actually up to date. The file is lib/mongrel/const.rb, line 68,

MONGREL_VERSION="1.1.3".freeze

which the maintainers obviously missed.

Mongrel is an aptly named bit of software, poorly maintained and with many long-standing patches which should have been included long ago. This particular patch fixes an annoying bug which shouldn’t exist any more – and harks back to March 2007, 10 months ago. That’s a pretty unforgivably long time not to include something as basic as this. And as far as I can tell the mongrel_rails program hasn’t changed significantly since last time – there’s no excuse for this, and other, patches to have not made it in by now. This is the third time I’ve manually patched it like this, and was only reminded to do so in the latest version (1.1.3) when I noticed – surprise! – it was again refusing to start with a stale PID.

I’d be more worried if there weren’t several promising alternative servers on the rise. Hopefully the Rails community’s dalliance with the very doglike mongrel won’t last too much longer…

mongrel_rails – full file

mongrel_stale_pid_file.patch – same thing in patch form. Run

$ cd /path/to/gems/mongrel-1.1.3/
$ patch -p0 < /path/to/mongrel_stale_pid_file.patch

Both from http://textsnippets.com/posts/show/931.

Testing (on MacOSX):

$ cd ~/Desktop
$ /usr/bin/ruby /usr/bin/mongrel_rails start -d -e development -p 3000 -a 127.0.0.1 -P /Users/sho/Desktop/mongrel.pid -c /rails/myapp
# mongrel starts ...
$ cp mongrel.pid another.pid
$ /usr/bin/ruby /usr/bin/mongrel_rails stop -P /Users/sho/Desktop/mongrel.pid
Sending TERM to Mongrel at PID 1605...Done.
$ cp another.pid mongrel.pid
$ /usr/bin/ruby /usr/bin/mongrel_rails start -d -e development -p 3000 -a 127.0.0.1 -P /Users/sho/Desktop/mongrel.pid -c /rails/myapp
** !!! PID file /Users/sho/Desktop/mongrel.pid exists, but is stale, and will be deleted so that this mongrel can run.
# mongrel starts as hoped .. success!

Timezones in core Rails

Sunday, January 27th, 2008

“Official” TZ support finally appears in edge Rails. Cue mixed feelings from anyone who has already implemented it themselves.

ActiveRecord::Base.find(:last) in Rails 2

Thursday, January 17th, 2008

I don’t know about you, but when I see find(:first) I also expect there to be a find(:last). This hack was working in Rails 1.x but broke in Rails 2. So I fixed it. Paste this into environment.rb or wherever:

module ActiveRecord
  class Base
    def self.find_with_last(*args)
      if args.first == :last
        options = args.extract_options!
        find_without_last(:first, options.merge(:order => "#{primary_key} DESC"))
      else
        find_without_last(*args)
      end
    end
 
    class << self # Needed because we are redefining a class method
      alias_method_chain :find, :last
    end    
  end
end

I wouldn’t rely on this in actual production code (for a variety of reasons) but it’s a useful convenience method for script/console, which is where I tended to want this functionality anyway.

>> MyTable.find(:all).length
=> 2076
>> MyTable.find(:first).id
=> 1
>> MyTable.find(:last).id
=> 2076
>> puts "1337"
1337
=> nil

Zed Shaw goes postal

Wednesday, January 2nd, 2008

Zed Shaw, the writer of the popular essential server software mongrel, goes absolutely fucking nutzoid insane on his weblog in a 6000-word essay on what he hates about, well, pretty much everything to do with Rails.

Jesus H Christ, he really gets into it. Now I’m a big fan of dissing DHH and everything, but man, Zed just takes it to a whole new level of bitterness. And there are some pretty scary disclosures, like when DHH admits to Shaw on IRC that he couldn’t keep 37signals’ flagship products up for more than 4 minutes at a time:

(15:11:12) DHH: before fastthread we had ~400 restarts/day
(15:11:22) DHH: now we have perhaps 10
(15:11:29) Zed S.: oh nice
(15:11:33) Zed S.: and that’s still fastcgi right?

Recommended reading, it’s a laugh riot. Couldn’t agree with him more on most of it re. the idiots in the community.

Never, ever use IronRuby

Wednesday, December 26th, 2007

So, you’ve heard of IronRuby? Microsoft’s implementation of Ruby on .NET. I’m hearing a bit of talk about it around the interwebs. Thinking of trying it out, eh?

Don’t. Ever.

Microsoft invented the technique we now refer to as Embrace, Extend, Extinguish. They tried it with Sun’s Java (the MS JVM). They’ve tried it with Kerberos, and even today every web developer in the world is frustrated, wasting time trying to get around MS’s attempts to poison HTML and CSS.

A leopard doesn’t change its spots. There is only one conceivable reason for Microsoft to implement another language on its own platform, and that’s that it sees that language as a threat and wants to kill it. First, make a really good implementation! Fast, easy, Visual Studio! Next, “improve” it – on Windows only, of course! Third, we’re all using MS Ruby++ and we’re fucked.

Run a million miles from IronRuby. If you assist in giving it traction, if you feed it any attention at all, you’re just making the world a worse place for all of us.

And one would be advised to think very, very carefully before adopting any technology – or, just as importantly, any *implementation* of a technology – that is not open source from top to bottom. GPL is best but MIT will do.

May MS never stain the history of computing again…

Rails not working yet on Ruby 1.9 trunk

Saturday, December 15th, 2007

For those entranced by these benchmark results and wanting to host Rails on Ruby 1.9 ASAP …

My testing shows Rails 2.0.1 failing on current svn/trunk installs of Ruby 1.9 on MacOSX 10.5.1.

But WEBrick works!

Ruby 1.9 build:

cd ~/src
svn co http://svn.ruby-lang.org/repos/ruby/trunk ruby-1.9
cd ruby-1.9
autoconf
./configure --prefix=/usr/local/ruby1.9
make
sudo make install
cd /usr/local/ruby1.9/bin/
./ruby -v
-> ruby 1.9.0 (2007-12-15 patchlevel 0) [i686-darwin9.1.0]

Rails 2.0.1 installation:

pwd
-> /usr/local/ruby1.9/bin
sudo ./gem install rails
-> Successfully installed actionpack-2.0.1
-> Successfully installed actionmailer-2.0.1
-> Successfully installed activeresource-2.0.1
-> Successfully installed rails-2.0.1
-> 4 gems installed
-> Installing ri documentation for actionpack-2.0.1...
-> Installing ri documentation for actionmailer-2.0.1...
-> Installing ri documentation for activeresource-2.0.1...
-> Installing RDoc documentation for actionpack-2.0.1...
-> Installing RDoc documentation for actionmailer-2.0.1...
-> Installing RDoc documentation for activeresource-2.0.1...
$

All installs nicely …

Attempting to run a Rails app (after installing a few more prerequisite gems using the above method):

$ cd /rails/my_1337_app/
$ /usr/local/ruby1.9/bin/ruby script/server
=> Booting WEBrick...
/usr/local/ruby1.9/lib/ruby/gems/1.9/gems/activerecord-2.0.1/lib/active_record/associations/association_proxy.rb:8: warning: undefining 'object_id' may cause serious problem
/usr/local/ruby1.9/lib/ruby/gems/1.9/gems/rails-2.0.1/lib/initializer.rb:224: warning: variable $KCODE is no longer effective; ignored
=> Rails application started on http://0.0.0.0:3000
=> Ctrl-C to shutdown server; call with --help for options
[2007-12-15 07:24:35] INFO  WEBrick 1.3.1
[2007-12-15 07:24:35] INFO  ruby 1.9.0 (2007-12-15) [i686-darwin9.1.0]
[2007-12-15 07:24:35] INFO  WEBrick::HTTPServer#start: pid=3386 port=3000
 
## I request http://0.0.0.0:3000/ ... 500 Internal Server Error
 
Error during failsafe response: can't convert Array into String
127.0.0.1 - - [15/Dec/2007:07:24:52 EST] "GET / HTTP/1.1" 500 60
- -> /

OK, it bombs out trying to actually process a request. But this error is really, really fast! I’m actually serious: the error *is* really fast.

Mongrel installation fails:

$ sudo ./gem install mongrel
Password:
Building native extensions.  This could take a while...
ERROR:  Error installing mongrel:
        ERROR: Failed to build gem native extension.
 
/usr/local/ruby1.9/bin/ruby extconf.rb install mongrel
creating Makefile
 
make
gcc -I. -I/usr/local/ruby1.9/include/ruby-1.9/i686-darwin9.1.0 -I/usr/local/ruby1.9/include/ruby-1.9 -I.  -fno-common -g -O2 -pipe -fno-common  -c fastthread.c
fastthread.c:13:20: error: intern.h: No such file or directory
fastthread.c:349: error: static declaration of ‘rb_mutex_locked_p’ follows non-static declaration
/usr/local/ruby1.9/include/ruby-1.9/ruby/intern.h:556: error: previous declaration of ‘rb_mutex_locked_p’ was here
fastthread.c:366: error: static declaration of ‘rb_mutex_try_lock’ follows non-static declaration
## etc etc etc

So WEBrick’s all we have for now.

You can track the state of edge Rails’ 1.9 readiness in this ticket on the Rails Trac. Plugins, though, will be another matter, although some fixes are pretty easy; an early failure I’ve seen is with plugins using File.exists?('file') which is of course deprecated in 1.9 in favour of the far, far superior File.exist?('file').
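If you want a quick head count of the offenders in your own app (assuming vendored plugins), something like this does it:

$ grep -rn "File.exists?" vendor/plugins/ | grep -v "\.svn"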

I like using the 1.9 console, though – it really does feel snappier, especially to load irb!

Migrations much improved in Rails 2

Friday, December 14th, 2007

I hadn’t been using migrations much in Rails 1.x. I just didn’t like the workflow – it was too clumsy, and I got annoyed at writing out the files. The workflow in 1.x was as such:

1.

script/generate migration MigrationName [--svn]

2. go and manually edit 00x_migration_name.rb in /db/migrate to reflect your desired database changes, both up and down:

class AddCreatedAtToStaff < ActiveRecord::Migration
  def self.up
    add_column :staff, :created_at, :datetime
  end
 
  def self.down
    remove_column :staff, :created_at
  end
end

3. rake db:migrate to apply the changes to the local database.
4. svn commit and cap deploy:migrations to apply the changes to the remote database

Too long – especially step 2. I knew I should do it in principle, but CocoaMySQL is right there – especially if you make several changes in the space of a few hours. In theory it’s best practise – but in actual practise someone as lazy (and impatient) as me tends to just do it directly in the DB, then eternally put off that day where they’ll finally “move to migrations”.

It’s not even the laziness – it’s the “flow”. I’m doing other things, I’ve realised I need this field. I don’t want to stop what I’m doing and write out a damn YAML file! I want the field there, right now. Too demanding? Maybe, but like I said, the DB is right there, and the difference between a 5 second edit and 2 minutes writing the file is a lot of lost concentration.

And various other solutions, such as auto_migrations, seemed good but in practise are too flaky and a dangerous, unsupported road to take.

Enter Rails 2.0, and migrations are far, far better. The core Rails principle of “convention over configuration” is in full effect here, with excellent results.

Now the process of adding a migration is as such:

1.

script/generate migration add_created_at_to_staff created_at:datetime [--svn]

Note the convention at work here. You’re implicitly telling Rails the table to use, in this case “staff”, and the field you’re adding – in this case one of Rails’ magic “created_at” fields. You then explicitly write the fields out, which you’d have to do anyway in CocoaMySQL or similar, or manually at the mysql command line. (The generated file is shown after this list.)

2. rake db:migrate to apply the changes to the local database.
3. svn commit and cap deploy:migrations to apply the changes to the remote database
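For the record, step 1 should write out essentially the same file we hand-edited back in the 1.x workflow, with no editing needed – something like:

class AddCreatedAtToStaff < ActiveRecord::Migration
  def self.up
    add_column :staff, :created_at, :datetime
  end

  def self.down
    remove_column :staff, :created_at
  end
end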

That’s only one step less, but it was the biggest and most annoying one. The actual creation of the migration file and addition to svn is now a single brief, easy-to-remember line. This is now terse enough and convenient enough to enter my personal development workflow, a very welcome improvement to the framework, and an excellent demonstration of Rails’ core principles in action.

Resolve conflict between jQuery and Prototype

Thursday, December 13th, 2007

I had a strange error where Prototype and jQuery were overwriting each other’s namespace. Please don’t ask me why I happen to using both – it’s just on one page, and it’s just until I work out how to duplicate my Prototype effort in JQ (I find JavaScript utterly incomprehensible and incredibly difficult to learn).

Anyway, I tried a number of tricks with loading order and including various bits of each. I couldn’t keep it working – either I had one or the other, never both. Then I read this page: Using jQuery with Other Libraries on the official jQuery developer pages. It recommends you do this:

   <script>
     jQuery.noConflict();
 
     // Use jQuery via jQuery(...)
     jQuery(document).ready(function(){
       jQuery("div").hide();
     });
 
     // Use Prototype with $(...), etc.
     $('someid').style.display = 'none';
   </script>

But I don’t have any Prototype script in HEAD. It’s all addressed through the default Rails includes. I didn’t want to start messing with that.

I implemented the above; it didn’t work, but the failure was different – now Prototype was trying to work, but erroring out with prototype TypeError: Value undefined (result of expression $) is not object.

Solution: On a lark, I removed jQuery.noConflict(); and renamed all jquery-targeted $(function) shortcuts to jQuery(function):

   <script>
     jQuery(document).ready(function(){
       jQuery("div").hide();
     });
   </script>
 
// in body: Ajax.new & co Rails commands

It’s a horrible nasty hack that gets around the horrible nasty problems from using two JS frameworks on the same page. Don’t copy this, I’m posting it only for your morbid curiosity. If you dare copy this horrible technique your Good Programmer’s Card is revoked. But it works!

Rails? Slow?

Tuesday, December 11th, 2007

Rails? Slow?

75 seconds is nothing! Why, when I was young, we’d count ourselves lucky if a web server responded within the day…

In its defence, I’d been doing other things before requesting again .. enough time for some of it to go into swap. But man, 75 seconds!

Rails 2 – Too Much Magic?

Tuesday, December 4th, 2007

So I’ve been watching the RailsCasts episodes going through the latest features in Rails 2. I’ve read about them before – but it’s good to see them demonstrated by an expert.

But man – some of this is too much. Don’t get me wrong, I like magic – if I didn’t like magical free stuff I wouldn’t be using Rails at all. But there comes a point where there’s enough magic, and more just trades off too much for too little – diminishing returns, as it’s called.

Take this screencast: Simplify Views with Rails 2.0. It shows very clearly what I’m talking about.

Now some of this is good shit. The ability to write this:

<% div_for(product) do %> # is this supposed to be <%= ?
<% end %>

And having it automatically generate div IDs and the like is great. That saves some nasty-looking code – assuming it works properly, stays unique across multiple includes, etc. Good shit. A worthwhile addition.

But this:

# from
<%= render :partial => 'product', :collection =>  %>
# to
<%= render :partial =>  %>
# and for a single case
<%= render :partial =>  %>

I think this is bad news. Supposedly Rails is going to look inside that instance variable, decide whether it’s an object or an array, and automatically use the correct partial, multiple times if it’s an array .. why? To save what, 15 keystrokes?

I would argue the “Rails 1.2″ way of doing this is about the most concise way to write this function imaginable while still maintaining decent at-a-glance understandability. It’s not so long and unwieldy, is it? And you can see what’s happening instantly – OK, now render a bunch of partials named ‘product’ using whatever was in this collection. The word “collection” is a nice touch which helps you remember there’s a number of them.

The second one? Its behaviour literally changes according to what’s in that instance variable. Is it really that much nicer to look at that it’s worth losing the information necessary to see, at a glance and unambiguously, what it’s doing?

There’s more, and worse: forms, where we go from this

<% form_for :product, :url => product_path(@product), :html => {:method => 'put'} do |f| %>

to

<% form_for @product do |f| %>

And if you’ve used the strict Rails conventions, mapped all your resources in routes.rb, are using the strict controller layout as demonstrated in the resources scaffold generator and have a picture of DHH as your desktop background, it’ll know what to do and work.

Now, at this point you might be saying “but you don’t have to use these features – it’s only if you follow the convention, which is the whole point of Rails! Follow the convention, get free benefits – it’s been like that since day 1″!

I agree with that, kind of. But there is such a thing as too much convention, and too many limits, and – this one’s my point – too much “magic”. I know it’s a bit silly to talk about “vendor lock-in” in the context of a Rails project, but following “The Rails Way” is beginning to smell strangely similar. And these extraordinary efforts to save a few keystrokes, ONCE, at the expense of readability (and no doubt speed!) are beginning to seem less like cleverness and more like Koolaid-driven ideology.

We’ve been told, again and again, that it’s all about “Beautiful Code”. And you know, I appreciate that up to a point. But when the code is getting less readable and more dependent on arbitrary chains of “convention dependence” which may or may not work (and which may or may not change in 3.0!!) then I start getting cold feet.

You know what? I’m working on *a* web site, not *the* web site these changes all seem designed for.

Predictions of a coming fork, anyone? “Rails Lite”?

Oh, and despite the endless hours which must have been poured into this kind of useless bloat, YAML handling in rails is still UTF8-unsafe.

UPDATE: Sure, you can just not use these features. But the codebase bloats, the execution speed is slower – and it’s not like it was fast to begin with. I argue these new features make the code more difficult to read and learn – hurting the Rails community as a whole, reducing numbers of developers, marginalising the framework – as someone with a considerable investment in that framework I am a stakeholder in Rails, and it’s from that position I comment.