Archive for June, 2008

Safari trick

Monday, June 30th, 2008

I just learnt a new trick in Safari I haven’t seen anywhere else. Double-click in the blank area of the “tabs” bar (i.e., to the right of any existing tabs) to spawn and select a new tab.

Short and sweet.

N+1, where n > 2,000,000

Monday, June 30th, 2008

I unthinkingly executed an N+1 database operation on a table with in excess of 2 million records, doing a lookup on another table for every .. single .. one.

So let’s see .. I estimate it will need to do about 100k writes. So that’s 2*2,000,000 reads, then 100,000 writes .. a mere 4,100,000 accesses in total. Even better, it’s MySQL, and I don’t have the C adapter for that installed – so every access does a round trip through the treacle-slow pure Ruby adapter.
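For anyone unfamiliar, the shape of the blunder is roughly this – a generic sketch with hypothetical model names, not my actual code:

# one query to load the whole table ...
Item.find(:all).each do |item|
  # ... then one extra lookup per row - the "+1", two million times over
  owner = Owner.find(item.owner_id)
  item.update_attribute(:owner_name, owner.name) if item.owner_name != owner.name
end

# the usual cure: eager-load the association in a single pass
# Item.find(:all, :include => :owner)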

Oops. If it’s still going when I wake up tomorrow I might cancel it!

UPDATE: It was actually done in maybe 6 hours (wasn’t paying close attention), which I was pretty impressed by. I guess it’s not that much – MySQL probably had the whole thing in memory after a while and as I said, it was mostly reads. Still, 200 or so queries/second – not bad.

More fun with ISO codes

Monday, June 30th, 2008

In the wake of the ISO-3166 split (CS to RS / ME) I was faced with the task of updating my “Countries” table when things randomly began failing. This brings back countless delightful memories of my original battle to construct a unified ISO-compliant countries/languages table from a while ago.

Well, that was easy enough – I didn’t bother with any re-imports, just added the new records by hand and all is well. But I thought I’d take the opportunity for some more fun by finally reforming my custom “Languages” table, in accordance with the upcoming updates to RFC4646 (I think – all the numbers are beginning to blend together). As long-time readers might remember, last time I tried to do this I gave up in frustration and wrote my own fricking language table.
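For the record, “added the new records by hand” amounted to nothing fancier than this – model and column names assumed, the codes themselves from the ISO 3166-1 update:

Country.create(:alpha2 => 'RS', :alpha3 => 'SRB', :numeric => 688, :name => 'Serbia')
Country.create(:alpha2 => 'ME', :alpha3 => 'MNE', :numeric => 499, :name => 'Montenegro')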

Well, it’s good to see ISO hasn’t given up its core principles of crazy, archaic DB structures and weird, breaking proclamations of what they want to call languages this year. Take for example the language of Chinese Mandarin (Simplified). Now what do you think the ISO code for that is?

Is it zh_cn? zhs? zh_zhs? zh_man? zho_man? zh_Hans? All of them good guesses, all of them wrong.

The new standard name for Simplified Mandarin is zh_cmn. CMN? I have never even heard of that one, and certainly never seen it used. A Google search for it gets a stunning three (3) hits. Are you even supposed to prefix it with the 639-1 alpha-2 any more? Oh, I get it – it’s supposed to be zh-cmn. That gets 2300 hits on Google, for a language with close to a billion speakers.

What the hell? This is the standard now? I wish they’d just make up their minds, and that IETF and ISO would actually cooperate.

Combine this with the ISO “macrolanguage” system and you have a recipe for ever more hours of fun. Needless to say, ISO delivers the macrolanguages mapping table in an unusable, world’s-worst-practice TSV file with no primary keys and incomprehensible, capital-letter-ridden column names. Even the recommended table names, which include hyphens, are invalid unquoted identifiers in every modern DB.
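If you’re stuck with the same file, a few lines of Ruby will beat it into importable shape. A minimal sketch, assuming the three tab-separated columns in my copy of the file (M_Id, I_Id, I_Status) and a sanely named target table:

File.open('macrolanguages.sql', 'w') do |out|
  File.readlines('iso-639-3-macrolanguages.tab').each_with_index do |line, i|
    next if i.zero? # skip the capital-letter-infested header row
    macro, lang, status = line.chomp.split("\t")
    out.puts "INSERT INTO macrolanguages (macro_code, lang_code, status) " +
             "VALUES ('#{macro}', '#{lang}', '#{status}');"
  end
end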

Oh, and ISO provides a downloadable DB for ISO-3166 1 and 2. It’s in MS Access format. Explains a lot.

So let’s sum up. We have ISO 3166 1, 2 and 3 .. and 4 and 5 and 6 .. which define country codes. We have ISO 639-n, which defines languages … and some more country codes. We have RFC4646, which defines how browsers should use country and language codes. We have the Unicode mappings. We have more ISO codes, this time numeric. We have Timezones, which are different again.
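For what it’s worth, here is how I understand the pieces are supposed to compose under the new scheme – my reading of the drafts, so treat it as an educated guess:

zh-cmn-Hans-CN
  zh   - primary language (ISO 639-1 macrolanguage: Chinese)
  cmn  - extended language subtag (ISO 639-3: Mandarin)
  Hans - script (ISO 15924: Simplified)
  CN   - region (ISO 3166-1 alpha-2: China)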

What a fucking mess. And by the way, I have NO idea what Australian English is even supposed to be in the new ISO639-3. I don’t see it anywhere here. I can tell you what Australian Aborigines Sign Language is, though – asw. That will come in handy for all those sign language websites I’m working on as we speak.

Ruby Versioning Is Completely Fucked Up

Monday, June 23rd, 2008

I have no idea what the Ruby guys are thinking with these version numbers.

They want to make a lot of changes, leading in to 2.0. Well, that’s fine. Great! Major version numbers are just the place to make sweeping changes, cleaning up a lot of stuff, progressing wherever they want. That is exactly the right thing to do.

Hang on .. they want to, uh, declare 1.9 their kind of “work branch” for 2.0. Uh .. right. What was wrong with using, say, a 2.0 beta branch? But .. hm, ok, well, it’s still OK, I guess, to make reasonable changes in a minor version .. kind of. Kind of negates the whole point of having the major version number in the first place, unless it’s just counting up in .1 increments, but whatever, they’re in charge. Personally I would have thought that 1.9 should be 1.8 on YARV, but whatever, it’s their ball game.

But 1.8.7? Incompatibilities from backported features from 1.9 – which is itself a kind of “future backport” of 2.0? What the fuck? Why the fucking hell are they backporting shit from 1.9 into 1.8 when it is not completely compatible?! Do they even understand the point of these version numbers?!

How Version Numbers Should Be Used

A Ruby version number: [major].[minor].[tiny]-p[patchlevel]

Major version number: do whatever the hell you want! Got breaking stuff and design changes? Here’s the place to do it!
Minor version number: For big new features and major changes, eg YARV. Generally backwards compatible, but .. well .. shouldn’t break stuff if you can help it but if you REALLY need to, OK .. kind of.
Tiny version number: for tweaks, fixes and optimisations only; ok to add (small) new features, should NOT break anything backwards
Patchlevel: absolutely should fucking not break anything backwards and, to be perfectly honest, should not even exist; the fact that it does is a sign of bad release practices

They should not have made major (read: incompatible) changes in 1.8.7. They absolutely should NOT under any fucking circumstances have made incompatible changes in 1.8.6-p230!! What the hell is going on?
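To make it concrete: here is the kind of defensive nonsense (my sketch, nothing official) that application authors are reduced to writing once patchlevels can change behaviour:

# you should never have to inspect RUBY_PATCHLEVEL, yet here we are
if RUBY_VERSION == '1.8.6' && RUBY_PATCHLEVEL.to_i >= 230
  warn 'this 1.8.6 may not behave like every other 1.8.6'
end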

The Ruby team need to basically shift all their versioning one decimal point to the left. They are putting shit in patchlevel which should be in tiny – and there are WAY too many patchlevels. They should wait a bit and then roll up changes from trunk every few months, only levelling up when they’ve got something important or urgent. As it stands they’ve been going through one every few days, it’s ridiculous. May as well just use the checkin number. They are making changes in tiny which should be in minor. And for 1.9 they are using minor when they should be in major. Why the hell are they acting like this? It’s not like new numbers cost anything?!

As much as it pains me to say it, Ruby could learn from Rails here. Ditch the fucking patchlevel bullshit. 1.8.6 should equal 1.8.6, no questions asked, no possibility for difference, no fucking patchlevels or anything else. If you need a new version, release 1.8.7 – which is always backwards compatible. *Never* break backwards compatibility in a minor branch. And if you need more than 10 tiny versions in a minor branch, then just keep going – if 10.4.11 is good enough for Apple, then 1.8.11 is good enough for Ruby.

This security update should be 1.8.7. And what about the current 1.8.7? It should not even exist! What is the purpose of 1.8.7? To get us ready for 1.9? NO! 1.8.7 should be a straight progression, bugfixes and security updates for 1.8.6! The current 1.8.7 should be 1.9, which is getting us ready for 2.0!

The other day, I needed to install 1.8.6 on a new server. I installed the latest tarball of 1.8.6 from ruby-lang.org, which was patchlevel 114. They have skipped over a hundred patchlevels in the last couple of days and now insist 230 is the minimum. Hint: if over a hundred of these patchlevels were not important enough to be bundled into the downloadable tarball, maybe they should not be patchlevels at all. Go look at ftp.ruby-lang.org. Look at the tarballs in the 1.8 branch. There is *no* tarball between 114 and 230. So, uh, why isn’t p114 called 1.8.7 and p230 called 1.8.8? Or 1.8.9, or whatever? What is the point of a patchlevel you can’t even download a tarball of?

Ruby versioning is completely fucked up.

Ruby compatibility debacle(s)

Sunday, June 22nd, 2008

Some people just can’t stop achieving! Not content with introducing breaking, backwards-incompatible changes into the Ruby 1.8 branch with 1.8.7, the Ruby Core team have shot for the moon, gone for gold, and doubled down on all their bets – and made Ruby 1.8.6-p230 incompatible with Ruby 1.8.6-p(any other version). Of course, there is an URGENT!!!111ONE zero-day exploit or something in any version of 1.8.6 that’s not the one released today – the incompatible one.

So let’s sum this up:

Ruby 1.9.0 = incompatible with Ruby 1.8.x
Ruby 1.8.7 = incompatible with Ruby 1.8.6
Ruby 1.8.6-p230 = incompatible with Ruby 1.8.6-p(<230)
Ruby 1.8.6-p(<230) = UR SERVER IZ OWNED LOL

Well, that is just fucking great. I guess if your app won’t start, no one can hack it! Can’t get more secure than that! Mission accomplished!

It seems like this Ruby Spec project is arriving not a fucking moment too soon. When the core team can’t even keep Ruby 1.8.6 compatible with, uh, Ruby 1.8.6, we got problems.

France first to start seriously blocking net

Thursday, June 19th, 2008

Looks like France will be the first large first-world country to start seriously censoring the ‘net. What a depressing day that will be. Funny how I’m not the least bit concerned about terrorism – not the slightest little bit – but I am plenty worried about governments starting to filter the net at the country level. Time to open a VPN business?

  def pick_bogeyman
    bogeymen = [ 'terrorists' ]
    bogeymen << 'drug dealers'
    bogeymen << 'the mafia'
    bogeymen << 'child pornography'
    bogeymen.at(rand(bogeymen.size))
  end

Firefox 3 is better than Safari

Thursday, June 19th, 2008

The new Firefox 3 is superior to Safari in pretty much every way you care to name. It’s faster to load pages, more responsive, less prone to beachballs and more stable. Even better, it has a well-implemented plugins system which is invaluable for improving the web experience and development.

About the only criticism I can make is that it’s uglier – I do prefer Safari’s look, which is more elegant and restrained. However, that in itself isn’t enough to make me put up with Safari’s slowness.

I also can’t understand why Mozilla don’t use Apple’s built-in localisation system; all Firefox downloads are locked to one language, forcing any users like myself who switch between accounts in different languages to maintain two different binaries – the only use I’ve ever found for the ~/Applications folder. Still, that’s not such a big deal.

Another feature from Safari I miss is the page size buttons you can place near the address bar, which I use constantly. I also generally prefer Safari’s “bookmark bar” and use it for pretty much all my bookmarks. (edit – you can now do this in FF3 too).

On the other hand, it is unfathomable that Apple has never implemented a decent plug-in architecture for Safari beyond media handlers. All current plugins fall under the ignominious category of “unsupported hacks” and are not advisable to use. This is a great pity, since plugins can be truly wonderful – the experience of browsing the internet is improved so much by the Adblock plugin, for example, that it is almost reason enough in itself to switch.

I have heard that the forthcoming Safari 4 is similarly much improved in terms of speed and responsiveness – not to mention implementation of new features in HTML and CSS. Let’s hope the release of FF3 stimulates Apple to get it out the door as soon as possible – and not pull their usual trick of tying it to an OS release, the nearest candidate being at least 6 months away and more likely 12.

I still use Safari, but I’m launching FF more and more – usually prompted by the latest Safari beachball or crash. I generally try to use the standard OS software where possible, but when something else is so clearly superior it’s hard to justify continuing with the incumbent – I certainly don’t understand why so many Windows users stick with IE, for example. The fact that I’ve delegated all my RSS needs to Mail.app further decreases any lock-in to Safari.

Sorry Lonely

Thursday, June 19th, 2008

My feelings about the KCO album O-Crazy Luv changed dramatically, and now I love it. Favourite track of the moment is Sorry Lonely but most of them are good. Anyway that’s my “song of the day”.

What a sleeper of an album. I hated it at first – especially after listening to “mobile emotion” first, and only the first part .. which is admittedly pretty hard to swallow even now that I like the rest of the album. But it really grew on me, and now seems basically like an unofficial globe album, minus the French rapper.

And I managed to find the lyrics – after the break. The entire album’s lyrics are here.

ActiveRecord to DataMapper/CouchDB class header script

Thursday, June 12th, 2008

I hacked together this script, which dumps the structure of your database into a DataMapper-compatible class header file. You’ll need these for your models if you’re planning on playing around with DataMapper – should save you a bit of typing.

There’s a bit of CouchDB boilerplate in there; feel free to strip that out if you’re not using it. The large sql_type if/elsif chain just covers what I had, which was mostly Postgres, and will need modification for MySQL – but it will tell you what you need to change, and should be obvious enough. Note that I am coercing pretty much everything to basic Integer/String – this is for CouchDB, but you can change it to match the types you want. And yes, I know it’s ugly!

# datamapper_dump.rake
# Sho Fukamachi 2008
# place in lib/tasks and run with: rake db_to_dm
 
def tables_we_want
  skip_tables = ["schema_info", "sessions"] # this is slow enough as is without sessions
  ActiveRecord::Base.establish_connection
  #ActiveRecord::Base.connection.schema_search_path = "path1, path2" - uncomment for PGSQL
  ActiveRecord::Base.connection.tables - skip_tables
end
 
task :db_to_dm => :environment do
 
  sql  = "SELECT * FROM %s"
  dir = RAILS_ROOT + '/db/dm_temp/'
  FileUtils.mkdir_p(dir)
  FileUtils.chdir(dir)
 
  ActiveRecord::Base.establish_connection
  #ActiveRecord::Base.connection.schema_search_path = "path1, path2" - uncomment for PGSQL
 
  puts "Dumping Schema into DM format..."
 
  File.open("models.rb", "w+") do |file|
    file.write "# Models dump File for DataMapper/CouchDB\n"
    file.write "# Sho Fukamachi 2008\n"
    file.write "\n"
    tables_we_want.each do |table_name|
    class_header = <<-EOF
class #{table_name.singularize.capitalize}
 
  include DataMapper::Resource
 
  def self.default_repository_name
    :couchdb
  end
 
  # required for CouchDB
  property :id, String, :key => true, :field => :_id
  property :rev, String, :field => :_rev
 
  # regular properties
EOF
    file.write class_header
    ActiveRecord::Base.connection.columns(table_name).each do |c|
      if !c.sql_type.scan('integer').empty?
        file.write "  property :#{c.name}, Integer\n"
      elsif !c.sql_type.scan('character').empty?
        file.write "  property :#{c.name}, String\n"
      elsif !c.sql_type.scan('text').empty?
        file.write "  property :#{c.name}, Text\n"
      elsif !c.sql_type.scan('datetime').empty?
        file.write "  property :#{c.name}, DateTime\n"
      elsif !c.sql_type.scan('uuid').empty?
        file.write "  property :#{c.name}, String\n"
      elsif !c.sql_type.scan('boolean').empty?
        file.write "  property :#{c.name}, Integer\n"
      elsif !c.sql_type.scan('double precision').empty?
        file.write "  property :#{c.name}, Integer\n"
      elsif !c.sql_type.scan('bigint').empty?
        file.write "  property :#{c.name}, Integer\n"
      elsif !c.sql_type.scan('smallint').empty?
        file.write "  property :#{c.name}, Integer\n"
      elsif !c.sql_type.scan('date').empty?
        file.write "  property :#{c.name}, Date\n"
      elsif !c.sql_type.scan('timestamp without time zone').empty?
        file.write "  property :#{c.name}, DateTime\n"
      elsif !c.sql_type.scan('time without time zone').empty?
        file.write "  property :#{c.name}, Time\n"
      elsif !c.sql_type.scan('bytea').empty?
        puts "INVALID: column #{c.name} in #{table_name.capitalize}: binary data is not allowed here."
        puts 'You will need to rethink your schema for use in CouchDB. Hint: Attachments.'
        puts 'SKIPPING!'
      else
        file.write "COLUMN: #{c.name}, UNKNOWN DATA TYPE #{c.sql_type} - CHANGE ME!\n"
      end
    end
    file.write "end\n"
    file.write "\n"
    end
  end
  puts "dump succeeded. Look in db/dm_temp/models.rb"
end

Faking 10.4U SDK for MacPorts

Thursday, June 12th, 2008

Since 10.5.3, a lot of software installed with MacPorts has broken. One way to get it working again can be to reinstall with the +universal variant, e.g.:

sudo port install erlang +universal

Unfortunately if, like me, you unchecked the 10.4U SDK option when you installed the Developer Tools on 10.5 (because you didn’t think you needed it), you might see something like this:

$ sudo port install erlang +universal
Error: Error executing universal: MacOS X 10.4 universal SDK is not installed (are we running on 10.3? did you forget to install it?) and building with +universal will very likely fail
Error: Unable to open port: Error evaluating variants

The good news is you don’t need to go find your disk, you don’t actually need the 10.4 SDK if you have the 10.5 one installed .. and you might not even need that. But it’s hardwired to check for it, so fake it out like this:

$ sudo touch /Developer/SDKs/MacOSX10.4u.sdk

Now it will work.

$ sudo port install erlang +universal
--->  Fetching erlang
--->  Verifying checksum(s) for erlang
--->  Extracting erlang
--->  Applying patches to erlang
--->  Configuring erlang
--->  Building erlang with target all
--->  Staging erlang into destroot
--->  Installing erlang R12B-2_1+universal
--->  Activating erlang R12B-2_1+universal
Error: Target org.macports.activate returned: Image error: Another version of this port (erlang @R12B-2_0) is already active.
Error: Status 1 encountered during processing.
$ sudo port deactivate erlang @R12B-2_0
Password:
--->  Deactivating erlang R12B-2_0
$ sudo port activate erlang @R12B-2_1+universal
--->  Activating erlang R12B-2_1+universal

God knows why it wants +universal but hey, works.

Rails sessions breaking in Ruby 1.8.7

Wednesday, June 11th, 2008

Installed the new “stable, recommended” version of Ruby linked from ruby-lang.org – 1.8.7 – and now finding your Rails sessions breaking?

You’ll get an error like this:

/!\ FAILSAFE /!\  Tue Jun 10 15:09:34 -0500 2008
  Status: 500 Internal Server Error
  wrong number of arguments (2 for 1)
    /usr/local/lib/ruby/1.8/cgi/session.rb:267:in `respond_to?'

This will be fixed in Ruby 1.8.7-2 but for now install this patch and restart the server.

I also had some enumerator errors; post a comment if you want a fix for those too.

Downtime, Upgrade

Wednesday, June 11th, 2008

Apologies for the downtime – I took it upon myself to upgrade this server to RHEL5.

The upgrade went pretty smoothly, taking about 6 hours – although that includes a lot of time copying files around. Some sites on this server aren’t back up yet, as I’m taking the opportunity to modernise a few bits of infrastructure here and there. Also, the media folder is completely missing for now – that will take a long time to copy back from my pathetic DSL.

My main reason for upgrading was difficulties I’d had installing various pieces of software – mostly pretty new stuff. Anyone who’s glanced at the CouchDB mailing list in the last few months will likely remember (with a groan) my epic struggle to get the thing working reliably on RHEL4 – that was a lot of hours splashed to the winds there. I’m delighted to be able to report that installation on this clean, modern server went without a hitch. I consider the time it’s taken me to reinstall “paid for” with that development alone.

Other changes include moving to nginx for Rails hosting, decommissioning SVN once and for all, moving to a new and more logical folder structure, revamping users and permissions, a coming public git repository, and other things I’d been wanting to do for a while but didn’t get round to because of the previous mess.

I’ll be renewing my efforts to hew strictly to RPM-installed packages on this OS. Some source installations are inevitable – Ruby and Erlang, for example, are quite old on the default system/package repo so they have to be installed from source. But everything else I’ll be endeavouring to keep as clean as possible for as long as possible – I want absolutely no repeat of the fucking debacle the js/spidermonkey CouchDB prerequisite install turned into.

In-place upgrades are not my usual cup of tea – I’ve made an exception this time because the hardware is still pretty decent, there were no immediately obvious gains in value to be had upgrading that, and I literally couldn’t do what I wanted to with the previous OS. A hardware upgrade is on the cards, however, and in anticipation of that I’ve kept my most detailed install notes ever – literally every command. What took me 6 or so hours (and counting) should take 1 or 2 next time.

Onwards and upwards, then ..

The revolution is not over, it hasn’t even begun

Monday, June 9th, 2008

Note: This post was originally a comment on this article at Information Architects Japan. Unforgivably, their website ate my comment – all the more unforgivable given said company’s supposed expertise in website design. Luckily, suspecting incompetence, I copied it before posting, and now present it below in an edited form.


THE REVOLUTION IS NOT OVER

“We tend to overestimate the effect of a technology in the short run and underestimate the effect in the long run.” – Roy Amara (1925-2007)

This is a point-by-point refutation of this article. Text from his article is quoted (prefixed with “>”), my response follows.

> The IT-Revolution promised to free and enrich us. To free us from propaganda, to free us from mindless TV, to free us from advertisement torture, and to enrich us by letting machines do all the boring work so we’d have more free time. So, how did it go?

It “promised” no such thing, and is not even a single entity. Human pundits, motivated by greed or the need to provide inspiring sound-bites, may have promised something similar, but the IT “revolution” carries no inherent promise. It may well hold the potential to enable some or all of those points, however, as it develops. “How did it go?” is laughably premature, a theme I will expand on below.

Good Internet Revolution

> 1. We read and write more today than we used to.

We? Who is “we”? I do, yes, and the author of the article probably does too. But the vast majority of users do not – and if they do, it is mostly low-quality “chat”, which if anything is a replacement for the telephone calls they might have made before having net access.

> 2. The public opening of digital publication technology (AKA “blogs”) has provided a free speech transport with rocket engines.

Blogs are nothing but home pages with a bit of automation. They remain laughably primitive for the most part, hopelessly inflexible, and mostly look alike (this one included).

But “free speech”? What is that supposed to mean? Zero cost? Indeed, blogs have provided many with the means to say what they think for little or no cost in money or effort, although the vast majority are deservedly obscure. If the meaning was free as in freedom, however, blogs don’t add much to that.

> 3. News has become more accessible and more transparent.

News about what? Accessible how? Transparency?

Most “news” about the world continues to be sourced from traditional providers, for the simple reason that it costs a lot to provide. Almost all areas in which blogs or other personal publishing efforts could be said to have made a difference are areas in which “news” is easy to gather – be it filtering output from mainstream publications or “insider” reports from a specific industry.

Accessible? Reliable or even well-written news is almost impossible to find outside of traditional sources, single-issue blogs/aggregator services notwithstanding. Traditional news sources have indeed become easier and cheaper to access, however.

I cannot see how the advent of the internet has had any effect on the “transparency” of news providers. Indeed, if anything, there are fewer rules.

> 4. The Internet is the taser against the shit bags that try to manipulate, embellish, and block information that is inconvenient to them.

Is it? How? Sure, if someone tries to cover something up, say, a politician, there’ll be a few blogs who report the truth, and word might get out. But only for the people who are specifically looking, and half the time it’s untruth that makes it into the blog echo chamber.

And traditional news sources have always performed this function anyway. How are internet sources any better?

We have Wikileaks and Mac rumour sites. People who go to them are already searching for the truth. There were publications before the net which performed similar functions .. and most “exposés” of any quality remain the work of traditional outlets.

I agree, though, that sites like Wikileaks and “anonymous tips” are facilitated by the internet and may indeed turn out to be very valuable as the technology and public awareness develop. I’m personally very interested in this phenomenon.

> 5. We can now literally X-ray politicians before we vote for them.

Uh, that word. I do not think it means what you think it means.

However, it’s wrong anyway. I am not aware of any existing site which presents information on politicians and their history in anything like the advanced format I would want.

Bad Internet Revolution

> 1. But now we have more junk data and less free time.
> 2. We have more tasks in our inbox and less concentration to complete them.
> 3. We are bombarded with even more idiotic advertisement (spam)
> 4. We write (short notices), and read (bee-beep-bee-beep) even more fast-food data.
> 5. The corporations have become even more shrewd (via viral campaigns, paid comments, and “Social Media Consultants”).
> 6. We outperform each other blogging, twittering, tumblering, and Facebooking.
> 7. We mindlessly link our friends to the dumb-ass websites where spammers and stalkers, grudgers and psychos, and old, finally-forgotten nags – dressed and masked as virtual vampires – wait behind some wonderwall.

Is it just me, or is almost every single one of those points a matter of personal choice and time management on the part of the user? And furthermore, aren’t the sites you mentioned all startups?

A user’s management of their time and attention is up to them, as it has always been. You complain about being overloaded with data – but how is that data even getting to you, if you didn’t request it in the first place? The internet is pull, not push. If you don’t like it, stop requesting it!

> To be clear: We are not free. “Fast-food data junkies,” that’s what we are. How did we get into this mess?

Whether you are “free” or not is completely irrelevant. If you choose to be a data junkie, then a data junkie you will be – and can join the TV junkies, talkback radio junkies, and all the other people who can’t manage their time properly. What are you expecting – someone to stop you?

I don’t have a facebook account, I don’t use my twitter account, I don’t have a “tumblelog”, the reason being I can’t see the point of any of them. I do have a blog, which I post on when I feel like it – and if I think I’m writing on it too much, I – get this – stop.

If you find yourself wasting too much time on frivolous services, stop wasting so much time on those frivolous services! What’s the problem?!

> Revolutions are vicious circles. Remember, after the French got rid of their sleepy King Louis XVI, they installed the radical Robespierre, followed by the brutal tyrant Napoleon Bonaparte I. Now what happened to us exactly?

It seems what happened to “us” is that we started writing facially ridiculous analogies to French history on our blogs?

Seriously, that analogy is so off the mark it’s not even wrong.

> Ironically, people still believe that the Internet belongs to them, some journalists behind the times even complain about “the mob reigning the web.”

No, what’s ironic is that you’re posting that on your own website, and I’m replying on mine.

> Truth is, the World Wide Web is in the hands of a few Emperors – namely Google, Yahoo! and Microsoft – that split the territory amongst themselves quite some time ago.

Well, I can’t deny that Google holds an extremely powerful position at the search gateway. However, the relevance of MS and Yahoo has been fading for a long time.

But “split the territory up”? Huh? They compete on plenty of common ground. But the internet is not “territory”, unless you’re talking about IP and domain allocation, and I doubt the “big three” combined hold even a thousandth of that.

> Nowadays, building up a web service and making money outside the Territory of the Three Web Caesars is considerably more difficult than just starting a “real” shop.

A ridiculous statement with literally millions of counterexamples.

Large organisations have extremely powerful momentum and network effects, both on and offline. You say it would be easier to start a real shop – what are you smoking? The type of success you’re judging all those startups by is the retail-world equivalent of, say, toppling Mitsukoshi or Family Mart. Good luck putting them out of business with your hole-in-the-wall shop anytime soon.

Start Up Fata Morgana

> If you look at the success story of startups that made it (like Youtube for instance), you’ll realize that the dream of the cool website, that simply offers good information while finding users and making money, is a Fata Morgana that drives thousands of young enthusiasts into death of thirst. You need connections and loads of money to make it in the world of the three titans.

The success of Youtube proves that making a new website is a bad idea? Yeah, USD$1.65 billion worth of bad idea.

There are plenty of “startup” websites, and most of them do fail. But if a website is actually cool, and does actually have good information, then usually I would say they succeed, eventually, although how much money they make can be a volume game.

What’s an example of a cool website with good information that’s gone under? Chances are it wasn’t cool at all, the information was not good, most likely both, and was rejected for those reasons.

Having connections and money helps, of course, as it does in any field you care to name. I would actually say that on the internet the playing field is dramatically levelled such that anyone who really does have a good site, and a steady stream of actual good information, will almost certainly be discovered and become popular. The problem is that most sites fail dismally to provide either.

Good information, or “content” as they call it, is, and always has been, expensive to produce. Expensive for you to write yourself in terms of time and opportunity cost, or expensive for you to compensate others to do for you – good writers are very expensive. What did you expect? That the internet would provide that content to you for free?

> We know one thing for sure. The Revolution is over; the people have nothing to say in the Napoleonic era of the web.

Such a ludicrous statement I can’t even be bothered mocking it.

> So what did the Revolution bring us in the end?

Uh, your website upon which you’re writing this? Your audience? Your company’s entire raison d’être?

[cut more about the French revolution]

> We the people now have the option to become data connessieurs (we didn’t have that before the IT revolution!). The offer of delicious information nowadays is huge indeed. All one must do is choose. And that means reduce: trim your E-mail accounts down to one. Chop the Facebook annoyance. Peel your Linked-in account. Fry your Twitter profile. Freeze your cellphone. Bon appétit!

Right, so it seems you did know the solution to your time management problem after all. If some service is costing you more in time and attention than the value you’re getting from it, then stop using it. It’s not rocket science.

But I think the subtext here is something different – “the internet has failed, we should stop using it” – and that’s rubbish.

The revolution of a common data network has not even properly begun. Common protocols are only just being invented, let alone used. Almost no appliances or other objects have any connectivity. The vast majority of discussion is by instant message or primitive forums. There are no good newsfinder services, although recent inroads show promise. There are no common clearinghouses for news, no common geographic systems, no easy way to interface with the ones that exist. Video and even audio streaming remains an expensive niche. Collaboration software is in its infancy, but early versions show promise (wikipedia). There is no common tagging or ISBN-style library system for internet works. There is no easy way to use a remote filesystem from a desktop computer, or act as a server. There are very few standard data formats. There is no easy way to find ad hoc statistics, or transform the data even if you can find it. There are no good automatic translation systems. I could go on, and on, and on.

And you say the revolution is over?! I’m not even going to say it’s the end of the beginning! It’s still the beginning of the beginning! It’s the foreword!

The main thing I detect in this essay is disillusionment and/or sour grapes, which seems a little strange coming from someone whose line of work is supposed to be internet-related consulting. There are plenty of people working to make the internet better – if you want to overthrow the corporations, why not help them instead of declaring a fait accompli defeat?

Anyone can whine uselessly about the state of things, maybe because it’s hardly any work and there’s no risk besides your reputation – and you’ll always find some fellow downtrodden battlers to chime in with sympathy. But it’s the wrong path to take in the long run.

I don’t see any grand experiments the author has tried and failed at. What have Information Architects done to help progress the state of information online? This site? All I see is a lightly modified WordPress install. So the perfect information architecture, according to a company selling “information architecture” expertise, is, uh, WordPress. Right.

Not trying to criticise personally here but if you want to declare defeat you need to at least be in the fight, and I don’t see any evidence of that. If you’ve got some war stories, tell them. If you’ve got some theories on how to build better data structures – I’m all ears. But this defeatist junk? Keep it to yourself please, lest you lose the few remaining visitors you still have from when you wrote interesting articles worth considering.

Oh, and one more thing. On your front page right now? EIGHT articles about your web trend map – a superficially interesting map of all those corporate properties you’re railing against here, which you have made way too big a deal of – one effort-free, superficial “remix” of (corporate) Apple’s iPhone, and this article. Is that your recipe for global success?

My blog sucks but it’s got a lot more about Information Architecture on its front page right now than Information Architects Japan.

UPDATE:

The cowards can’t even bring themselves to unmoderate my comment. This from an “internet company”. They deleted my first (highly critical, copied above) comment outright – and then held my comment complaining about that in moderation?

What kind of 12-year-old is running this company?

awaiting moderation

Let me quote some article I read recently:

> 4. The Internet is the taser against the shit bags that try to manipulate, embellish, and block information that is inconvenient to them.

Who’s the taser, and who’s the shit bag blocking inconvenient information, now?

Geohashing using matrices in Ruby

Monday, June 9th, 2008

After reading this article about geohashing, I thought I’d share an alternative implementation I have which uses a matrix to translate coordinates to and from a string.

People are generally scared of geolocation, but that’s because they’re overthinking it. Forget all that stuff about mapping onto a sphere – a set of coordinates is just a point on a 2D plane with dimensions 360*360. The fact that that plane is then mapped onto a sphere (well, a spheroid) is something that 99.9% of developers don’t have to worry about. All you care about most of the time is whether location X is near location Y.

So, since we have these vectors on a plane of known bounds, if we want to represent some coordinates as a single string, all we have to do is map them onto a very simple matrix.

Happily, our plane is of dimensions 360*360 – and we have 36 unique characters to play with! So what we’ll do is divide the plane into squares, then successively divide out the coordinates as we increase resolution on the plane. That way we reduce the vector to a string, keep the resolution, and retain the ability to compare with other nearby locations. We can also, of course, extract the coordinates back out again later.

Encoding

Let’s start with the point 39.286534 -76.613558, like the article linked above.

point = [39.286534, -76.613558]

First, we need to normalise the numbers. Add 180 to both:

point.map! {|e| e + 180}
=> [219.286534, 103.386442]

Now, our matrix. We have 36 characters in base36, so that is 36 squares we can have, 6 on each side. So we will start by cutting the numbers at 60-degree intervals.

First we need a matrix. We are just going to reuse this again and again, so we only need one. Dimensions 6*6 for a total of 36 squares.

require 'matrix'
MATRIX = Matrix.rows([
  ["0", "1", "2", "3", "4", "5"],
  ["6", "7", "8", "9", "A", "B"],
  ["C", "D", "E", "F", "G", "H"],
  ["I", "J", "K", "L", "M", "N"],
  ["O", "P", "Q", "R", "S", "T"],
  ["U", "V", "W", "X", "Y", "Z"]
])

I had some code that generated that, but I can’t find it .. so just declare it manually. You’ll need to require ‘matrix’, and I’ve made it a constant so the methods we define later can see it. If we were really serious we would do things a little differently, but we’re focussing on simplicity here and don’t want to start messing around with origins, etc.

>> MATRIX[0,0]
=> "0"
>> MATRIX[0,1]
=> "1"
>> MATRIX[1,1]
=> "7"
>> MATRIX[2,1]
=> "D"
>> MATRIX[5,5]
=> "Z"

looking good.

First slice

So! The first cut.

slice = 60.0
first = point.map {|e| (e / slice).to_i}
=> [3, 1]

Note we will cast to integer to get rid of the remainder, which we don’t care about. All we want is to find the first digit.

which is …

code = MATRIX[first[0], first[1]]
=> "J"

Hooray! Our first character. Note that Ruby’s matrix class is a bit strange and won’t accept the coordinates as a single array – you need to give two separate arguments .. not very nice, but that’s the way it is.

Now, onto our next step .. but we’re forgetting something! Can you see what it is?

We need to reduce the coordinates in preparation for the next cut. But how to do that without affecting the digits afterwards?

Easy – with modulo!


point
=> [219.286534, 103.386442]
point.map! {|e| e % slice}
=> [39.286534, 43.386442]

Note the decimals are unchanged. We’ve simply removed the multiples of 60 we extracted for the first character of the code.

2nd slice

We have now placed our coordinates inside a 60*60 square. We have a 6*6 matrix to describe the next step. So, slice size is .. 10 degrees.

slice = 10.0
 
# wrap the repeated steps in a lambda so it can see our local variables
process = lambda do
  result = point.map {|e| (e / slice).to_i}
  code << MATRIX[result[0], result[1]]
  point.map! {|e| e % slice}
end
process.call
 
code
=> "JM"
point
=> [9.286534, 3.386442]

3rd slice

We are now in a 10*10 box, and are faced with a choice. Shall we continue with the 6-themed divisions, or go to 5? It’s up to you. If you choose 6, slicing will be easier, but the degree cuts will be horrible decimals. If you go to 5, your slices become worth more after the 2nd character and your slice code will be more complex .. but the resolution of the slices will look nicer.

I’m going to go with 6 anyway.

slice = slice / 6
=> 1.66666666666667
process.call
 
code
=> "JMW"
point
=> [0.953200666666684, 0.0531086666666689]

Hm. We have another problem. Our divisions are adding significant digits to our points. That’s bad, since if we continue our result will be more accurate – wrongly – than the data we fed in. In fact, it will be infinitely accurate, erroneously!

We need to measure the significant digits of our input points.

back to the original:

point = [219.286534, 103.386442]
 
sig_digits = point.map {|e| e.abs.to_s.sub(/^.{1,3}\W/, '').size}.sort.last
=> 6
limit = 1.0 / 10 ** sig_digits

Hm. That’s not quite right, but will do for now. We will terminate when the slice size goes below that.

It’s getting tiresome typing all this out. Let’s combine it into a function:

def geocode(point_array)
  sig_digits = point_array.map {|e| e.abs.to_s.sub(/^.{1,3}\W/, '').size}.sort.last
  limit = 1.0 / 10 ** sig_digits
  slice = 360.0
  code = ''
  point = point_array.map {|e| e + 180} # copy, so we don't clobber the caller's array
  while slice > limit
    slice = slice / 6
    result = point.map {|e| (e / slice).to_i}
    code << MATRIX[result[0], result[1]] if !result.include?(nil)
    point.map! {|e| e % slice}
  end
  return code
end
 
geocode([39.286534, -76.613558])
=> "JMWIDIN7AM1"

Aww. Our first geocode. How about trying something with less resolution?

geocode([39.2, -76.6])
=> "JMWI1"

Cool. As you can see, we keep the same code for as many characters as we have resolution for.

Decoding

Now how about extraction?

I’ll just paste it in:

def extract(geocode)
  sig_digits = geocode.length
  slice = 360.0
  point = [0.0, 0.0]
  working = geocode.split(//)
  working.each do |c|
    slice = slice / 6
    int = c.to_i(36)
    matcol = int % 6
    matrow = (int / 6).to_i
    point[0] += matrow * slice
    point[1] += matcol * slice
  end
  point.map! do |r|
    left = r.to_s.split('.')[0].size + 1 # 1 extra for decimal point ...
    local_sigdigits = sig_digits - left
    r.round(local_sigdigits)
  end
  point.map! {|e| e - 180}
  return point
end
 
class Float # thanks Rails
  alias_method :round_without_precision, :round
  def round(precision = nil)
    precision = precision.to_i
    precision > 0 ? (self * (10 ** precision)).round / (10 ** precision).to_f : round_without_precision
  end
end

Notice we don’t even have to touch the matrix to get the numbers back out. Since the rows and columns correspond to known multiples in a known order, we can just get the multiples out by dividing/modulo’ing by 6. Cool, huh?

What’s not so cool is my significant digit estimation there, which is, uh, wrong. I’ll fix it later.

So, does it work?

geo1 = [39.286534, -76.613558]
puts 'point 1: ' + geo1.inspect
code1 = geocode(geo1)
puts 'code 1: ' + code1
excode1 = extract(code1)
puts 'extracted point 1: ' + excode1.inspect
puts
geo2 = [39.2, -76.6]
puts 'point 2: ' + geo2.inspect
code2 = geocode(geo2)
puts 'code 2: ' + code2
excode2 = extract(code2)
puts 'extracted point 2: ' + excode2.inspect
 
point 1: [39.286534, -76.613558]
code 1: JMWIDIN7AM1
extracted point 1: [39.2865334, -76.6135583]
 
point 2: [39.2, -76.6]
code 2: JMWI1
extracted point 2: [39.2, -76.6]

Cool. I need to fix that significant digit error, but it’s almost 4am and I can’t seem to get my brain around it right now ;-) Any suggestions for proper estimates welcome.

So what use can we make of a hashed set of coordinates, apart from being able to put them in a URL? Well, it’s now simple to estimate proximity.

One degree of latitude/longitude is equal to about 111km. So, if two places have the same first letter, they’re within 60*111km of each other. The second letter means 10*111km, the 3rd 1.67*111km, and so on. Unfortunately, that only works up to a point – places can be “next to each other” laterally but have characters that jump by 6 if they’re “next to” each other longitudinally. To get around that problem, we need to move to a 3D matrix – I have an implementation, but this post is already too long and 3D matrices are pretty hard to talk about. Anyway, with this system you can determine at a glance if two places are in roughly the same place, with a decent estimate of their general proximity.
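Here is that prefix rule as code – my addition, and remember it only gives the coarse upper bound just described, not a real distance:

def rough_proximity_km(code_a, code_b)
  shared = 0
  max = [code_a.length, code_b.length].min
  shared += 1 while shared < max && code_a[shared] == code_b[shared]
  cell = 360.0 / 6 ** shared # side length, in degrees, of the smallest cell both codes share
  cell * 111                 # ~111km per degree
end
 
rough_proximity_km('JMWIDIN7AM1', 'JMWI1') # => ~31km bound (4 shared characters)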

It is definitely possible to write a function that will return the approximate distance between any two geohashed locations, but I will leave that as an exercise for the reader for now. Hint: look up the matrix coordinates using the divide/modulo trick from the extract function. The distance is just a vector * slice * km away.

Anyway, there you have it: my method of geohashing in Ruby. Totally incompatible with the first method, and I’ve probably made some horrible mistakes – but I like its method of operation; every successive character is directly mapped with increasing resolution on a consistent plane. I like the concinnity of the concept, hope you do too!

No Django release since March 2007

Saturday, June 7th, 2008

The most popularly suggested “alternative” to Ruby on Rails, Django, has not seen a full tagged release since 0.96 in March 2007. This is pretty, uh, bad – how are you supposed to establish baseline compatibility for your app? Refer to the Subversion revision number?

I evaluated Django a couple of times 12-18 months ago, but ended up staying with Rails. Now I’m glad I did. There are very good reasons for picking some targets and doing a “release” every so often, and equally good ones for not simply telling everyone to work out of trunk. There’s no excuse for it – Subversion may not be as flexible as Git, but tagging a release is simplicity itself. Hell, what is stopping them doing it right now? Even if there’s no good roadmap or release plan, anything is better than nothing. It’s a single command!
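To belabour the point, this is roughly all it would take – URLs illustrative, from memory of Django’s repository layout:

svn copy http://code.djangoproject.com/svn/django/trunk \
         http://code.djangoproject.com/svn/django/tags/releases/0.97 \
         -m "Tag 0.97 release."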

I’ve been critical of Rails Core’s professionalism in the past, but Django are making them look like calm, super-organised air traffic controllers compared to “tagged releases? who needs them just check out trunk LOL”.

What a mess.

Rails now generally running on Ruby 1.9.0

Friday, June 6th, 2008

Amid all the RailsConf hype about Rails 2.1.0 and Maglev, not much attention seems to have been given to the fact that Rails is now generally, mostly runnable on Ruby 1.9.0.

Rails running on Ruby 1.9.0

Here’s the Ruby-Core message confirming the news. Not all tests are currently passing – almost all of the failures relate to the time zone functionality introduced in 2.1.0 – and needless to say Mongrel still doesn’t work. Nor do the postgres or mysql gems, for that matter, although sqlite3 seems good to go – so you ain’t gonna be switching to 1.9 anytime soon on your production site.

However, it runs, and if it runs, we can benchmark it!

I’ll use my /api/pulse controller action from previous testing. This is going to be Webrick-only – Thin installs and seems to work until I load a page, then bombs out. Anyway.

The only modifications to the default Rails application I am making here are to change the shebang line in script/server to point to 1.9, and to remove a couple of lines in boot.rb which check RubyGems versions but bomb out for some reason in 1.9. Everything else is clean, and Rails is installed as a gem under the 1.9.0 tree.
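For anyone playing along, the shebang edit is just the first line of script/server – mine now reads as below, where the path is wherever your 1.9 build lives:

#!/usr/local/bin/ruby19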

Webrick still has some problems under 1.9.0. Running with too high a concurrency seems to freak it out on 1.9.0, so I’ve turned the concurrency down to 5. Also, 1.9.0 is handily faster than 1.8.6 in development mode .. but in production mode, uh, you’ll see.

Ruby 1.8.6:

$ ab -n 500 -c 5 http://0.0.0.0:3000/api/pulse
This is ApacheBench, Version 2.0.40-dev <$Revision: 1.146 $> apache-2.0
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Copyright 2006 The Apache Software Foundation, http://www.apache.org/
 
Benchmarking 0.0.0.0 (be patient)
Completed 100 requests
Completed 200 requests
Completed 300 requests
Completed 400 requests
Finished 500 requests
 
 
Server Software:        WEBrick/1.3.1
Server Hostname:        0.0.0.0
Server Port:            3000
 
Document Path:          /api/pulse
Document Length:        2 bytes
 
Concurrency Level:      5
Time taken for tests:   2.067175 seconds
Complete requests:      500
Failed requests:        0
Write errors:           0
Total transferred:      139000 bytes
HTML transferred:       1000 bytes
Requests per second:    241.88 [#/sec] (mean)
Time per request:       20.672 [ms] (mean)
Time per request:       4.134 [ms] (mean, across all concurrent requests)
Transfer rate:          65.31 [Kbytes/sec] received
 
Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.0      0       1
Processing:     9   20  13.1     17      85
Waiting:        7   17  12.5     15      84
Total:          9   20  13.1     17      85
 
Percentage of the requests served within a certain time (ms)
  50%     17
  66%     18
  75%     18
  80%     19
  90%     20
  95%     25
  98%     84
  99%     84
 100%     85 (longest request)

Ruby 1.9.0:

$ ab -n 500 -c 5 http://0.0.0.0:3000/api/pulse
This is ApacheBench, Version 2.0.40-dev <$Revision: 1.146 $> apache-2.0
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Copyright 2006 The Apache Software Foundation, http://www.apache.org/
 
Benchmarking 0.0.0.0 (be patient)
Completed 100 requests
Completed 200 requests
Completed 300 requests
Completed 400 requests
Finished 500 requests
 
 
Server Software:        WEBrick/1.3.1
Server Hostname:        0.0.0.0
Server Port:            3000
 
Document Path:          /api/pulse
Document Length:        2 bytes
 
Concurrency Level:      5
Time taken for tests:   3.921955 seconds
Complete requests:      500
Failed requests:        0
Write errors:           0
Total transferred:      139000 bytes
HTML transferred:       1000 bytes
Requests per second:    127.49 [#/sec] (mean)
Time per request:       39.220 [ms] (mean)
Time per request:       7.844 [ms] (mean, across all concurrent requests)
Transfer rate:          34.42 [Kbytes/sec] received
 
Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.0      0       0
Processing:    21   38  10.5     36      83
Waiting:       13   24   9.4     22      68
Total:         21   38  10.5     36      83
 
Percentage of the requests served within a certain time (ms)
  50%     36
  66%     38
  75%     39
  80%     40
  90%     46
  95%     69
  98%     73
  99%     74
 100%     83 (longest request)

Well, I said it runs, I didn’t say it runs well. Seems to be a problem with Webrick, probably related to the concurrency issue. By comparison, in development mode the numbers are about 42 reqs/sec (1.9.0) vs around 33 reqs/sec (1.8.6). Still – a giant leap from even a couple of months ago, and I look forward to further leaps in the near future.

The next step is to get a real web server running. This will probably be Thin, once I figure out how to get around the page-load crash mentioned above – I know it should work; others are already using Thin on 1.9, though not with Rails. Any ideas welcome!

UPDATED to try to work around concurrency issues with Webrick under 1.9.0
UPDATE 2: Duh, I was running on development.

Maglev and the naiivety of the Rails community

Monday, June 2nd, 2008

UPDATE: Corrected a couple of typos. Didn’t correct the spelling error in the title because I am enjoying being naiive.

I would like to point out also that this is a rant about vapourware and miserably unmet standards of proof – the benchmarks at RailsConf are worthless and prove nothing, but I would dearly love to be wrong.

And also note that I said I consider a dramatically faster Ruby interpreter/VM impossible until conclusively proven otherwise. I didn’t say completely impossible; I hope it is in fact possible to speed up Ruby by 10x or more. It seems unlikely, very unlikely, but who knows. I am in no way an expert on these things, and do not claim to be; I am only reacting to their hype-filled presentation, and drawing comparisons to the recent history of everyone else’s experiences writing Ruby interpreters sans the 60x speedup.

The demonstration at Railsconf was useless, empty hype, and until extraordinary proof is presented, I will remain deeply skeptical of these extraordinary claims.


So there’s been some presentation at RailsConf 2008 about a product called “Maglev”, which is supposedly going to be the Ruby that scales™ (yes, they actually use the trademark sign). This new technology is going to set the Ruby world on fire; it’s going to be the saving grace for all Rails’ scaling problems. It’s going to make it effortless to deploy any size Rails site. Its revolutionary shared memory cache is going to obsolete ActiveRecord overnight. It runs up to 60x faster than MRI. And it’s coming Real Soon Now.

Every rails blogger and his dog have posted breathless praise for the new saviour:

Slashdot | MagLev, Ruby VM on Gemstone OODB, Wows RailsConf
RailsConf 2008 – Surprise of the Day: Maglev
MagLev is Gemstone/S for Ruby, Huge News
MagLev rocks and the planning of the next Ruby shootout

So what’s the problem? Why am I being such a party pooper and raining on the new Emperor’s parade?

Because these claims are absolute bullshit and anyone with a hint of common sense should be able to see that.

Right now, there are about 5 serious, credible, working Ruby implementations – MRI, YARV, JRuby, Rubinius, and IronRuby. They all have highly intelligent, experienced, dedicated staff who know a lot more about writing interpreters and VMs than I could ever hope to learn.

So do you seriously think that all these smart people, writing (and collaborating on) all these projects have somehow missed the magic technique that’s going to make Ruby run 60x faster?

It’s definitely possible to get a 2x speedup over MRI and retain full compatibility – JRuby and YARV have shown us that. Maybe it’s possible to get a 3x or 4x broad-based speedup with a seriously optimised codebase. And sure, a few specific functions can probably be sped up even more.

But a broad 20x, 30x, 50x speedup across the whole language beggars belief. It is a huge technical leap and experience suggests they don’t just suddenly happen all at once. Speed gains are incremental and cumulative, a long race slowly won, not an instant teleport into the future. I’d say it is almost impossible, until spectacularly demonstrated otherwise, for a brand new, fully compatible ruby implementation to be more than two or three times faster than today’s best. Things just don’t work that way. Especially things with such a broad range of smart people working hard on the problem.

Extraordinary claims require extraordinary proof. But what do we get? A couple of benchmarks running in isolation. Who knows what they actually are, how tuned they are, whether they’re capable of doing anything other than running those benchmarks fast (I doubt it). No source. No timetable for the source, or anything else.

The bloggers say “this is not ready yet but when it is .. WOW!”. They’re missing the point. Until this thing is actually running Ruby, it’s not Ruby. Benchmarks on a system which isn’t a full implementation of Ruby are utterly worthless. I can write some routine which messes around with arrays in C which is a hundred times faster than Ruby. I might even be able to stick a parser on the front which accepts ruby-like input and then runs it a hundred times faster. Who cares? If it’s not a full implementation of Ruby, it’s not Ruby. Ruby is a very hard language to implement, it’s full of nuance and syntax which is very programmer-friendly but very speed-unfriendly. Until you factor all of that in, these benchmarks ain’t worth jack.

And wow ..! A shared memory cache! Finally, Rails can cast off that shared-nothing millstone around its neck. Except, of course, that shared-nothing is one of its main selling points and wasn’t everyone all on board that train until ten minutes ago? If you want to share objects use the database, something like that?

Oh yeah, the database! Maglev comes with a built-in OODB which is going to set the world on fire. Except of course that OODBs have been around for decades, and the world is not on fire. If OODBs were the solution to all scaling’s ills then Facebook would be using Caché, not MySQL. Guess which one they’re using.

I actually have problems with the whole premise of OODBs, at least as they apply in web applications. Great, you can persist your Ruby objects directly into the OODB. What happens when you want to access them from, say, anywhere else? What if you want to integrate an erlang XMPP server? What if you need Apache to reach into it? What if you want to write emails straight into it, or read them straight out? What if you want to do absolutely anything at all which isn’t a part of some huge monolithic stack? Web applications are all about well-defined protocols, standard formats, and because of those, heterogeneous servers working in unison. I’ve heard OODBs have some benefits in scientific and other niche uses, but web applications are about the most mixed environment imaginable. If using an OODB is the answer, what was the question?

Oh, you think I’m just an RDBMS-addicted luddite? Hell no. I eagerly follow and embrace advances in non-relational database technology – just look around this site, where I talk about being one of the first (crazy) people to press CouchDB into semi-production use, using TokyoCabinet and Rinda/TupleSpace for distributed hashtables, and how I’d much rather write a map/reduce function than a stupid, ugly, undistributable, slow JOIN. But OODBs? Give me a break.

But oh no. Show them one bullshit-laden presentation and the entire Rails community is champing at the bit and selling both kidneys to ditch all previous Ruby implementations and everything they thought they knew about the persistence layer and embrace some questionable closed-source vapourware, from the guys who brought you that previous world-storming web framework Seaside. What’s that, you’ve never heard of Seaside? I wonder why.

This credulity and blind bandwagon-jumping is the single worst thing about the Rails community.