Maglev and the naiivety of the Rails community

UPDATE: Corrected a couple of typos. Didn’t correct the spelling error in the title because I am enjoying being .

I would like to point out also that this is a rant about vapourware and miserably unmet standards of proof – the benchmarks at RailsConf are worthless and prove nothing, but I would dearly love to be wrong.

And also note that I said I consider a dramatically faster Ruby interpreter/VM impossible until conclusively proven otherwise. I didn’t say completely impossible; I hope it is in fact possible to speed up Ruby by 10x or more. It seems unlikely, very unlikely, but who knows. I am in no way an expert on these things, and do not claim to be; I am only reacting to their hype-filled presentation, and drawing comparisons to the recent history of everyone else’s experiences writing Ruby interpreters sans the 60x speedup.

The demonstration at RailsConf was useless, empty hype, and until extraordinary proof is presented, I will remain deeply skeptical of these extraordinary claims.


So there’s been some presentation at RailsConf 2008 about a product called “Maglev”, which is supposedly going to be the Ruby that scales™ (yes, they actually use the trade mark). This new technology is going to set the Ruby world on fire; it’s going to be the saving grace of all Rails’ scaling problems. It’s going to make it effortless to deploy a Rails site of any size. Its revolutionary shared memory cache is going to obsolete ActiveRecord overnight. It runs up to 60x faster than MRI. And it’s coming Real Soon Now.

Every Rails blogger and his dog has posted breathless praise for the new saviour:

Slashdot | MagLev, Ruby VM on Gemstone OODB, Wows RailsConf
RailsConf 2008 – Surprise of the Day: Maglev
MagLev is Gemstone/S for Ruby, Huge News
MagLev rocks and the planning of the next Ruby shootout

So what’s the problem? Why am I being such a party pooper and raining on the new Emperor’s parade?

Because these claims are absolute bullshit and anyone with a hint of common sense should be able to see that.

Right now, there are about 5 serious, credible, working Ruby implementations – MRI, YARV, JRuby, Rubinius, and IronRuby. They all have highly intelligent, experienced, dedicated staff who know a lot more about writing interpreters and VMs than I could ever hope to learn.

So do you seriously think that all these smart people, writing (and collaborating on) all these projects have somehow missed the magic technique that’s going to make Ruby run 60x faster?

It’s definitely possible to get a 2x speedup over MRI and retain full compatibility – JRuby and YARV have shown us that. Maybe it’s possible to get a 3x or 4x broad-based speedup with a seriously optimised codebase. And sure, a few specific functions can probably be sped up even more.

But a broad 20x, 30x, 50x speedup across the whole language beggars belief. That would be a huge technical leap, and experience suggests such leaps don’t just suddenly happen all at once. Speed gains are incremental and cumulative, a long race slowly won, not an instant teleport into the future. I’d say it is almost impossible, until spectacularly demonstrated otherwise, for a brand new, fully compatible Ruby implementation to be more than two or three times faster than today’s best. Things just don’t work that way. Especially things with such a broad range of smart people working hard on the problem.

Extraordinary claims require extraordinary proof. But what do we get? A couple of benchmarks running in isolation. Who knows what they actually are, how tuned they are, whether they’re capable of doing anything other than running those benchmarks fast (I doubt it). No source. No timetable for the source, or anything else.

The bloggers say “this is not ready yet but when it is .. WOW!”. They’re missing the point. Until this thing is actually running Ruby, it’s not Ruby. Benchmarks on a system which isn’t a full implementation of Ruby are utterly worthless. I can write some routine in C which messes around with arrays a hundred times faster than Ruby. I might even be able to stick a parser on the front which accepts Ruby-like input and then runs it a hundred times faster. Who cares? If it’s not a full implementation of Ruby, it’s not Ruby. Ruby is a very hard language to implement; it’s full of nuance and syntax which is very programmer-friendly but very speed-unfriendly. Until you factor all of that in, these benchmarks ain’t worth jack.

And wow ..! A shared memory cache! Finally, Rails can cast off that shared-nothing millstone around its neck. Except, of course, that shared-nothing is one of its main selling points and wasn’t everyone all on board that train until ten minutes ago? If you want to share objects use the database, something like that?

Oh yeah, the database! Maglev comes with a built-in OODB which is going to set the world on fire. Except of course that OODBs have been around for decades, and the world is not on fire. If OODBs were the solution to all scaling’s ills then Facebook would be using Caché, not MySQL. Guess which one they’re using.

I actually have problems with the whole premise of OODBs, at least as they apply in web applications. Great, you can persist your Ruby objects directly into the OODB. What happens when you want to access them from, say, anywhere else? What if you want to integrate an Erlang XMPP server? What if you need Apache to reach into it? What if you want to write emails straight into it, or read them straight out? What if you want to do absolutely anything at all which isn’t a part of some huge monolithic stack? Web applications are all about well-defined protocols, standard formats, and because of those, heterogeneous servers working in unison. I’ve heard OODBs have some benefits in scientific and other niche uses, but web applications are about the most mixed environment imaginable. If using an OODB is the answer, what was the question?

Oh, you think I’m just an RDBMS-addicted luddite? Hell no. I eagerly follow and embrace advances in non-relational database technology – just look around this site, where I talk about being one of the first (crazy) people to press CouchDB into semi-production use, using TokyoCabinet and Rinda/Tuplespace for distributed hashtables, and how I’d much rather write a map/reduce function than a stupid, ugly, undistributable slow JOIN. But OODBs? Give me a break.
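(To be concrete about the map/reduce point – a toy sketch in plain Ruby with made-up data, roughly the shape of a CouchDB-style view minus the distribution: a map step that emits key/value pairs and a reduce step that folds them, no JOIN anywhere.)

    comments = [
      { :post_id => 1, :text => "first!" },
      { :post_id => 1, :text => "nice" },
      { :post_id => 2, :text => "hmm" },
    ]

    # map: emit one [post_id, 1] pair per comment
    pairs = comments.map { |c| [c[:post_id], 1] }

    # reduce: fold the pairs down to a count per post_id
    counts = Hash.new(0)
    pairs.each { |post_id, n| counts[post_id] += n }

    puts counts.inspect   # => {1=>2, 2=>1}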

But oh no. Show them one bullshit-laden presentation and the entire Rails community is champing at the bit and selling both kidneys to ditch all previous Ruby implementations and everything they thought they knew about the persistence layer and embrace some questionable closed-source vapourware, from the guys who brought you that previous world-storming web framework Seaside. What’s that, you’ve never heard of Seaside? I wonder why.

This credulity and blind bandwagon-jumping is the single worst thing about the Rails community.


78 Responses to “Maglev and the naiivety of the Rails community”

  1. Wincent Colaiuta Says:

    While I could nitpick some little points in your argument the basic premise is inarguable: you’re completely right that a 60x performance benchmark in isolation is utterly useless, and you’re also right that the Rails community’s endemic bandwagon-jumping is one of its most distinguishing characteristics.

    I’ve been tempted to write a weblog post about it for some time now, in response to the way the Rails community has embraced things like Git and Phusion Passenger, and the way it has now responded to Maglev. While some of these things may actually be demonstrably superior (Git, for example), the worth of others is yet to be proven, yet already people are jumping overboard to fling the latest craze into production use. Even with something like Git, the way everyone flocked to it (and more disturbingly, the way they conflated free, open-source Git with commercial, closed-source GitHub) is a cause for concern. Large open source projects tend to switch version control systems at a glacial pace, but the Rails community did it like a bunch of lemmings and I suspect that many a baby was thrown out with the bath water.

    This is a post that nicely illustrates the phenomenon:

    Since its launch in April, Passenger has become quite popular and a lot of developers are already using it to rapidly deploy Rails sites. Even popular budget Web hosting company Dreamhost has got in on the action, and is offering cheap, Passenger-based Rails application hosting. The de-facto Ruby (and Rails) deployment system seems to change rapidly (remember Apache+FastCGI, then lighttpd+FastCGI, then Apache+Mongrel, then Nginx+Mongrel…?) and while Passenger may or may not be a de-facto standard in a few years’ time, it’s certainly becoming the standard for now, so jump on board!

    I find this kind of response puzzling at best. First of all, it’s debatable that some of the systems he mentions ever achieved “de facto” status; secondly, his premature assessment that Passenger is becoming “the standard for now” is either ridiculously optimistic or sadly realistic (depending on your point of view); and finally, the suggestion that you should “jump on board” is quite irresponsible given that people make a living off Rails apps and their deployment choices shouldn’t be made frivolously.

    In any case, even if Maglev proves to be 5 times faster than MRI I won’t be switching. Ruby simply isn’t the bottleneck in my application, and my vendor (Red Hat) takes platform stability seriously, so you won’t see them changing Ruby implementations any time soon.

  2. markus Says:

    ‘Ruby simply isn’t the bottleneck in my application, and my vendor (Red Hat) takes platform stability seriously, so you won’t see them changing Ruby implementations any time soon.’

    But you are already dependent on others here. Other people might want to trade one set of dependencies for another, so from that point of view, IF one more serious contender is in the boat, that is not a bad thing. We just exchange dependencies in the hope that some things will improve (like, in your case, the hope that Red Hat will not f*ck something up).

    I am sceptical too though. ;)

  3. dm Says:

    Ask the Twitter people if they are happy with their choice of MySQL in their web-app.

  4. Sho Says:

    Wincent:

    While I could nitpick some little points in your argument the basic premise is inarguable

    Oh, do let me know if I’m off the track somewhere .. I’ve actually discovered the shared memory and OODB are one and the same, so that’s at least one error.

    Markus: Red Hat is a company with a long history of reliability in its products – it’s difficult to think of many other companies who can boast such a tradition of quality. IBM, maybe, with its mainframe software. If you need to rely on a third party for your project, RH is a pretty good party to choose, IMO.

    dm:

    Ask the Twitter people if they are happy with their choice of MySQL in their web-app.

    As I understand it Twitter’s scaling problems are nothing to do with MySQL, or Ruby for that matter, and everything to do with trying to force an application designed as a simple web page to act as a high volume messaging system.

    MySQL is perfectly adequate when used appropriately – Facebook and Mixi are ample proof of that. The fact that Twitter hasn’t been able to make it do what they want (act as a router, apparently!) says nothing. And how on earth would an OODB make any difference?

  5. Me Says:

    Since you seem unwilling to believe that a speedup of 100x is possible with Ruby, I wonder; have you read much about the various Smalltalk VMs? If not, I suggest you do so, and maybe even play with Smalltalk and get a feel for its speed, as well as its similarity to Ruby.

  6. John thomas Says:

    Fascinating. Totally.

    JJ
    http://www.Ultimate-Anonymity.com

  7. Damien Pollet Says:

    Hmm. While I agree that particular benchmarks are no proof, I find your post has that “come on this can’t possibly be true” tone of people that don’t want to accept a change in perspective.

    You can’t just say “these particular VMs are slow therefore all of them must be”.
    There are many VM implementation techniques (from the Smalltalk and Lisp communities in particular) that are quite efficient though maybe not well publicized outside the academic world. But for sure, the Hotspot JVM and good Smalltalk VMs have the kind of performance the Gemstone guys are talking about, and as Avi Bryant said at RailsConf 2007, Ruby and Smalltalk look VERY similar under the syntax differences. Except Smalltalk is actually way faster even though all the library code is pure Smalltalk (collections, strings, classes, the UI widgets and graphics layer, the compiler, you name it).

    Then about OODBs we all know that technologies don’t get adopted for their technical qualities only: everyone uses SQL because everyone uses SQL. You can’t dismiss OODBs because ZODB isn’t popular either.

  8. Paw Prints » Blog Archive » MagLev Says:

    [...] the positive impression the presentation made on the viewers but a few bloggers (see here and here) aren’t so easily impressed. I have to admit these guys make some solid points which [...]

  9. peter Says:

    You’re on crack. None of the Ruby implementations, except Rubinius (which is quite immature) have been architecturally optimized for speed. A properly optimized dynamic JIT would give about a 10x performance improvement. I can see it being 5x on some benchmarks and 60x on others, but 10x would be pretty reasonable.

    You can get a similar performance improvement over an RDBMS, although at great sacrifice. The reason for SQL is that it is standard — if your data is in a SQL RDBMS, anything can get at it, and any programmer can talk to your database. The tradeoff is that it is very inefficient. Small queries need to be turned into ASCII and then parsed. Big queries use a table data structure that, while highly optimized, is just not efficient for many types of queries. Rails hides the database for you, so you’ve got a Rails->SQL->underlying data structure layer. Doing something Rails-optimized should also be dramatically faster.

    I’m not saying the Maglev guys didn’t fuck it up (I suspect they did), but in abstract, what they’re talking about is very possible (and not too hard, conceptually — just a ton of work).

  10. René Ghosh Says:

    You say:
    “So do you seriously think that all these smart people, writing (and collaborating on) all these projects have somehow missed the magic technique that’s going to make Ruby run 60x faster?”

    …Which is actually a logical fallacy, called Appeal to Authority.
    http://en.wikipedia.org/wiki/Appeal_to_authority

  11. Donovan Says:

    Lots of very brilliant programmers get good ideas, write them, deploy them, only to discover things aren’t quite what they’d expected, or that people using the code haven’t quite gotten the philosophy. Or maybe their idea was good in a med/small environment and turns out not to be so hot in a large one… *cough*.

    I’m not saying the originators of Ruby and its packages aren’t brilliant. BUT while I’m not as smart as some of the guys I’ve worked with, I have still managed to improve on their code and frameworks…

    Because the situation, and so my understanding of it, was more ‘mature’.
    One example….
    ‘THE WHEEL’ (Aztecs were effin smart, but they didn’t use the wheel much… until roads got flatter)

    Luck.

  12. Timothy Says:

    Your hover colors on links are seizure inducing… @_@

  13. Ferdinand Svehla Says:

    Wow, your link hover effect is annoying.

  14. Wincent Colaiuta Says:

    Ask the Twitter people if they are happy with their choice of MySQL in their web-app.

    I’ve read a lot on Twitter’s scaling problems, but mostly out of academic interest. 99.999% (that’s five nines) of Rails developers will never face problems like those; on the contrary, our performance problems tend to be of a completely different nature and have different solutions.

    So while MySQL may or may not be part of Twitter’s problem, it’s certainly not part of mine (or most people’s) so the issue is of little interest to me.

    And while I’m here, here’s another analysis of Maglev.

  15. Mr. Rosenblatt: The Blog » The tortoise was right. Says:

    [...] lol. But oh no. Show them one bullshit-laden presentation and the entire Rails community is champing at the bit and selling both kidneys to ditch all previous Ruby implementations and everything they thought they knew about the persistence layer and embrace some questionable closed-source vapourware, from the guys who brought you that previous world-storming web framework Seaside. What’s that, you’ve never heard of Seaside? I wonder why. [...]

  16. StCredZero Says:

    A 60x speed increase beggars belief? For general-purpose applications, it probably won’t be the case. But a 10x increase seems quite likely. By Smalltalk VM standards, the Matz Ruby VM is sinfully sloooooow! It’s no wonder: it’s traversing the AST to interpret code, whereas most of the modern Smalltalk JIT VMs have been in development for well over 10 years. (Some of them for over 20 years.)

    I know of Smalltalk *encryption* functions that ran faster than their C counterparts. (Mostly due to naive use of malloc and free in the C program — Generational GC is *faster* than malloc & free!)

    The idea that Smalltalk is slow is basically an unfounded stereotype.

    http://duimovich.blogspot.com/2006/09/performance-is-not-optional.html

  17. Martijn Faassen Says:

    Speaking as a Zope developer, you’re incorrect about one thing: getting rid of the ZODB is certainly not what Zope developers commonly do. We’ve been using the ZODB in production for about 10 years now and the ZODB is still the first choice for Zope-based projects.

    Using an object database has advantages and disadvantages compared to an RDB. We’ve seen a lot of recent activity in the Zope community to integrate mature powerful ORM solutions such as SQLAlchemy and Storm into Zope. That’s because for some projects, an RDB is nicer. For other projects, an OODB is nicer. I’d definitely want RDBs to be a first-class citizen in the Zope world, but I wouldn’t want the ZODB to be a second-class citizen either.

    So, as someone who has been using an OODB for a long time: they can be very nice. They also have drawbacks. They make certain problems go away but replace them with other problems. Of course an OODB isn’t magic pixie dust that will make all data storage problems go away, that’s indeed a naive perspective.

    One observation: I believe the use of the ZODB has contributed to the success of open source CMS projects like Plone. The ZODB doesn’t need everybody to agree on a database schema in order to write an extension that provides a new type of content. I believe this can allow an open source project to be more distributed in nature.

    By the way, I agree with your main point that this whole Maglev thing is currently vaporware and, until it shows otherwise, should not be hyped too much.

  18. Ramon Leon Says:

    Don’t know much about Smalltalk do you?

    Maglev doesn’t come with a built-in OODB; Maglev is an OODB, specifically a modified version of Gemstone Smalltalk, the biggest, baddest OODB ever made. It runs little things like giant trading firms in the stock market and giant shipping container firms doing tens of thousands of transactions a second in terabyte-sized databases across clusters of up to a thousand computers.

    Smalltalk is as dynamic as Ruby, and Smalltalk VMs are some of the fastest VMs around, with nearly 30 years of research in them. Maglev isn’t some brand spanking new thing; it’s Smalltalk with a few extra bytecodes doing Ruby syntax.

    “So do you seriously think that all these smart people, writing (and collaborating on) all these projects have somehow missed the magic technique that’s going to make Ruby run 60x faster?”

    Yes, they missed the fact that Ruby is Smalltalk in disguise and that Smalltalk VMs are years and years ahead of anything they could cook up from scratch. The JVM started out life as a Self VM, which was a modified version of Smalltalk. If they were truly that smart they’d have started off with a Smalltalk VM rather than from scratch.

  19. NY Says:

    “This credulity and blind bandwagon-jumping is the single worst thing about the Rails community.”

    Hey! Credulity and blind bandwagon-jumping are what got many of us *into* the Rails community in the first place! ;)

  20. KW Says:

    I think you’re mistaking rails bloggers for rails users.

    Rails users are the people who are interested in what it can do today, and have come up with scaling solutions that work for their application and workload.

    Rails bloggers are the people who need to write something new about rails *every day*, a task that is only possible if you talk about every fad that walks down the street.

  21. Jesse Says:

    As I say on programmermeetdesigner.com, “I’m not a Rails expert, but who is?”

    Thanks for being here!

  22. Jacques Chester Says:

    The 60x thing might be a headline-grabbing exaggeration. On the other hand, GemStone have been working on the technology of VMs for dynamic languages since the early 80s. They and their competitors have more experience in this field than anyone; it seems reasonable that they might have some idea of how to make ‘em run fast.

    It strikes me as a non sequitur to sing the praises of 3NF, ActiveRecord and MySQL when the second doesn’t use RDBMS features to maintain integrity and the third doesn’t honour foreign key constraints in its default table storage engine.

  23. David Says:

    I think you made a typo here:

    ‘function then a stupid’ should be ‘function _than_ a stupid’

    Otherwise, I agree completely with your article. You always have to take these extraordinary claims with a grain of salt.

    Cheers,
    David

  24. Jeremy Says:

    Why we iz jus kuntry Railz bumpkinz that dont know no bettr than 2 ack lik sumpins awezum!!!!11

    Please. Just because a few Rails bloggers have nerdgasms doesn’t mean that we’re all that crazy about it. I think it’s great and has potential, but I also realize that it’s not very far along and it’s likely that it hasn’t even hit the hard points of implementation.

  25. Jon E Rotten Says:

    I think you’re missing the point a bit. You make very good points about the industry as a whole, and I agree that hype is usually not-lived-up-to… but in this case I expect it will be. What’s irrelevant here is Ruby… this is not a Ruby product; it just happens to use Ruby because Ruby has a lot of momentum behind it right now. Comparing it to other efforts to “make Ruby fast” is like comparing apples and … (insert non-apple thing here) … This is not about making Ruby fast. It’s about making a huge, scalable system that can ease deployment woes. It’s about changing the ’stack’ paradigm of web apps and re-thinking >>why<< we store shit in a DB when we often don’t need to. It’s about your code and objects all living in the same space and being alive.

    As web developers, we’re always jumping through hoops; hoops to make HTTP seem less stateless, hoops to persist our object graphs in DBs, hoops to deal with those objects… but with MagLev, we don’t need to do that any more. All your objects are “just there”. Everything is stateful. And if you need more power, just add another MagLev instance. What Avi & co are doing is trying to ‘correct’ some problems with web development; They’ve already done it with Seaside, and have shifted their focus to Ruby rather than Smalltalk because it will obviously reach more people and have a greater effect.

    Good luck to them. I hope they succeed.

  26. Patrick Collison » blog Says:

    [...] Fukamachi has a spectacularly uninformed piece on MagLev and language implementation. He writes: There are about 5 serious, credible, working Ruby [...]

  27. jj12345 Says:

    “Ramon Leon Says:

    June 2nd, 2008 at 11:46 pm
    Don’t know much about Smalltalk do you?

    Maglev doesn’t come with a built-in OODB; Maglev is an OODB, specifically a modified version of Gemstone Smalltalk, the biggest, baddest OODB ever made. It runs little things like giant trading firms in the stock market and giant shipping container firms doing tens of thousands of transactions a second in terabyte-sized databases across clusters of up to a thousand computers.”
    [Citation Needed]

    Who exactly uses this Gemstone Smalltalk OODB? Provide references please.

    A search on Dice.com shows less than 10 Gemstone positions in the U.S. A search on Monster.com shows about 2 positions for Gemstone OODB.

  28. Sho Says:

    Woah, too many replies. I hope people don’t mind if I do a “bulk reply.”

    Me:

    I wonder; have you read much about the various Smalltalk VMs? If not, I suggest you do so, and maybe even play with Smalltalk and get a feel for its speed, as well as its similarity to Ruby.

    You’re right in that I don’t have all that much familiarity with Smalltalk besides its basic syntax, which isn’t all that similar beyond its terse OO nature. However, it seems to me that Smalltalk is far less flexible than Ruby, while being semantically much more machine-friendly. I’d assume that makes it easier to implement, although I’m not an expert.

    What I do know is that the many people who have worked on the numerous interpreters for Ruby are surely aware of prior work on the Smalltalk VMs and would surely have incorporated any magic bullets they found. JRuby and IronRuby are by guys straight from the Java and .NET VMs – if there was an easy 10x speedup to be copied from the Smalltalk approach do you seriously think they would have just passed over it?

    Damien Pollet:

    You can’t just say “these particular VMs are slow therefore all of them must be”.

    Well I’m not trying to say definitively that it is impossible to make a faster VM. I am just saying that considering the history of a great many very smart people doing their level best to make a faster VM, and only being able to squeeze out 2x or 3x over MRI despite their multi-year efforts, powerful corporate backing, and scrutiny of the entire community, I find it pretty incredible that some previously unknown project can just pull 15x-60x out of its hat. What secret VM design alchemy do they know that the rest of the world doesn’t?

    Their partial benchmarks probably do run 60x faster than MRI. However, only a full implementation of Ruby is useful. When I see a proper Ruby program like Rails verifiably running 60x, 15x, or hell even 5x faster than MRI then I’ll be the most impressed guy on earth and will post a picture here of myself eating a newly bought hat. However, needless to say, I don’t expect to and neither should anyone else. In my humble opinion.

    Then about OODBs we all know that technologies don’t get adopted for their technical qualities only: everyone uses SQL because everyone uses SQL. You can’t dismiss OODBs because ZODB isn’t popular either.

    There are any number of sites that have completely abandoned SQL in pursuit of speed and scalability. Google, eBay, Amazon – all use completely home-grown technology with little in common with SQL. Exactly none of them use an OODB implementation.

    Yes, the market gets locked-in to inferior techniques sometimes just because of their momentum – but I don’t think this would stop sites who are otherwise doing anything they can think of to improve and scale their DBs from adopting OODBs if they actually were superior. I am not aware of *any* large site using anything I would call an OODB, although admittedly it seems the definition of OODB varies according to who you ask.

    I don’t want to mouth platitudes like “the market has spoken” but given the ultra-competitive IT marketplace and the desperate need for scalable databases, the fact that OODBs have not made any appreciable inroads at all kind of points in that direction.

    peter:

    You’re on crack. None of the Ruby implementations, except Rubinius (which is quite immature) have been architecturally optimized for speed.

    No, you’re on crack. What do you think they’re being optimised for? Code prettiness?

    A properly optimized dynamic JIT would give about a 10x performance improvement.

    Well, I’m not an expert on writing JIT compilers but it seems like if it was as easy as you make out, someone would have at least started on one.

    René Ghosh:

    Which is actually a logical fallacy, called Appeal to Authority.

    It’s more of an argument to prior experience and probability. I didn’t say “Matz said it’s impossible, so it’s not”. I said that given the number of very smart people who have been working on a multitude of Ruby VMs, including the core Ruby team, and the fact that they haven’t come up with anything even remotely like the speed gains this company is claiming, I’m going to need a pretty convincing demonstration to believe it’s not useless hype. Running a couple of incomplete closed-source benchmarks is nowhere near that.

  29. vegai Says:

    Sure, compared to other Ruby implementations, MagLev seems incredible. But you forget that Ruby implementations (yes, that includes the new ones as well) are not very good in any sense. Compare MagLev to other dynamic language implementations, and the numbers start making sense.

  30. Sho Says:

    Jon E Rotten:

    As web developers, we’re always jumping through hoops; hoops to make HTTP seem less stateless, hoops to persist our object graphs in DBs, hoops to deal with those objects… but with MagLev, we don’t need to do that any more. All your objects are “just there”. Everything is stateful. And if you need more power, just add another MagLev instance. What Avi & co are doing is trying to ‘correct’ some problems with web development; They’ve already done it with Seaside, and have shifted their focus to Ruby rather than Smalltalk because it will obviously reach more people and have a greater effect.

    What?!

    Now I want to know what the Maglev guys were putting in the water at Railsconf. Using Maglev is going to make HTTP stateful, is it?

    Man, I think I better go check out exactly what they said – this sounds even more ridiculous. And this Seaside, which has already fixed all problems with web development, sounds worth a look as well. And there I was thinking it was only a vapourware VM.

  31. Brige McTrollerson Says:

    Whý doń’t you understand that yoúŕe wrong? The inferencés I pull from my mental static clearly contradict you’res.

  32. Sho Says:

    Ramon Leon:

    Don’t know much about Smalltalk do you?

    Maglev doesn’t come with a built-in OODB; Maglev is an OODB, specifically a modified version of Gemstone Smalltalk, the biggest, baddest OODB ever made. It runs little things like giant trading firms in the stock market and giant shipping container firms doing tens of thousands of transactions a second in terabyte-sized databases across clusters of up to a thousand computers.

    So the machines are only doing 10 transactions a second each?! Gemstone doesn’t scale!!

    Seriously, I have never heard of this company. I have never heard of anyone using their Gemstone/S “Object Server”, which is everyone’s #1 “platform for developing, deploying, and managing scalable, high-performance, multi-tier applications based on business objects”, whatever they are.

    On the product page I see testimonials from a travel company who used it for their intranet, and a shipping company that used it for their customer service web site.

  33. Will Schenk Says:

    While I agree with the basic point you’re making – namely that they’re making these claims without a) running full Ruby and b) anyone else being able to look at it because it’s not released yet – a 60x speedup of Ruby is totally reasonable.

    Smalltalk has been around for years, and indeed most of the advantages the JVM has now are because Sun repurposed the Smalltalk VMs of the day. (They bought a Smalltalk company to get the HotSpot stuff.) This stuff is faster than C; not merely compiling into machine code, but able to re-optimize the code based upon its real-time running characteristics. You don’t know Smalltalk, it seems; at least you seem to imply that Ruby is somehow more dynamic or difficult to code and optimize for. This is false; syntax aside, I would actually say that Smalltalk is more ‘dynamic’. But that’s just my opinion.

    Remember that Rubinius is actually based off an ancient Smalltalk VM design from the 80s. That’s 20 years ago. A lot of stuff has happened since then. Let’s also remember that the real source of this stuff, the Lisp machines, were able to do all these things very quickly and in hardware. So it’s not that the technology isn’t there; it is. And some of it, like the Lisp VMs, has been around for 40 years. Why don’t people use it? Because it’s fucking complicated, not that it doesn’t work.

    Things would have been very different if Java hadn’t pushed off Smalltalk with a combination of Sun marketing might and a shrewd appeal to the C++ programmers. But, sad to say, it did. That battle wasn’t won on technical prowess.

    Remember these systems can be faster than C, mainly because they can be very clever about how to do function inlining dynamically (i.e. they can optimize things in a way that static compiling can’t). And MRI is also embarrassingly slow. So while the Maglev guys should probably ease up on the announcements until they have something more thorough to show, the claim itself is not unreasonable.

  34. Ramon Leon Says:

    “So the machines are only doing 10 transactions a second each?! Gemstone doesn’t scale!!”

    I never said that; from what I read, I’ve seen people reporting tens of thousands of transactions a second in some cases. Gemstone claims petabyte-sized database support, and up to a thousand servers (from what I recall) in a farm. I did not say you needed a thousand servers to get those transaction rates. The point is, all those numbers pretty much blow away any notion that object databases don’t scale or can’t do what relational DBs can do.

    “Seriously, I have never heard of this company.”

    That doesn’t really mean anything, I’m sure there are thousands of companies doing all kinds of cool stuff neither of us have heard of, that’s no reason to call it bogus. People who have used Gemstone go gaga over it, every single testimonial I’ve ever seen, they’ve gone gaga over it. But it’s always been extremely expensive and until recently, there wasn’t a free version you could use for commercial use, so it got no love from the OSS community or web 2.0.

    Clearly, they’re trying to change that by offering a limited free version that you can use commercially. 4 gig is plenty useful for commercial apps, and you can use as many 4 gig db’s as you like, so if you partition well, you can use it for quite a while. Up to 16 gig db for only 7k… very reasonable.

    I’d venture to say most Ruby people haven’t heard of Gemstone, but you’ll find most Smalltalk’ers have and just haven’t been able to afford it. Gemstone getting into the Ruby game is a big deal, and you’ll see soon enough, they’re to be taken very very seriously.

  35. StCredZero Says:

    Sho, your comments like, “Seriously, I have never heard of Gemstone,” are just revealing your ignorance. Your blog post reveals a lot of ignorance about Virtual Machines. But the real kicker is the irony of your words “naiivety of the Rails community.” This mostly refers to yourself.

    Oh, and “Smalltalk is less flexible than Ruby?” Uh, no. Any serious language maven will tell you that they are almost identical in many respects, but that Smalltalk wins because you can not only modify the entire library, you can refactor base classes completely out of existence, and you can arbitrarily change the grammar. (And there’s even more!) This is precisely why it’s relatively easy to put Ruby on top of the Smalltalk VM. If you can bootstrap a Ruby parser into a Smalltalk image, the Smalltalk image can incrementally morph itself into Ruby.

    (And another goal of the Rubinius project is to give Ruby the same level of meta-capabilities already present in Smalltalk!)

    Sho, you’re just digging yourself deeper the more you post without informing yourself. You don’t even know enough to know how much you don’t know about this subject. “Naiivety of the Rails community,” indeed!

  36. Sho Says:

    Will, thanks for the informative comment.

    Had I known that this rant on my very unprofessional personal blog would receive any attention at all, I probably would have written “incomplete benchmarks prove nothing” and left it at that. At the time of writing I didn’t even know they were JIT-ing into bytecode; I thought it was another interpreter and my tone was informed by that presumption. Ah well, too late now.

    There may well be scope to greatly improve the performance of Ruby when compiled into bytecode and run on a VM. This is indeed exciting and I’ll be first in line to use it if the performance claims come true. However, I stand by my “core” argument that until the product is demonstrated running a fully compatible implementation of Ruby then the benchmarks being thrown around here are utterly meaningless and the unquestioning hype from the community reflects very badly on everyone involved.

    That said, I’ll definitely start playing around with Smalltalk more now that so many people have spoken passionately in its favour! So, it hasn’t been a complete loss …

  37. More thinking about Ruby on Rails :: In140 Says:

    [...] know how i wrote earlier about Ruby on Rails. Well this post about Maglev (i thought it was a train) for RoR kinda sums it all up for me 2 / June / [...]

  38. StCredZero Says:

    Okay, in the spirit of more constructive and still informative comments:

    - You are going in the right direction when you wonder if Smalltalk is easier to optimize. Smalltalk has very little syntax, so it is very easy to write a parser for it. This gives a huge leg up to VM implementors. The cost of entry is low, so you end up with “more eyeballs.” Ruby was also hobbled by the lack of a bytecode standard. The Abstract Syntax Tree walking that the Matz VM started out doing is not very portable, and was not expressed as any kind of standard. This is also a barrier to robust and open VM development. So good on you for some insight here.

    - It takes a long time to properly debug and tune a JIT VM. This is why the 20 years head start the Smalltalk VMs had is significant. Do you remember how rocky the first JIT VMs were for Java?

  39. Charl Says:

    Surely OODB (as a pattern) is far, far more prevalent than you give it credit for. To me (as a Python developer who uses both Zope ZODB and RDBs as useful tools available from the toolbox), the most prevalent of OODBs seems to be your typical hierarchical file system.

    Now we do know that (at least at one time) web apps at e.g. Yahoo stored their data in files, rather than in something like Oracle. This seems to have pretty similar characteristics to an OODB, such as hierarchical storage of dissimilar things, and even basic atomic transaction capability. It even has the same downsides (such as having to think explicitly about things like indexing and how objects are searched for).

    So arguably the layered technologies (such as those at Google) basically provide a raw file (aka object) store, with various levels of additional query capabilities on top of this.

    At the simplest level, what something like the ZODB does, is merely to decrease the amount of code one has to write to increase the usefulness of the object store (e.g. I don’t have to worry about how to parse my objects from and to files, or even when to do it).

    So you will likely find that those who don’t use RDBMS but still somehow end up storing data in files, are using their file store combined with app code which essentially “quacks” like an OODB.

  40. Sho Says:

    StCredZero:

    Sho, your comments like, “Seriously, I have never heard of Gemstone,” are just revealing your ignorance.

    Oh come on. How many others at Railsconf could honestly say otherwise? Sure seems to be a lot of instant experts 10 minutes after the presentation, though.

    Your blog post reveals a lot of ignorance about Virtual Machines. But the real kicker is the irony of your words “naiivety of the Rails community.” This mostly refers to yourself.

    No, it’s meant to refer to the countless bloggers mindlessly echoing the fantastic claims from a for-profit company hyping its vapourware product.

    Thanks for clueing me in on my own naiivety, though – I’ll make sure to mindlessly swallow any and all empty hype for vapourware in the future.

    Oh, and “Smalltalk is less flexible than Ruby?” Uh, no. Any serious language maven will tell you that they are almost identical in many respects, but that Smalltalk wins because you can not only modify the entire library, you can refactor base classes completely out of existence, and you can arbitrarily change the grammar. (And there’s even more!) This is precisely why it’s relatively easy to put Ruby on top of the Smalltalk VM. If you can bootstrap a Ruby parser into a Smalltalk image, the Smalltalk image can incrementally morph itself into Ruby.

    Well sorry for not being a “serious language maven” like yourself. But if it’s so easy, why hasn’t it been done already? Does this company have some secret unavailable to the rest of the world?

    Sho, you’re just digging yourself deeper the more you post without informing yourself. You don’t even know enough to know how much you don’t know about this subject. “Naiivety of the Rails community,” indeed!

    What have I said that’s wrong?

    My point is that extraordinary claims require extraordinary proof, and that proof has not been given. The chorus of hype for a company’s vapourware product from the Rails community has been shamefully lacking in examination of the claims.

    I don’t have to be an expert in virtual machines or Smalltalk to know that the claims made by this company are way beyond what every previous Ruby implementation has been able to achieve, and to express my deep skepticism until the claims are fully substantiated.

  41. Sho Says:

    StCredZero:

    Okay, in the spirit of more constructive and still informative comments:

    How magnanimous of you.

    - You are going in the right direction when you wonder if Smalltalk is easier to optimize. Smalltalk has very little syntax, so it is very easy to write a parser for it. This gives a huge leg up to VM implementors. The cost of entry is low, so you end up with “more eyeballs.” Ruby was also hobbled by the lack of a bytecode standard.

    Weren’t you just lambasting me about how Smalltalk is actually more flexible than Ruby? Of course I was talking about the syntax; Ruby’s is famously hard to implement. We are talking about interpreters, right?

    And Ruby was also hobbled by the lack of even a proper spec! The creation of that is a real breakthrough, hopefully we’ll see a lot more people trying their hand now.

    The Abstract Syntax Tree walking that the Matz VM started out doing is not very portable, and was not expressed as any kind of standard. This is also a barrier to robust and open VM development. So good on you for some insight here.

    We have to give Matz some credit. He did what he needed to do to make the language he wanted. He didn’t, and doesn’t, owe us anything and we don’t have much of a right to complain about his implementation details.

    However, you’re right, there has long been a depressing lack of specifications for implementations of Ruby VMs. The good news is that progress on that front has sped up greatly and the barrier to entry is dropping. I have no doubt that Gemstone’s decision to go ahead with their Maglev project was made much easier by the existence of the new common spec.

    - It takes a long time to properly debug and tune a JIT VM. This is why the 20 years head start the Smalltalk VMs had is significant. Do you remember how rocky the first JIT VMs were for Java?

    I think everyone does; Java’s never shaken its reputation for slowness, although in reality it’s actually pretty damn good these days.

    Saying that it takes 20 years is a bit much, though.

    Look, I think you’ve got the wrong idea. I am not dissing smalltalk. I am not dissing Gemstone. I am dissing bloggers who mindlessly parrot corporate presentations without question.

    If Gemstone can deliver on their promises then that will be an absolute boon to Ruby. All I want is for people to hold off on the hype until *after* they deliver, or at least nearer that time.

    Invalid pseudo-benchmarks demonstrating unheard-of speed gains using some far-off unproven product are a really, really bad practice. Why are you disagreeing with me?!

  42. Sho Says:

    Charl:

    Surely OODB (as a pattern) is far, far more prevalent than you give it credit for [...] the most prevalent of OODBs seems to be your typical hierarchical file system.

    Uh .. right. Perhaps we have radically different understandings of what an OODB is?

    Now we do know that (at least at one time) web apps at e.g. Yahoo stored their data in files, rather than in something like Oracle. This seems to have pretty similar characteristics to an OODB, such as hierarchical storage of dissimilar things, and even basic atomic transaction capability. It even has the same downsides (such as having to think explicitly about things like indexing and how objects are searched for).

    So arguably the layered technologies (such as those at Google) basically provide a raw file (aka object) store, with various levels of additional query capabilities on top of this.

    At the simplest level, what something like the ZODB does, is merely to decrease the amount of code one has to write to increase the usefulness of the object store (e.g. I don’t have to worry about how to parse my objects from and to files, or even when to do it).

    So you will likely find that those who don’t use RDBMS but still somehow end up storing data in files, are using their file store combined with app code which essentially “quacks” like an OODB

    Right … we are on completely different wavelengths here. My understanding of OODBs is based on things like Caché. If a bunch of flat files in a hierarchical filesystem is also considered an OODB then .. uh, OK.

    To me an OODB is a database that attempts to persist the object basically exactly the same as it was in memory. You don’t have to worry about separate finds or typecasting or anything; you just grab stuff out of it like it was in memory and it handles all the details for you. So in Rails, for example, you might have your array .. and that *is* the database. All your users are in that array, and you just select out of it or whatever.
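    Something like this, in toy Ruby (hypothetical names, obviously – the point is just the access pattern, with the in-memory collection itself standing in for the database):

        User = Struct.new(:name, :admin)

        # the collection *is* the "database"
        users = [User.new("alice", true), User.new("bob", false)]

        # no finders, no typecasting, no SQL - just query the objects directly
        admins = users.select { |u| u.admin }
        puts admins.map { |u| u.name }.inspect   # => ["alice"]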

    I’ve never heard anyone claim that flat files were “objects” from a DB perspective, unless the DB you’re talking about is the disk catalogue.

  43. Uninformed commentary at Pensieri di un lunatico minore Says:

    [...] of all capabilities and skill sets to shoot off their mouth and make fools of themselves. Recently, Sho Fukamachi demonstrated the truly epic capability to not only miss the entire point, but demonstrate a nearly [...]

  44. Rob Says:

    AAH this is driving me INSANE.

    http://www.merriam-webster.com/dictionary/naiivety

    i believe the word you’re all looking for is: NAIVETE

  45. Says:

    you’re wrong.

  46. Charl Says:

    If you want to persist an object, it’s impossible to just *poof* make that object appear on disk or somewhere else — instead it first has to be serialized into a stream of bytes to be stored or transmitted (and obviously de-serialized back into a memory object when you call on it — even if this happens transparently for you and you don’t have to care about it).

    In the context of something like the ZODB, you can merrily set and get attributes on objects, and most of the serialization happens behind the scenes — the objects simply appear to “be there” as they are needed. Here the Python “pickle” is the serialization mechanism — and the file storage itself may be a single file, or even a bunch of files (containing pickles) spread in a hierarchical structure. (In fact the pickles can even be stored inside Postgres, but I digress).

    Point being, if you are willing to shift your level of abstraction a little, you will recognize that a “.png” file on a harddrive is a serialized version of some sort of “image object” — which by definition means the file system could be considered to be an object store.

    Let’s say one had a bunch of simple objects, and didn’t want to use either an “OODB” or an RDBMS — but one still wanted to persist these objects. Well, one might end up choosing to serialize each object to a separate file on disk, using say a naive “key=value” text file strategy (sort of Win-Ini-ish). And maybe the file extension, file name or some other file metadata gives some hint as to what type of object has been serialized in that file. And maybe the folder all these files live in corresponds in the code to the container object that contains all these other objects. And maybe atomic POSIX rename is used to replace a file if needed. And maybe the app code has to deal explicitly with loading objects from disk — or maybe the coder has made this transparent too by hooking this into class attribute access mechanisms.

    Or maybe an XML file is used as the serialization of the entire hierarchy of objects. Whether one is loading/parsing it explicitly, or using some sort of magic to make it happen on demand behind the scenes whenever the code thinks it’s accessing an “in-memory” object — one is still using an approach which seems fundamentally “object database”-ish.

    Even if you don’t use such an elaborate serialization mechanism, you may still have eg your own type of “image object” and when you are deserializing a .png file into some attribute of your object, you are at some level performing some degree of object persistence.

    What you consider an OODB, at the simplest level, simply hides some of the work involved in serializing and deserializing objects on demand, so they seem to appear magically when needed. (And of course there tends to be other things too, like helping to do transactions atomically, helping to retain consistency etc. etc. — all in the name of cutting down the code one has to put explicitly in one’s app).

    So while perhaps only a few people elect to use a “brand-name OODB”, it seems to me many end up doing something related by hand — following a basic OODB-like philosophy and effectively implementing parts of one by pulling the bits of OODB-like functionality they actually need into their app itself — for example by implementing explicit serialization/deserialization between objects and files (maybe using something like XML).

    Thus, when venturing that “the market has spoken on OODB” – it hardly seems that it really is this clear cut. Maybe not Caché or ZODB — but perhaps simply writing their apps to use the fs like an object store. And perhaps using as their fs some sort of hugely-cached-in-memory-with-mega-clustering “fs”. Point being, if they are not using RDB they are probably implicitly using a philosophy or patterns which “quacks” similar to OODB — even if they don’t call it that.

    My only point was — I don’t think it’s fair to dismiss OODB out of hand, because if one stops to think about it, one may in fact recognize that most non-RDB DB approaches actually are dealing in terms of “storing data in a fashion from which objects can be reconstructed when they are needed” — by putting them in some fashion on some fs. Which to me sounds very much like saying that “it attempts to persist the object basically exactly the same as it was in memory.”

  47. roScripts - Webmaster resources and websites Says:

    Maglev and the naiivety of the Rails community…

  48. Sho Says:

    Charl, thanks for your thoughtful and insightful comments – food for thought. I think we agree on more than we disagree, but here’s my response anyway:

    If you want to persist an object, it’s impossible to just *poof* make that object appear on disk or somewhere else — instead it first has to be serialized into a stream of bytes to be stored or transmitted (and obviously de-serialized back into a memory object when you call on it — even if this happens transparently for you and you don’t have to care about it).

    Yes, obviously! You need some I/O implementation. On Ruby the most common is probably YAML, and AR obviously handles a lot of translation as well. This process is standard for all types of data storage.
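    (For example, the explicit round-trip in Ruby – a trivial sketch with made-up data and file name:)

        require 'yaml'

        prefs = { "theme" => "dark", "per_page" => 20 }

        # serialise: object -> bytes on disk
        File.open("prefs.yml", "w") { |f| f.write(prefs.to_yaml) }

        # deserialise: bytes on disk -> object again
        reloaded = YAML.load_file("prefs.yml")
        puts reloaded["per_page"]   # => 20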

    In the context of something like the ZODB, you can merrily set and get attributes on objects, and most of the serialization happens behind the scenes — the objects simply appear to “be there” as they are needed. Here the Python “pickle” is the serialization mechanism — and the file storage itself may be a single file, or even a bunch of files (containing pickles) spread in a hierarchical structure. (In fact the pickles can even be stored inside Postgres, but I digress).

    Yes, I’m under the impression that the ZODB is very useful for storing smaller amounts of “config” style data – user accounts or site config, for example. To put it in Ruby terms, the ability to hold config in a YAML file and then just save it without having to explicitly write the file again sounds very appealing – the magic all happens behind the scenes.

    Point being, if you are willing to shift your level of abstraction a little, you will recognize that a “.png” file on a harddrive is a serialized version of some sort of “image object” — which by definition means the file system could be considered to be an object store.

    Er. Well, that may be true as far as it goes – if the files are objects, and they’re stored on a filesystem, then the FS is indeed an “object store”. However, the example of a normal FS doesn’t really match up with my definition of what a programmatic object store is. It’s not “active” enough, doesn’t handle the aforementioned serialisation and translation. For example, you can’t tell a normal FS to “give me the image file with this name in this directory in this format”. I think it’s too low-level. That said, some filesystems/OS’s are beginning to implement this kind of functionality – I think you can ask the OSX FS layer to, say, give you the dimensions of that image without loading it yourself – this is approaching my understanding of an object store’s level of abstraction.

    Let’s say one had a bunch of simple objects, and didn’t want to use either an “OODB” or an RDBMS — but one still wanted to persist these objects. Well, one might end up choosing to serialize each object to a separate file on disk, using say a naive “key=value” text file strategy (sort of Win-Ini-ish). And maybe the file extension, file name or some other file metadata gives some hint as to what type of object has been serialized in that file. And maybe the folder all these files live in corresponds in the code to the container object that contains all these other objects. And maybe atomic POSIX rename is used to replace a file if needed. And maybe the app code has to deal explicitly with loading objects from disk — or maybe the coder has made this transparent too by hooking this into class attribute access mechanisms.

    What you’re describing is what I would call a hashtable “with improvements”, like some built-in serialisation and namespacing.
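    (Ruby’s standard library actually ships something in exactly that vein – PStore, a Marshal-backed, transactional, file-based hash. A quick sketch, with hypothetical file name and keys:)

        require 'pstore'

        store = PStore.new("site.pstore")   # one file on disk

        # writes happen inside a transaction and are serialised via Marshal
        store.transaction do
          store[:admins] = ["alice", "bob"]
          store[:motd]   = "hello"
        end

        # read-only transaction
        store.transaction(true) { puts store[:admins].inspect }   # => ["alice", "bob"]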

    Or maybe an XML file is used as the serialization of the entire hierarchy of objects. Whether one is loading/parsing it explicitly, or using some sort of magic to make it happen on demand behind the scenes whenever the code thinks it’s accessing an “in-memory” object — one is still using an approach which seems fundamentally “object database”-ish.

    Again, I can’t help but agree, kind of – using XML or YAML or a hashtable or whatever to store, say, an array is indeed object persistence. An array is an object and yup, it’s being persisted. But saying it’s a “database” is, IMO, not really in keeping with the common understanding of databases. Yes, a DB is at its most basic level just a file (or memory) with a bunch of stuff that’s been written/serialised/persisted into it. But when we say DB, we mean a whole bunch of other capabilities as well. You can treat XML like a DB if you’re really hell-bent on it .. but I don’t agree that XML *is* a DB.

    Even if you don’t use such an elaborate serialization mechanism, you may still have eg your own type of “image object” and when you are deserializing a .png file into some attribute of your object, you are at some level performing some degree of object persistence.

    Yes, at some level!

    What you consider an OODB, at the simplest level, simply hides some of the work involved in serializing and deserializing objects on demand, so they seem to appear magically when needed. (And of course there tends to be other things too, like helping to do transactions atomically, helping to retain consistency etc. etc. — all in the name of cutting down the code one has to put explicitly in one’s app).

    I agree with all of this, of course.

    So while perhaps only few people elect to use a “brand-name OODB”, it seems to me many end up doing something related by hand — following a basic OODB-like philosophy and effectively implementing parts of one by pulling the bits of OODB-like functionality they actually need, into their app itself — for example by implementing explicit serialization/deserialization between objects and files (maybe using something like XML).

    Well, if it’s your position that reading and writing to XML or YAML is in fact using an “OODB” then I guess we’re all using OODBs.

    But that’s not what I understand to be the normal definition of an OODB. For storing small pieces of application-specific or otherwise “single” objects, this storage methodology is indeed ideal. Where it breaks down is when you want to store a *lot* of data, and more importantly, you want to be able to interoperate with other applications and/or anything that isn’t your original implementation’s exact way of doing things.

    You are right in that directly serialising/deserialising objects can save time in implementation; when you don’t have to think about how to write the data, and in what form, it’s easy to see how nice that might be. The problem is that it’s too implementation-specific and non-portable. If your data storage is wrapped up in language-specific objects, how is another language supposed to read it? You’d have to implement all sorts of translation, just to get the data into a neutral format!

    Take the example of the ZODB. I have never actually used it, but my understanding is that the point at which most people abandon it is when they need to implement, say, fulltext search. Now yes, they could write their own fulltext search which looks inside all their objects, or ZODB may provide some options – I don’t know. But if you want to use some other program like Solr, for example, you’re faced with the task of writing de-objectisers for all your stuff, and the time saving is forfeited.

    My problem with OODBs isn’t that they aren’t cool and useful – they are. My problem is that they are implementation-specific, and any gains you might have made when you “just store the objects” without putting them into some kind of standard world-readable form are surrendered when you need your data in any other way.

    Thus, when venturing that “the market has spoken on OODB” – it hardly seems that it really is this clear cut. Maybe they are not using Caché or ZODB — but perhaps they are simply writing their apps to use the fs like an object store. And perhaps using as their fs some sort of hugely-cached-in-memory-with-mega-clustering “fs”. Point being, if they are not using RDB they are probably implicitly using a philosophy or patterns which “quack” similar to OODB — even if they don’t call it that.

    Again, agreed – this usage pattern is overwhelmingly common. However, again, I want to point out that this use case is usually restricted to things like “config” files. When people want to store large amounts of data in a universally accessible form the market has indeed spoken, for RDBMS. Although many people actually use RDBMS’s as “object stores” by your definition anyway, rejecting the formal relational concept. If you look around this site you’ll see I am actually one of those people – on the current front page are instructions for forcing Rails to use UUIDs instead of integer serials in an RDBMS, the effect being to force the DB into more Object-Store-like functionality!
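
    For the record, the gist of that UUID trick is roughly the following – simplified from what I actually run, and the column name and UUID library here are just examples:

        # In a migration: no integer serial; a 36-character string UUID is the key instead.
        create_table :users, :id => false do |t|
          t.string :uuid, :limit => 36, :null => false
          t.string :login
        end

        # In the model (Rails 2 style callbacks):
        class User < ActiveRecord::Base
          set_primary_key :uuid

          def before_create
            self.uuid ||= UUIDTools::UUID.random_create.to_s   # any UUID generator will do
          end
        end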

    My only point was — I don’t think it’s fair to dismiss OODB out of hand, because if one stops to think about it, one may in fact recognize that most non-RDB DB approaches actually are dealing in terms of “storing data in a fashion from which objects can be reconstructed when they are needed” — by putting them in some fashion on some fs. Which to me sounds very much like saying that “it attempts to persist the object basically exactly the same as it was in memory.”

    Well, I agree as far as it goes. I think that the terminology here is actually quite confusing – I had a very specific thing in mind when I spoke of OODBs, I’m not rejecting the concept of “storing objects” out of hand – indeed, if I had something against storing objects at any level then I could never store any data at all.

    These terms are all mixed up. When I say RDBMS, I mean “something like MySQL, regardless of whether you are actually doing anything formally relational”. When I said OODB I meant “something like ZODB” – and maybe more importantly, I meant the usage style for the programmer of “writing data into and out of a fairly standardised set of formats” (RDBMS) or “writing the object without worrying about how it’s being recorded” (OODB). This is a bit hard to describe so let me give an example.

    Take the example of, say, a User belonging to 3 Groups.

    In an RDBMS, one would typically have to explicitly create a groups table, a users table, and a relationships table. One would create a group membership by adding a new record to the relationships table referencing the primary keys of both the User and the Group, which remain untouched. You would look up which Group has which Users or vice versa by searching that table by the explicit primary key of the entity concerned.

    In an OODB you would have only two tables, the User table and the Group one, which you do not have to explicitly create. You would create the relationship by adding arrays for members and memberships to the Group and User classes respectively, then adding each entity to the other’s array; the OODB seamlessly serialises and stores new versions of both records. You look up who belongs to what by directly referencing those entities and looking in those arrays.
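
    Or, to put my understanding into code – a rough sketch only, where the ActiveRecord half assumes a configured database and the “OODB” half is a purely hypothetical API, not any particular product:

        require 'active_record'   # assumes a database connection is already set up

        # RDBMS style: the join table is explicit, and so is the lookup path.
        class Membership < ActiveRecord::Base
          belongs_to :user
          belongs_to :group
        end

        class User < ActiveRecord::Base
          has_many :memberships
          has_many :groups, :through => :memberships
        end

        class Group < ActiveRecord::Base
          has_many :memberships
          has_many :users, :through => :memberships
        end

        # user.groups << group   # writes a row into the memberships table
        # group.users            # found by joining on the foreign keys

        # "OODB" style (hypothetical): plain object references, persisted as-is.
        class PUser
          attr_accessor :login, :memberships   # an ordinary Array of PGroup objects
          def initialize(login)
            @login, @memberships = login, []
          end
        end

        class PGroup
          attr_accessor :name, :members        # an ordinary Array of PUser objects
          def initialize(name)
            @name, @members = name, []
          end
        end

        user, group = PUser.new('bob'), PGroup.new('admins')
        group.members << user
        user.memberships << group   # the store would serialise both objects, as-is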

    Am I right so far?

    My problem with this is that it’s tied to implementation. With the RDBMS there are limits on what you can store, but this is also an advantage when trying to read from outside the implementation as it provides a lowest common denominator of data types and a fairly standard way of recreating the objects and their relationships from outside the implementation. For example, say I want to look up the User’s login and password for an LDAP group authentication system after the fact – that is no problem at all, the data is right there in a standard form.

    As I understand them, with an OODB I am basically screwed. My User and Group relations are stored in a form only accessible by loading them into the original programming environment. If I want to access them outside that stack, I’m going to have to write something to read them back out. I will have to use the original language since its objects are indecipherable to anything else. The OODB has no ability at all to understand those objects on its own.

    I saved a little bit of work at the beginning, and it’s certainly nice to not have to think about database fields and formats and schemas, but the cost in flexibility, portability, and interoperability is crippling.

    This is all “to the best of my knowledge” which is, of course, limited, and I’d also say that discussion of this matter is heavily informed by the speaker’s own personal understanding of all the terms. Rails, for example, has a number of built-in methods for doing a lot of the OODB-ish serialisation we’ve been discussing, and out of the box doesn’t use an RDBMS’s systems for establishing and enforcing relationships, preferring to recreate them internally – you could make a genuine case that Rails uses MySQL as an OODB! The terms are fuzzy and often come down more to *how* a programmer uses the tools than to what the tools are intended to be. The trend for RDBMS denormalisation, heavily promoted as a scaling strategy and in use by sites like Twitter and Facebook, eschews the “R” and does the exact type of serialisation of object properties that an OODB provides. The only difference is that they implement it in application code, using the “RDBMS” merely as a fast datastore.
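
    ActiveRecord’s serialize is a small example of exactly that kind of blurring – a minimal sketch, with a column and attribute name of my own invention:

        class User < ActiveRecord::Base
          # Dumps the whole Hash to YAML in a single text column; MySQL never sees
          # the individual keys, it just stores an opaque blob of serialised object.
          serialize :preferences, Hash
        end

        u = User.new(:preferences => { :theme => 'dark', :per_page => 50 })
        u.save
        User.find(u.id).preferences[:per_page]   # => 50, deserialised transparently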

    Then you’ve got the new breed of “Document DBs”, which you’ll see me talking favourably about (at great length) elsewhere on this site – not to mention systems like BigTable. Where they fit in is anyone’s guess; they embody features of both.

    Anyway, it’s an interesting topic and I thank you for your insightful comments.

  49. JeanHuguesRobert Says:

    Sho, I believe you still get some remaining time to pretend that you were joking, but not much. ;-)

    Gemstone is the best thing that has happened to Ruby in years (Ruby 2.0 is … second). Period.

    Especially after the various disappointing attempts to improve Ruby’s highly problematic speed.

  50. Sho Says:

    JeanHuguesRobert:

    That remains to be seen. I’ll sing Maglev’s praises after it’s been released and its worth proved.

    Until then, I’ll remain skeptical – an entirely reasonable position, I should think.

  51. Jon E Rotten Says:

    Listen, before you make sarcastic remarks about things you don’t understand, go and do some reading. I understand your skepticism; there’s way too much crap out there for anyone to believe everything they hear.

    For your info, I wasn’t at RailsConf, and I didn’t drink any of the koolaid. I don’t give a shit about Rails; it’s nothing new or revolutionary and I’m not a Rails weenie. I am just reading stuff and making inferences based on my own experiences and the materials available. Seaside is a Smalltalk-based web framework that makes HTTP *seem* stateless to the developer. Your objects are all living breathing objects that don’t need to be saved to a DB or stuffed in some session object that gets saved somewhere between each request. Everything is just there, all the time, for you to use. *THIS* is what MagLev will bring to the Ruby (and therefore Rails) community. The idea that a relational DB is not necessary any more (although you may want to use one still, depending on your needs) and that you can add more storage space by simply adding another VM is what’s new. They’re trying to *change* the mindset. That’s what the big deal is.

  52. SmalltalkGuy Says:

    Ruby “guys” should finally admit, that Ruby is just a cheap Smalltalk rip off, implemented in a way Smalltalk was done more than 30 years ago.
    Without Smalltalk implementations making JITs really work, Java (for example) would not be where it is today.
    Sun’s HotSpot JIT is basically based on a JIT that was running Smalltalk.

    It’s just a fact, that the standard Ruby implementation is a simple interpreter that just cannot compete with a JIT that spits out dynamically (based on profiling the running code) optimized machine code.

    I have no doubt that Maglev works as promised.

  53. Sho Says:

    Jon E Rotten:

    Listen, before you make sarcastic remarks about things you don’t understand, go and do some reading.

    Nice opener, really puts me in the mood to listen carefully.

    I understand your skepticism; there’s way too much crap out there for anyone to believe everything they hear

    You agree with my skepticism? So what was the first sentence about things I “don’t understand”?

    Seaside is a Smalltalk-based web framework that makes HTTP *seem* stateless to the developer. Your objects are all living breathing objects that don’t need to be saved to a DB or stuffed in some session object that gets saved somewhere between each request. Everything is just there, all the time, for you to use. *THIS* is what MagLev will bring to the Ruby (and therefore Rails) community. The idea that a relational DB is not necessary any more (although you may want to use one still, depending on your needs) and that you can add more storage space by simply adding another VM is what’s new. They’re trying to *change* the mindset. That’s what the big deal is.

    Well that’s wonderful and I’m looking forward to seeing what they come up with, but it’s hardly Rails, is it? If they’ve come up with a new Ruby web framework that’s great, I look forward to checking it out.

    I was talking mainly about their speedup claims, and the other criticism (OODBs, magic persistence) is all in a Rails context. If they’re porting their entire Seaside stack to Ruby, wonderful, I’ll check it out when it’s released – but we are talking about a RAILS presentation at RAILSCONF. You remember where he said we won’t need ActiveRecord anymore? ActiveRecord is PART OF RAILS.

    I have a very open mind towards new web frameworks, especially in Ruby – I’ve been playing around with Merb for some time now, and a couple of others. I hadn’t heard of Seaside before the Maglev presentation, I’ll certainly check it out more, but I have the feeling that if it’s really as revolutionary as some (like yourself) make it out to be, it’d be more popular than it is. Still, I’ll check it out, I’m not a zealot.

    Please keep a lid on the accusations and misunderstandings. Seaside may indeed be the world’s best kept secret in web frameworks – I doubt it, but don’t deny the possibility. However anything I said is in the context of Rails and Rails alone; anything else is a product of your imagination.

  54. Sho Says:

    SmalltalkGuy:

    [ I edited my response to this comment, after realising I'd misunderstood a couple of things ]

    Ruby “guys” should finally admit, that Ruby is just a cheap Smalltalk rip off, implemented in a way Smalltalk was done more than 30 years ago.

    Are you seriously saying that all current Ruby implementations are 30 years behind Smalltalk?

    Why don’t you take a look at some actual benchmarks? You will notice that I have not sorted or cherry-picked that data at all – that is a list of all performance data for all languages. I see the best-performing Smalltalk on that list (VisualWorks) outperforming Ruby 1.9 by a factor of around 3. You talk like Smalltalk has some insurmountable technical lead from its “30 years head start” – a quick look at the raw performance data and your claim looks laughable. Sure, a 3x improvement is great. But it’s nothing like the numbers Gemstone were throwing around.

    Scroll a little further down and you can see the two other Smalltalk implementations on that list (Squeak and GNU) – slower than Ruby 1.9, though still faster than 1.8.6.

    Here is the specific comparison of Ruby 1.9.0 vs. Smalltalk VisualWorks. Where is this huge technological lead?

    It’s possible, of course, that Gemstone’s proprietary implementation is far, far in advance of every other Smalltalk VM – but you’re claiming Smalltalk VMs in general, anything to do with Smalltalk in fact, has this huge lead – and I just don’t see it, not in these tests anyway.

    And making the ridiculous claim that it’s a “cheap rip off” says more about you than anything else.

    Without Smalltalk implementations making JITs really work, Java (for example) would not be where it is today.
    Sun’s HotSpot JIT is basically based on a JIT that was running Smalltalk.

    Well, that’s great. Is there a point coming soon, other than that you evidently have a huge chip on your shoulder?

    It’s just a fact, that the standard Ruby implementation is a simple interpreter that just cannot compete with a JIT that spits out dynamically (based on profiling the running code) optimized machine code.

    The sub-par nature of the current MRI interpreter is a rare point of universal consensus in the Ruby world, hence the six current competing implementations, Maglev one of them. I agree there is a lot of scope for improvement, and the JIT compile-to-bytecode idea could prove fruitful. I just doubt that Gemstone can implement it so many times faster than anyone else. I could be wrong, but I’ll believe it when I see it. The presentation at Railsconf was not seeing it. Reasonable, no?

    I have no doubt that Maglev works as promised.

    No doubt at all, huh. Must be nice to live in your world where there’s no such thing as unrealistic pre-release hype, vapourware, misleading benchmarks, products not reaching prerelease expectations.

    In my world, when a company makes a splashy presentation well in advance of release making claims of huge advances over all previous implementations, I want to see proof.

    After I see the proof: big fan, effusive praise, valued customer.

    Before the proof: vapourware.

    Understand?

  55. Mark Coates Says:

    Sho, et al.–

    This is probably the best blog post and comment collection I have read in a long time. Thanks for everyone’s contributions and thoughts. I learned quite a bit throughout the reading.

    Sho, I have to agree and laud your skepticism. While it is not impossible, as you once asserted (if memory serves), the MagLev team’s claims are suspect and should be treated as such until the community has seen real proof that lives up to the standards put forth by reasonable parallel efforts being made in the Ruby VM space.

    The overboard-jumping from some Rails bloggers has been fishy… I mean, let’s not proclaim the Age of Human Enlightenment and Miracles because of some slick marketing hype during a corporate-sponsored presentation at a trade show. I mean, yes, I want to believe… but first, let’s take a breath. I think your post helped me do that, while at times infuriating me.

    Anything that can happen to Ruby and/or Rails to bring more adoption in the Enterprise or from the development communities is ultimately good. It’s good for Ruby and it is good for the developers who make a living with Ruby and/or Rails. The different implementations, while they exist in the same ecosystem, tend to serve different niches, and that makes for more survivability for the Ruby movement in general. So this in-fighting and burn-me-at-the-stake belief and evangelistic fervor for the Great One Implementation in The Sky is ultimately asinine, IMO.

    This is a multiverse of abundance, especially in our Information Science realm… There is room and purpose for all. Don’t prune the tree before the branches have a chance to thrive.

  56. Wincent Colaiuta Says:

    Speaking of hype, I see Phusion Passenger has decided to release 2.0, all of several weeks after the initial release. Expect 3.0 next month.

  57. Sho Says:

    Mark, thanks for your kind words. It’s been an illuminating conversation for me too! I agree 100% with everything you said. Sorry for being a bit infuriating .. I, uh, tend to go a little overboard myself sometimes.

    Even if this Gemstone product comes to nothing, it’s still been a very welcome boat-rocking for the community, I think. For example, the main guy behind JRuby has mentioned that he’s going to rededicate himself to performance tuning – nothing like a challenge to spur things along. And this meme that the lengthy Smalltalk experience contains valuable lessons for Ruby implementation, while seemingly coming as a big surprise to everybody, will surely precipitate a thorough investigation, which at worst will lead to some closure on the matter and at best a better, faster Ruby.

    As for me, I echo your sentiments exactly – new challenges, new contenders, more options, can only strengthen the community as a whole and spur it to greater heights. If Ruby is improved, no matter the source of that improvement we all benefit. Sure, the discussion can have elements of acrimony, as all such discussions do – but the net effect can only be positive.

    I just wonder if Gemstone is now regretting “letting the cat out of the bag” that Ruby implementations have some valuable things to learn from Smalltalk VMs so early .. now the idea has well and truly sunk into the Ruby community, and if it turns out to be true, who’s to say the community won’t throw itself into implementation and have finished before Maglev is even ready ..!

    Thanks again.

  58. Sho Says:

    Speaking of hype, I see Phusion Passenger has decided to release 2.0, all of several weeks after the initial release. Expect 3.0 next month.

    I think you’re missing the real news here – Ruby Enterprise Edition will be available soon! Finally. I’ve been waiting for a version of Ruby with “enterprise” in the name for years. As seen on Soocial (not joking but wish I was).

    Apparently the holdup is because they don’t have good enough internet access to push Ruby Enterprise Edition to Github. Again I am not joking, but wish I was.

  59. Priit Tamboom Says:

    I agree with Sho’s blog post. It’s generally good that more options like Maglev are coming along, but there’s a huge difference between being able to see the code yourself and listening to hype-promises from a closed-source vendor.

  60. Avdi Says:

    What you seem to be missing is that what all of the alternative Ruby VM implementers have been saying up till now re: optimization boils down to “we know we have barely scratched the surface, optimization-wise; but we know that the Smalltalk community has some really advanced techniques for optimizing dynamic languages, and we hope to apply those techniques once we are 100% Ruby-compatible”. Saying that JRuby, Rubinius, et al are *currently* focused on performance optimization indicates that you haven’t been following their development blogs much. Both Charles Nutter and the Rubinius guys have all been pretty up-front about a) feature-completeness is their goal now and b) optimization comes second. Although since JRuby has been able to run Rails that team have been able to spend more time on performance optimization.

    Also, I’m not sure what you mean by Smalltalk being much less dynamic than Ruby. You do know that Ruby is essentially Smalltalk with a Perlish syntax, right?

    So if the MagLev team are executing Ruby code on a mature optimized Smalltalk VM, I don’t have any trouble believing that they’ve seen massive improvements in speed. These are the same improvements that the JRuby, Rubinius (especially Rubinius, which is modeled directly on the Smalltalk implementation), et al are all expecting to see once they are able to focus on optimization. MagLev is just already there, because they don’t have to re-write the Smalltalk-style optimizations from scratch.

  61. Benjamin Jackson Says:

    You left out MacRuby (though whether or not it’s a “serious, credible, working Ruby implementation” yet is up for debate).

    Worth checking out anyway: http://ruby.macosforge.org/

  62. Sho Says:

    What you seem to be missing is that what all of the alternative Ruby VM implementers have been saying up till now re: optimization boils down to “we know we have barely scratched the surface, optimization-wise; but we know that the Smalltalk community has some really advanced techniques for optimizing dynamic languages, and we hope to apply those techniques once we are 100% Ruby-compatible”. Saying that JRuby, Rubinius, et al are *currently* focused on performance optimization indicates that you haven’t been following their development blogs much. Both Charles Nutter and the Rubinius guys have all been pretty up-front about a) feature-completeness is their goal now and b) optimization comes second. Although since JRuby has been able to run Rails that team have been able to spend more time on performance optimization.

    My point was that feature completeness is not some abstract goal which might be considered an alternative to the goal of performance; it is a fundamental prerequisite before you can even start talking about speed.

    As has been pointed out (pretty forcefully, heh) I am far from an expert on the complex business of writing interpreters/compilers/preprocessors/garbage collectors/VMs. However, I do have a little experience in software development and like to think I’m aware of its basic nature. It seems to me that unless JRuby (and, more to the point, YARV) started from (and still use) some fundamentally flawed, obsolete theory of language interpreters/VMs, then the idea that there are these massive uncollected performance gains just waiting for someone to harvest by implementing a more mature VM is hard to believe.

    At the time of writing, I’d been reading other bloggers splashing numbers like 15x-60x performance gains around. That kind of gain from a tweaked VM is ludicrous on its face. 60x faster would be faster than C! Again, I’m not an expert – but if I hear someone claiming they can make Ruby faster than C, well, I’ll believe it when I see it.

    Also, I’m not sure what you mean by Smalltalk being much less dynamic than Ruby. You do know that Ruby is essentially Smalltalk with a Perlish syntax, right?

    I shouldn’t have used the word “dynamic”. I was trying to refer to the syntax and general behavioural spec of Ruby, which is notoriously difficult to interpret and was, I thought, pretty resistant to preprocessing. Maybe a better word would have been “complex” or “flexible”.

    So if the MagLev team are executing Ruby code on a mature optimized Smalltalk VM, I don’t have any trouble believing that they’ve seen massive improvements in speed. These are the same improvements that the JRuby, Rubinius (especially Rubinius, which is modeled directly on the Smalltalk implementation), et al are all expecting to see once they are able to focus on optimization. MagLev is just already there, because they don’t have to re-write the Smalltalk-style optimizations from scratch.

    I also don’t have any difficulty believing that some impressive speed gains are possible once the VM is mature and optimised. YARV, for example, is around twice as fast as MRI and it’s very possible that Maglev could be even faster.

    What I do have trouble believing is a 15 or more times gain. That would mean that Ruby was now comparable speedwise to Java, and I don’t think anyone will argue with me if I say Ruby’s more dynamic than that. That magnitude of improvement is unheard of in any software development project I’m aware of. I’m not going to say it’s absolutely impossible – but again, I’m going to need to see it to believe it.

    It is entirely possible they’ve seen massive improvements in speed – in certain cases, and no doubt it was those cases which were shown off at Railsconf. But the performance of a language is based on across-the-board speed, not some little tweak here or there. As I mentioned above, Ruby 1.9.0 today is about 27x slower than C. The fastest Smalltalk is about 9x slower than C. You might be able to find some neglected backwater here or there and optimise it until it’s 15x or even 60 times faster – but across the board?!

    I can’t get my head around this point. Even in this comment thread you’ve got people who seem to believe that Maglev is going to make Ruby twice as fast, or in the worst case only twice as slow, as C. I wonder if I live on the same planet as anyone who can believe that without some pretty stunning proof. Even better, apparently some benchmark maxed out at 110x faster – if so, maybe we should start thinking about re-implementing C in Maglev Ruby, the new fastest language of all time by a factor of four.

    More to the point, showing off these ultra-fast benchmarks results before the language, with all its nuances, is even close to being fully implemented is really irresponsible and distasteful to me. Incomplete benchmarks in isolation are useless, everybody knows it – I know it, you know it, they know it. But knowing that – they showed them anyway. Why? It’s a marketing tactic, nothing more. It reminds me pretty strongly of other shoddy marketing tactics I’ve seen – particularly in the computer game industry – and doesn’t inspire faith in the company one bit. Worked pretty well, though, I have to admit.

  63. SmalltalkGuy Says:

    Sho,
    You ask
    “Here is the specific comparison of Ruby 1.9.0 vs. Smalltalk VisualWorks. Where is this huge technological lead?”

    I still see a big lead in benchmarks where really the VM is running the code.
    Other results are better for ruby because there’s more C code involved.
    Of course you can speed up “your” language by implementing almost all of your libraries in C, but that limits you in extending your libraries.

    Who said that in (all) real world applications Ruby could be 15(insert your factor here) times faster?

    Also VW was not always the fastest Smalltalk implementation on the planet. Animorphic Systems had an implementation in the 90’s that was at least 5 times faster. Those guys went to Sun. You can see the result of their knowledge by comparing the Ruby results against Java. Java is still in certain cases more than 100 times faster than the (not yet released) Ruby 1.9.

  64. Sho Says:

    SmalltalkGuy,

    (Deep sigh)

    I still see a big lead in benchmarks where really the VM is running the code.
    Other results are better for ruby because there’s more C code involved.
    Of course you can speed up “your” language by implementing almost all of your libraries in C, but that limits you in extending your libraries.

    OK, I agree with you there. It’s better, generally, to pursue performance as much as possible within the context of the VM. More extensible, more centralised, fewer “moving targets”, more pure in general.

    However, it’s the results that count, native C or VM or whatever. The vast majority of people do not need to extend Ruby in a way that would conflict with its mode of implementation.

    And personally, the fact that the C implementation is fully open source is, to me, a powerful vote back in its favour. It may well be more difficult to extend some library if it’s implemented in C – it might also be completely impossible to do so on a proprietary VM like the one announced by Gemstone.

    Who said that in (all) real world applications Ruby could be 15(insert your factor here) times faster?

    The whole reason I wrote my article is because a great many people were claiming or implying exactly that. Gemstone’s presentation implied gains of between 15x and 60x. If they’d just claimed 2x or 3x or something, there wouldn’t have been such an explosion of hype and I wouldn’t have been incensed enough to write this post!

    Also VW was not always the fastest Smalltalk implementation on the planet. Animorphic Systems had an implementation in the 90’s that was at least 5 times faster. Those guys went to Sun. You can see the result of their knowledge by comparing the Ruby results against Java. Java is still in certain cases more than 100 times faster than the (not yet released) Ruby 1.9.

    [Citation Needed]

    Here we go again with Smalltalk folks claiming huge performance numbers from “secret Smalltalk techniques”. “At least” 5x faster across the board would make Smalltalk, a dynamic language, faster than the most recent implementation of Java, which is statically typed!

    Give me a break. Point to some hard numbers from a reliable source that back up your assertions or I’m going to continue with my assumption that you’re pulling these “facts” straight out of thin air.

  65. Juixe TechKnow » MagLev Says:

    [...] Maglev and the naiivety of the Rails community [...]

  66. More on Maglev » Andrew Rollins Says:

    [...] Maglev and the naiivete of the Rails community [...]

  67. SmalltalkGuy Says:

    A first Maglev result:

    http://antoniocangiano.com/2008/06/05/maglev-handles-trees-like-a-monkey/

    Maglev almost as fast as C/C++.

    For a Java to Ruby comparison, just select the right menus from the web page you cited yourself:
    http://shootout.alioth.debian.org/gp4sandbox/benchmark.php?test=all&lang=yarv&lang2=java

    Java being around 222 times faster on the mandelbrot test and more than 100 times faster on other tests.

  68. Sho Says:

    SmalltalkGuy:

    Well, my initial thought about that benchmark is that it’s almost a perfect test of what’s bad about MRI. Object allocation and GC are famously inefficient in MRI and that script does practically nothing else; one could hardly find lower-hanging fruit. Charles Nutter and “Ralf” make insightful comments along those lines on that post.
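
    For anyone who hasn’t clicked through: the shape of that benchmark is basically the following – my own paraphrase, not the exact script:

        # Allocation/GC stress: build and immediately discard deep trees of tiny objects.
        class Node
          attr_reader :left, :right
          def initialize(left, right)
            @left, @right = left, right
          end
        end

        def build(depth)
          return Node.new(nil, nil) if depth.zero?
          Node.new(build(depth - 1), build(depth - 1))
        end

        10.times { build(16) }   # nearly all of the time goes to object creation and GC in MRI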

    Furthermore, he leaves out the result for YARV, which is over 3 times faster than MRI on that test. I can’t understand why – Ruby 1.9.0 is the benchmark for “official” Ruby implemented with performance in mind.

    Here’s the full list of results from the debian “shootout”. As you can see, Smalltalk (VW) does extremely well on that test; a mere 2.8 times slower than the fastest result, which is Java. Obviously the test favours a VM over “real” MM, and as discussed those speeds do not carry over into the general testing results.

    That said, it is pretty impressive. Some of my own long-running AR import/dump scripts would benefit greatly from that kind of optimisation. I guess we’ll have to wait for the full tests to be released.

    Wish I could just run the tests myself. I’m curious and *want* to do it. I hate this closed source BS where we’re all waiting on the word from one “official” tester.

    UPDATE: I could swear those rankings now look quite a bit different than they did a day or two ago.

  69. JeanHuguesRobert Says:

    “JeanHuguesRobert Says:
    June 3rd, 2008 at 6:30 pm

    Sho, I believe you still get some remaining time to pretend that you were joking, but not much. ;-)

    Gemstone is the best thing that has happened to Ruby in years (Ruby 2.0 is … second). Period. ”

    Sho. You are cooked by now. I warned you. Sorry. http://www.chadfowler.com/2008/6/5/maglev

    From now on I suspect that the Ruby community will use the expression “It’s a Shoims” to point out an unjustified excess of skepticism about Ruby’s bright future.

    ;-) You’re famous now. Kudos.

  70. Wincent Colaiuta Says:

    Sho, your fundamental argument is correct and fairly well put. But in responding to these comments you’ve fallen into the trap of discussing mere technicalities which aren’t relevant and for which full information isn’t yet available anyway.

    What are the facts that it doesn’t take a doctorate in compiler construction, or inside information about this closed-source, unreleased product, to appraise at this stage?

    One: the product is vaporware. It doesn’t matter how good it is when it’s completed and released; the fact is that right now it’s not completed and it’s not released. Let’s talk again when it is actually done and out.

    Two: it’s not Ruby yet. It doesn’t matter how fast it is until it can actually run everything that MRI can, and in the same way that MRI can. Offer something that is “95% Ruby” (or any other percentage less than 100%) and it’s simply not interesting, no matter how fast it runs.

    Three: microbenchmarks are irrelevant to me as a Rails application developer; show me real-world application benchmarks with real-world data sets. There aren’t any such benchmarks yet, and there won’t be either until this thing is actually Ruby and Rails compatible.

    Four: the magnitude of the benchmarks does strain the limits of credulity, probably because they’re microbenchmarks. If it sounds too good to be true, it probably is.

    Five: a particularly vocal segment of the Ruby/Rails community has gone into its habitual delirium at the announcement of the newest and shiniest bauble, and is ready to entrust all of its production eggs to this new basket which was apparently whipped up in just three months.

    There is only one reasonable conclusion from a rational assessment of all these facts: that while Maglev is certainly interesting and it may be quite promising, perspective is called for and people should be adopting a conservative “wait and see” attitude rather than a “let’s piss our pants with excitement” one.

    There end the facts and the rest is all speculation and subjective opinion. For me, my own speculation and subjective opinion is that I am doubtful about this thing attaining 100% Ruby equivalence any time soon, I am sceptical about the real-world performance benefits compared to alternatives like YARV and JRuby (and I’m conservative in my deployment choices so I am not even thinking of switching to those from MRI at this stage, let alone switching to this newest kid on the block), and I’m not willing to sell out to a commercial, closed-source platform in exchange for as-yet-unknown performance improvements, no matter how large they may turn out to be in the end.

  71. Sho Says:

    JeanHuguesRobert:

    I have to admit I’m pretty surprised to see that post. What the hell is Chad Fowler doing pitching for Gemstone?

    From now on I suspect that the Ruby community will use the expression “It’s a Shoims” to point out an unjustified excess of skepticism about Ruby’s bright future.

    That is completely untrue and shame on you for saying otherwise. I love Ruby and no-one is more excited about Ruby’s bright future than me. Take a look around this blog, why don’t you. I have embraced Ruby wholeheartedly and contribute all I can to the community.

    However, I believe that future is free, libre and open source. I also believe wholeheartedly in Open Source software and am more than a little dismayed to see so many people apparently forgetting about its principles in their headlong rush to endorse this product, which runs counter to most of them. I find it suspicious and more than a little perverse.

    Right now, Gemstone’s Ruby implementation is little more than a trialware binary which happens to perform decently on the computers of a few specially selected “thought leaders”. It is more closed than Microsoft’s IronRuby and every statement about its performance, including Chad Fowler’s, must be taken on trust. I find this situation to be pretty regrettable.

    We shall see if the product lives up to its claims and is such a leap forward in performance that it’s worth abandoning Ruby’s open source heritage to use it. For now, though, I stand by my skepticism regarding this trialware, vapourware, incomplete, closed-source commercial implementation.

  72. Sho Says:

    Wincent Colaiuta:

    Right you are on every single one of those points. I’m beginning to wish I’d followed your wisdom in not allowing comments on your blog! On the plus side, it’s been educational and someone had to come out and say it.

    A Ruby implementation is 100% compatible with MRI or it’s no Ruby implementation at all. For the time being, Maglev is not a Ruby implementation and until it is, this discussion is over.

  73. Ruby is the R in Rails Says:

    [...] A damned fine presentation I’m told but Charles Nutter of JRuby is not convinced and other people are already dismissing outright Maglev’s claim to scale. 100 times performance improvement is [...]

  74. Kragen Javier Sitaker Says:

    It seems to me that unless JRuby (and, more to the point, YARV) started from (and still use) some fundamentally flawed, obsolete theory of language interpreters/VMs, then the idea that there are these massive uncollected performance gains just waiting for someone to harvest by implementing a more mature VM is hard to believe.

    I think that’s a pretty accurate summary of the state of affairs. YARV started from and still uses a fundamentally inefficient, obsolete theory of language interpreters, so there are these massive uncollected performance gains just waiting for someone to harvest by implementing a more mature VM. Non-JIT interpretation has been obsolete for more than ten years now, but people outside academia mostly don’t know that yet. Hopefully Maglev, v8, SquirrelFish Extreme, HIPE, and TraceMonkey will bring that into the mainstream now, the way Perl (and later Java) did for garbage collection. (Malloc/free had been obsolete for most applications since 1982.)

    JRuby is crippled differently: although its fundamental model is valid (compile to an intermediate language which then gets JITted to machine code), it compiles to bytecode for a JIT designed for a language with a very different type system, and they don’t have the freedom to change the JIT or the bytecode format. This is an argument in favor of open source, obviously.

    Gemstone is a high-end Smalltalk system; you haven’t heard of it because they make their money by extracting large amounts of money from a small number of customers. Progress ObjectStore is still widely used, but mostly as what Martin Fowler calls an ApplicationDatabase, not an IntegrationDatabase — largely inside CAD systems and the like. ODBMSs haven’t been as much of a success as the Statice guys were hoping when they started ObjectStore, but they certainly haven’t suffered the resounding market rejection you paint.

  75. Sho Says:

    Kragen, thanks for the new comment on this old post.

    Well, what you say about the interpreter is true enough, even YARV is far from the state of the art. It’s possible that a more modern VM will indeed produce large gains. The argument is how large, while still maintaining full compatibility?

    We don’t know, and the product is still vapourware. Recent benchmarks indicate it’s around twice as fast as MRI in current alpha stage. Hardly a pants-jizzing improvement, then.

    The way some of the above “experts” talk, Ruby is such a minor variation on Smalltalk that the adaptation of Gemstone’s VM to implement it is barely an afternoon’s work for those hitherto-unsung geniuses. Well, six months later, what do we have? A half-working alpha which is slower than YARV.

    Now, it is certainly possible that they will manage to eke out further gains, if they ever do finish it. But let’s remember the atmosphere of unquestioning hype which surrounded their premature announcement, and which prompted this highly skeptical post. Let me quote from one of the articles I linked to:

    we got to see some preliminary performance data that showed an order of magnitude or TWO increase over MRI 1.8

    This is the reason I was so negative in my post. These Gemstone guys got up on stage, showed a couple of completely irrelevant micro-benchmarks that showed their incompatible alpha running some tiny subset of something, and obviously left people with the impression that their VM would be one or two orders of magnitude faster than MRI.

    I’d like to point out I didn’t say it was completely and utterly impossible. I just said I don’t believe a fully compatible implementation will be able to reach anything like those speeds, and I would require a pretty high standard of proof before I abandoned that position. 6 months later, that proof has not arrived, and in fact I seem to be vindicated somewhat by the yawn-inducing recent benchmarks, which show a massive speedup on some tasks but a far lesser increase across the board.

    Still, we’ll wait and see. To quote myself: A Ruby implementation is 100% compatible with MRI or it’s no Ruby implementation at all. For the time being, Maglev is not a Ruby implementation and until it is, this discussion is over.

    Your explanation of Gemstone’s business model of locking a small number of customers into their proprietary system and then screwing them for all they’re worth does not exactly fill me with love.

    Regarding your comment about ODBMSs: Funnily enough, I am actually quite a fan of them. For interest’s sake I actually wrote a kind of mini-ODBMS earlier this year trying to store object graphs around the place in some kind of ordered fashion, basically marshalling in and out of CouchDB with a neat object linking system. It was great fun. However, your comment about not suffering market rejection is disingenuous.

    When I talked about DBs, I believe it was quite self-evident I was referring to shared data repositories – what Fowler called an IntegrationDatabase. An ApplicationDatabase is a much more specialised, varied, and usually small-scale, animal. Yes, applications store their state, and that state is often in the form of objects, whether marshalled or stored in an ODBMS. There are, indisputably, many such applications, and many such implementations. However, they’re not counted in the IntegrationDatabase world I was talking about. I wouldn’t have thought it necessary to even draw that distinction, but there you are.

    Anyway, in summary, I have not yet seen any proof, or even any good evidence, that Gemstone’s claims were, or will be any time soon, realistic. I retain my distaste for the way the company very deliberately hyped its vapourware product, and I stand by my disappointment of the Rails community’s unquestioning acceptance of that hype. However, it was a while ago now, everyone’s moved on, and I think Gemstone will really only hurt themselves in the end, so whatever.

  76. Wincent Colaiuta Says:

    Yeah, Sho’s vindicated and “Malloc/free had been obsolete for most applications since 1982” is patently false. There are a lot of current, great applications that use malloc/free. Of the 15 apps I have running on my desktop right now, pretty much _everything_ is a malloc/free beast. All of these are modern, high-performing, useful applications. The one exception is a Java app. Funnily enough, it is the slowest, most bloated, and ugliest of the lot.

  77. Beoran Says:

    Despite the storm caused by this blog post, I must say I generally agree with the gist of it. Free software comes first. Even if Maglev turns out to be a hundred times faster than MRI, I won’t go anywhere near Maglev unless its source is released under a free software license. Until that happens, Maglev is of zero interest to me.

  78. Sho Says:

    Wow, I’m amazed anyone is still watching this old post. If I’d known it would get any kind of attention, I would have written it a lot better. I especially wish I could get rid of that crap about the DBs, which was pretty unnecessary and just distracts from the main thrust.

    Vindicated? Well, maybe a little, but the thing is if I claim vindication I’m committing the exact same offense I railed against – drawing conclusions from inadequate data. Time will tell.

    I defer to Wincent’s analysis of the situation regarding GC’ing languages. In that area, not only do I not know what I’m talking about, I don’t even pretend to, unlike Ruby ;-)

    Beoran: I agree completely. I will never lock myself into some proprietary system. Anyone who developed anything using the whole Maglev toolchain would be chained to it for life. This is so against my principles, and what I believe the Ruby principles to be, that I also don’t care about any other factors, I’m not using it. There are very few exceptions to this rule.

    Non-portable Ruby apps running on a (binary-distributed) proprietary VM – which you got from a company that makes “their money by extracting large amounts of money from a small number of customers.” Gee, where can I get in line for that dream come true?
