Here’s my initial stab at a Rails Session model for CouchDB. The marshalling stuff is taken from the example SqlBypass class in the ActiveRecord code.
You’ll need a recent (trunk) CouchDB, probably.
class CouchSession < Hash
  @@db = CouchRest.database!('http://localhost:5984/sessions')

  attr_writer :data

  def self.find_by_session_id(session_id)
    self.new(@@db.get(session_id))
  rescue
    self.new(:id => session_id)
  end

  def self.marshal(data)
    ActiveSupport::Base64.encode64(Marshal.dump(data)) if data
  end

  def self.unmarshal(data)
    Marshal.load(ActiveSupport::Base64.decode64(data)) if data
  end

  def initialize(attributes = {})
    self['_id']            = attributes['_id'] ||= attributes[:id]
    self['marshaled_data'] = attributes['marshaled_data'] ||= attributes[:marshaled_data]
    self['_rev']           = attributes['_rev'] if attributes['_rev']
  end

  # lazily unmarshal on first access
  def data
    unless @data
      if self['marshaled_data']
        @data, @marshaled_data = self.class.unmarshal(self['marshaled_data']) || {}, nil
      else
        @data = {}
      end
    end
    @data
  end

  def loaded?
    !!@data
  end

  def session_id
    self['_id']
  end

  def save
    self['marshaled_data'] = self.class.marshal(data)
    self['data']           = data      # raw copy, for troubleshooting/interest
    self['updated_at']     = Time.now
    save_record  = @@db.save(self)
    self['_rev'] = save_record['rev']
  end

  def destroy
    @@db.delete(self['_id'])
  end
end
Nice and short – possibly the shortest Rails session class I have seen. The beauty of CouchRest/CouchDB! And since we descend from Hash, we can just save the object straight – after marshalling, of course. Cool, huh?
Note that I am actually writing the raw data as well as the marshalled data into the saved doc, for troubleshooting/interest purposes. Feel free to remove that.
Not pretty, but it works. Just save it like a normal model. You’ll need to put these into environment.rb:
config.action_controller.session_store = :active_record_store
CGI::Session::ActiveRecordStore.session_class = CouchSession
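Once that’s in place the normal session flow just works; driving it by hand from script/console looks something like this (the session id here is just a placeholder):

s = CouchSession.find_by_session_id('test-session-id')   # new or existing
s.data[:user_id] = 42
s.save      # marshals the data and writes the doc (plus the raw copy) to CouchDB
s.destroy   # removes the doc again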
Note also that I have ignored any differentiation between the record ID and the session ID, negating the need for any special overrides in ApplicationController. However, the session IDs Rails generates are large and you might find them unattractive in CouchDB – it would be fairly simple to separate them, but then you’d need a new map view and an override. I feel it’s simpler to just use the Session ID as the doc ID and damn the torpedoes. YMMV.
Improvements? See something wrong with it? Let me know!
September 19th, 2008 at 9:42 pm
What’s performance like?
September 20th, 2008 at 12:23 am
Well, similar to any other CouchDB access I guess. I would imagine any performance issues in the above code arise from the data marshalling, but since that’s integral to Rails’ session handling I didn’t mess with that.
I don’t know any way of easily benchmarking the entire session creation/access/deletion loop from within Rails, if you know of one I can run it for you.
Apart from that, access is consistent with any CouchDB interaction – decently fast. For individual accesses, though, CouchDB is much faster at reading than writing. Single-threaded writes of single records don’t perform as well, partly due to the HTTP overhead (a POST or PUT per doc) and partly down to disk speed – not helped by this laptop’s crappy HD.
Anyway, I realised I’d never actually benchmarked CouchRest, so here’s a brief speed test I whipped up. It simulates the approximate record size of a Rails session, i.e. three fields and about 1K of data.
On the laptop:
couchdb/couchrest speed test
number of docs: 500
== WRITING ==
time to write docs: 85.907737
time per doc: 0.171815474
docs per second: 5.82019754518734
== READING ==
time to read docs: 3.824564
time per doc: 0.007649128
docs per second: 130.733856199033
## finished ##
Boy, that was even slower than I thought it would be. Worryingly slow, in fact. I’m going to run this test on the server and see what that gets.
September 20th, 2008 at 12:28 am
BTW i just ran the same test on a machine with a decent disk:
couchdb/couchrest speed test
number of docs: 500
== WRITING ==
time to write docs: 4.603314
time per doc: 0.009206628
docs per second: 108.617400420653
== READING ==
time to read docs: 1.314841
time per doc: 0.002629682
docs per second: 380.27411679435
## finished ##
Speaks for itself. That’s much more in line with my prior experience. Whew, scared myself for a second there.
Don’t know what’s wrong with this laptop, the disk is absolutely slow as fuck. Maybe I need to clear some stuff off it and defrag or something…
September 20th, 2008 at 12:34 am
Code I used for the speed test:
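It was more or less the following – reproduced here as a minimal sketch, using the same CouchRest database!/save/get calls as the session class above:

require 'rubygems'
require 'couchrest'

DOCS    = 500
PAYLOAD = 'x' * 1024   # ~1K of data, like a marshalled session

db = CouchRest.database!('http://localhost:5984/speedtest')

puts 'couchdb/couchrest speed test'
puts "number of docs: #{DOCS}"

puts '== WRITING =='
start = Time.now
ids = (1..DOCS).map do |i|
  # three fields, approximating a session doc
  db.save('marshaled_data' => PAYLOAD,
          'data'           => "doc #{i}",
          'updated_at'     => Time.now.to_s)['id']
end
elapsed = Time.now - start
puts "time to write docs: #{elapsed}"
puts "time per doc: #{elapsed / DOCS}"
puts "docs per second: #{DOCS / elapsed}"

puts '== READING =='
start = Time.now
ids.each { |id| db.get(id) }
elapsed = Time.now - start
puts "time to read docs: #{elapsed}"
puts "time per doc: #{elapsed / DOCS}"
puts "docs per second: #{DOCS / elapsed}"

puts '## finished ##'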
September 20th, 2008 at 5:00 am
I think the way to test the speed of this as a session backend would probably be to set up a brand new Rails app with a controller action that does pretty much nothing other than store something in the session (i.e. a flash) and then hit that action with ApacheBench. So you’d run two or more tests: one with your Couch backend and others with the standard Rails session stores (disk, MySQL, cookies etc.).
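Something as dumb as this would do for the action (the names are just placeholders):

class BenchController < ApplicationController
  # touches the session so the configured store is exercised on every hit
  def touch
    session[:hits] = (session[:hits] || 0) + 1
    render :text => session[:hits].to_s
  end
end

# then hammer it with, say:
#   ab -n 1000 -c 10 http://localhost:3000/bench/touch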
September 20th, 2008 at 5:21 am
If you write it, I’ll run it : )
I can already tell you the result, though, which is MySQL beating Couch by a mile. Individual record read/write performance is not CouchDB’s strength. It is competent, but I doubt it can hold a candle to MySQL. No-one uses the file store, so I don’t think that is worth testing, and cookies are an exceptional case – the main problem with them is the latency they cause, which is hard to simulate in-machine.
Good idea though, I might try it out when I get a chance, foregone conclusions notwithstanding.
September 20th, 2008 at 5:43 am
Cookies are hardly an exceptional case; I think the cookie-backed session store is the default since Rails 2.0, isn’t it?
September 20th, 2008 at 9:28 am
Exceptional as in not like the others. Cookies are a storage option whose drawbacks only manifest themselves fully over slow internet links. That makes any local “speed test” futile.
September 20th, 2008 at 4:33 pm
Hm, I don’t know about that. My understanding is that the Cookie store was adopted as the default precisely because it’s so much faster than the others. And when the net link is slow enough to impact the tests it would drown out the relative speed differences of the backends to the point where the tests would be meaningless, so I don’t think you should even worry about that scenario.
So I would expect the speed order to be something like (from slowest to fastest): Couch, filesystem, MySQL, cookies. Of all of those I’d expect cookies to scale the best and filesystem to scale the worst (imagine 100,000 sessions: 100,000 files).
September 20th, 2008 at 5:05 pm
We’ve talked about this before, I think. I do not share Rails’ love of cookie sessions.
As my examples above mention, I find myself writing session files of approximately 1.5K. That is not a lot of data once it’s been through marshalling – a couple of preferences, the flash hash, some bits and pieces. 1.5K, not much for a DB.
If you store sessions in a cookie, that cookie is sent with *every* request to the site: every image, every CSS file, every JS script, every AJAX call. What’s the average size of a normal HTTP header? 200 bytes? Great, you’ve now bloated that by a factor of 8.
How many resources are called when a user requests your page? Maybe an average of 10? That’s 15K you are forcing the user to upload just to view your page. You can’t tell me that will have *zero* effect on perceived speed to the user!
I have a fundamental problem with this approach. Dumbly sending the entire session with every single HTTP request regardless of need is inefficient and against any kind of responsible design. Scaling sessions might be a problem for big sites, but throwing them into the cookie is a user-unfriendly blunt-edged anti-solution.
As you said, if the connection speed is *that* slow, it’s not like the user is having that great an experience anyway – but it can hardly help. The fact that most connections are speed-biased towards downloads just compounds the problem.
In its favour, I agree that Rails has done a stellar job integrating the system and making it “just work”. Maybe cookie sessions are absolutely fine for the vast majority of developers. However, I strongly dislike the idea – as you might be able to tell! – and so don’t use them.
I do agree with your scaling & speeds estimates, though. I doubt CouchDB will ever be as fast as straight MySQL for simple things like sessions. However, I would expect it to scale far better than the other server-side techniques.
UPDATE: the HTTP request for the main page here is 372 bytes for me.
September 20th, 2008 at 6:59 pm
Yeah, I agree with you about the drawbacks of the cookie session store. I personally don’t use it and never have. If your cookies weigh 1.5K then that’s pretty darn wasteful; that’s why they tell you to only store really small things like integers in the session, but I don’t know how big things like integers and strings end up once they’re marshalled into the session cookie and signed.
My initial objection to the cookie session store was from a security perspective. It’s only signed, not encrypted, and I just don’t like the idea of trusting the client to hold on to the session data. There have been security holes in Rails before, and there will be again, just like with any piece of software; who’s to say the session is really safe? That’s an oft-raised complaint about the store, and the stock reply is “don’t store anything really sensitive in the session”. I personally would rather just not use the store at all.
September 21st, 2008 at 1:49 am
The cookie I looked at was actually only 700B; I’ve seen 1.5K as an average elsewhere. It might be wrong.
700B looks like this:
That turns into 700B after marshalling and the addition of the ginormous session ID – by itself 362 bytes! It’s easy to see how chucking in a bit of flash text, some nice long previous URLs or “pages you just looked at”, a “remembrall” key or whatever could blow out the size real quick. You’re not just storing the values in a hash, remember – you’re storing the key, and the class of the key (and of the value) too. A single integer might be a byte of data, but add the coded indicator that it’s an integer, the key, and the class of the key, and you could be looking at 20 bytes of overhead. Doesn’t sound like much, but it adds up fast, and Rails doesn’t exactly encourage you to be niggardly with the session.
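You can see the overhead for yourself in irb – exact sizes vary a little between Ruby versions, but roughly:

require 'base64'

Marshal.dump(1).size                                   # => 4, for a "one byte" integer
Marshal.dump({:user_id => 1}).size                     # => 15, once the key and type tags come along
Base64.encode64(Marshal.dump({:user_id => 1})).size    # => 21, after Base64 inflates it by a third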
I don’t really know how much overhead the cookie implementation adds; will have to check that out. These are sizes from the DB – I presume the cookie ones are similar.
Whatever. I think the whole issue of scaling the session store has been blown out of all proportion anyway. Presumably people’s sites are not completely static but for the sessions – given the lightweight nature of the sessions I would have thought they were the least of anyone’s worries. How many Rails sites are there that are so popular they have actually outgrown, say, a decent dedicated DB sessions server? That could probably do several thousand transactions a second? My guess is, “none”.
I didn’t change over my sessions management because of scaling concerns, I changed it over because I am a stickler for simplicity and didn’t want to worry about running two types of database. If I ever get big enough that scaling sessions becomes an actual concern then I’ll likely be rich enough to pay someone else to worry about that shit.
September 21st, 2008 at 3:48 pm
Out of curiosity I had a look at the sessions table in my MySQL database. It has an (integer) id column, a VARCHAR(255) column for the session id (which looks to be a 32-character hex-encoded hash), a TEXT blob for the session data itself, as well as two DATETIME columns for “updated at” and “created at”.
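For reference, that’s exactly what the stock migration gives you; from memory it’s roughly:

class CreateSessions < ActiveRecord::Migration
  def self.up
    create_table :sessions do |t|
      t.string :session_id, :null => false   # the 32-char hex hash
      t.text :data                           # Base64-encoded marshalled session
      t.timestamps                           # created_at and updated_at
    end
    add_index :sessions, :session_id
  end

  def self.down
    drop_table :sessions
  end
end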
I have no idea what’s in the TEXT blobs because they look to be Base64 encoded (why aren’t they just binary data in a binary BLOB column?). Don’t know if they’re also encrypted somehow, or just encoded.
The average size looks to be about 100 characters (bytes), although some of the longer blobs run to 300 or more.
(In any case, think I’d better expire some of the old sessions… the table is starting to get pretty big…)
September 25th, 2008 at 2:38 am
Funnily enough, clearing my sessions table turned out to be trickier than I thought.
September 25th, 2008 at 3:22 am
Interesting. You’d think Rails had a built-in rake task that did something a little more nuanced than just dumbly nuking the whole sessions table.
Your wiki post also reminds me of how much I hate screwing around with datetime fields in the DB. I haven’t done it in sessions, but a nice trick is to include a simple “seconds since epoch” integer field as a supplement to the native datetime. Much easier to manipulate, for me anyway. And you can write a nice rake task to just do
Session.destroy_all(['updated_at_epoch < ?', 4.weeks.ago.utc.to_f])
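Wrapped up as a task it might look like this – a sketch, assuming the hypothetical updated_at_epoch column from the trick above (delete_all rather than destroy_all, since instantiating every stale record just to kill it would be wasteful):

# lib/tasks/sessions.rake
namespace :sessions do
  desc 'Purge sessions untouched for four weeks or more'
  task :purge => :environment do
    Session.delete_all(['updated_at_epoch < ?', 4.weeks.ago.utc.to_f])
  end
end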
Ah well, sounds like you have the problem well in hand.
BTW the atom feed on your blog is giving me a 500 error. Can’t tell you for how long it’s been like that; I only just noticed since your blog’s feed is in a folder in Mail.app. I had assumed you were just feeling quiet but saw some fairly recent entries upon visiting your site just now…
September 25th, 2008 at 3:42 am
I am guessing they would look very similar to the unmarshalled data I posted a few comments up. I don’t think they are encrypted but don’t actually have a recent enough Sessions DB to say – I abandoned the native Rails sessions code some time ago, as mentioned in several previous posts.
The simple unmarshal code in the post will answer your question, anyway. If it works, they weren’t encrypted.
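i.e. something like this in script/console, straight against the stock ActiveRecord store:

# read the raw column, not the unmarshalling accessor
blob = CGI::Session::ActiveRecordStore::Session.find(:first).read_attribute(:data)
Marshal.load(ActiveSupport::Base64.decode64(blob))   # blows up if it's been encrypted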
You tell me, pal. And while you’re telling me stuff, I’d also like to know why the hell the Flash system (which is basically a hash) has to be implemented using a special “FlashHash” class just so it can respond to a few “convenience” methods.
Without this bullshit magic for magic’s sake, the sessions class above would be about 70% shorter. What the hell is wrong with just storing the session as a JSON-encoded hash?
You might have noticed my ardour for everything Rails has somewhat diminished over the last year or so. This kind of crap is exactly the reason why.
September 25th, 2008 at 4:10 am
Ah, thanks for letting me know about the 500 error. Hadn’t noticed it myself. Haven’t investigated it yet, but no doubt it will be fall-out from one of the “upgrades” between Rails 2.0 and 2.1.1.
You know, this kind of breakage is exactly why I now make a conscious, disciplined effort to refer to software “updates” rather than “upgrades”.
September 25th, 2008 at 4:17 am
Ok, fixed the 500. Thanks for the heads-up.
September 25th, 2008 at 4:34 am
Yet another reminder that I need more specs… Thing is, I’ve never written specs for XML feeds before. Looks like I’ll have to figure out how.
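Probably something as dumb as this would have caught it (controller name hypothetical):

describe PostsController do
  it 'serves the atom feed' do
    get :index, :format => 'atom'
    response.should be_success
  end
end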
September 25th, 2008 at 4:51 am
Ah, great. I was missing my twice-weekly dose of “involuntary reboot log” misery.
Sigh. I also haven’t done any testing for XML. Fucking testing man, it’s a god damn never ending rabbit hole.
What was the problem, out of interest?
September 26th, 2008 at 1:14 am
The problem was a change in the way routing worked in 2.1. Off the top of my head it was a side-effect of changing from “:controller” to “:as”.
November 6th, 2008 at 8:03 pm
Resurrecting this old thread… I let the sessions table grow and just tried to purge it. 25 minutes to execute the query, even with the website shut down and the maintenance page up (i.e. no contention for, or other connections to, the database)!
In the light of this I’m actually thinking of switching to the cookie store, even though I’ve never liked it (for the reasons already discussed here and elsewhere).
More details on the query performance at: http://rails.wincent.com/issues/1142
November 6th, 2008 at 8:49 pm
25 minutes does seem like an awfully long time for such a simple operation, even with the quantity of records you describe. Was the table indexed on created_at?
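If not, adding one is a one-line migration (on whichever timestamp the purge filters by):

class IndexSessionsForPurging < ActiveRecord::Migration
  def self.up
    # lets the purge walk an index instead of scanning every row
    add_index :sessions, :updated_at
  end

  def self.down
    remove_index :sessions, :updated_at
  end
end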
It might be interesting to recreate a huge table like that, with random timestamps, and test different strategies of going about that task. I would also be interested to test CouchDB against that scenario, I haven’t done that yet.
Still, you had let stale sessions accumulate for 30 days or more. After the initial big purge, surely the daily cron job wouldn’t take more than a minute or so? I wouldn’t have thought that was such a big deal, certainly less painful than redoing the whole thing with cookies…
November 10th, 2008 at 3:00 pm
To further update this post, I’d just like to note that I don’t even use Rails sessions any more. They required so many hacks to make them work the way I wanted – and they never really worked that well anyway – that I just turned them off and replaced them with a (much simpler) custom system.
It took a while to get over my fear of messing with the Rails Black Box™ but eventually I just wrote my own, which was very easy, and now I’m much happier. Weird custom classes to store Flash data? Marshalling plain text for unknown reasons? Fuck that shit.
I can write up a (brief) tutorial of how to do this if anyone is interested.
November 12th, 2008 at 6:08 am
Yeah, would be interesting to hear more details about what you did.
And you’re right: 25 minutes is ludicrous. But it seems to be consistently ludicrous. Maybe that kind of query is an insane edge case that triggers a hideously inefficient codepath in MySQL. Who knows?
Good idea to see what indexes are on that table (I don’t know, as it was created by Rails using its defaults) and perhaps to play around seeing how long it takes to prune it using different methods.
On the other hand, switching to cookie-based sessions would be a one-line environment.rb tweak, so I may just go that way instead.
November 12th, 2008 at 6:09 am
Incidentally, tried updating to the Rails 2.2 RC and everything broke hideously. Still haven’t had time to figure out why. This is why I hate updating Rails.