kde.rb: Why You Should Use Ruby
I have been having a lot of fun with Ruby lately.
I now have a Ruby-based renderer for KDE Dot News. The great thing is that I've dumped all the articles and comments from Zope onto the filesystem and it works. It works great... so far.
Ruby is the latest technological craze from Japan by a guy named matz. It has weird stuff like "objects" in it. You could take the number 1, and because it is an object, you can make it quack like a duck.
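For the curious, here is roughly what that looks like. This is only a minimal sketch: the `quack` method is made up for illustration, and it works by reopening the built-in Integer class, which is fine for a demo and a questionable idea almost anywhere else.

```ruby
# Reopen the built-in Integer class and teach every integer a new trick.
class Integer
  def quack
    "Quack! (says #{self})"
  end
end

puts 1.quack                 # prints "Quack! (says 1)"
puts 1.respond_to?(:quack)   # prints "true"
```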
It gets much worse. Did you know that I can connect from anywhere on the Internets, or at the very least within throwing distance of a Unix Domain Socket, to my running dot.rb instance, and inspect and change anything on the fly?
Yes, it's possible and it's working and it doesn't have any significant overhead. I could give KDE Dot News a sickly blue corporate background, post goatse links, clear the cache tables, or call the garbage collector, all on the fly from any strategically located VT100.
All just for a few lines of code. The distributed stuff works because of an anomaly called DRb -- I don't really know why the rest works.
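To give an idea of how little code the DRb part can take, here is a minimal sketch. It is not the actual dot.rb code: the `SiteState` class, its methods and the socket filename are all hypothetical, and the only real API it leans on is the `drb` library with its `drbunix:` protocol. The "client" runs in the same process to keep things short; normally it would be a separate irb session or script that knows the URI.

```ruby
require 'drb/drb'
require 'drb/unix'    # adds the drbunix: protocol (Unix domain sockets)
require 'tmpdir'
require 'fileutils'

# Hypothetical stand-in for the live state of a running site.
class SiteState
  attr_accessor :background
  attr_reader :cache

  def initialize
    @background = 'white'
    @cache = { 'article-1' => '<html>...</html>' }
  end

  def clear_cache
    @cache.clear
    :cleared
  end

  def collect_garbage
    GC.start
    :collected
  end
end

# Server side: export one object on a Unix domain socket.
socket_dir = Dir.mktmpdir('dot-sketch')
state = SiteState.new
DRb.start_service("drbunix:#{File.join(socket_dir, 'dot.sock')}", state)

# Client side: a proxy object that forwards method calls to the URI.
remote = DRbObject.new_with_uri(DRb.uri)
remote.background = 'sickly blue'
remote.clear_cache
remote.collect_garbage

puts state.background   # prints "sickly blue"
puts state.cache.size   # prints "0"

# Tidy up the service and the temporary socket directory.
DRb.stop_service
FileUtils.remove_entry(socket_dir)
```

Anyone who can reach that socket can call anything the exported object offers, so where the socket lives and who may open it matters a great deal.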
If you're a KDE developer, you should really give Korundum a serious look. This stuff could blow .Net and Java away.
So, anyway. About dot.rb.
Management-wise, dot.rb's 100% filesystem-based backend is nothing less than a godsend compared to having to deal with a gigantic opaque database.
It's also pure bliss to be able to write dynamic HTML code using Ruby's heredoc and powerful string interpolation features. Ruby has all sorts of template engines but I didn't have to bother with any of that. Never again.
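A minimal sketch of the heredoc approach, with made-up names (`h`, `render_comment`) and made-up markup rather than the Dot's real HTML. The little escaping helper is my own addition for the example, so that user-supplied text can't inject markup.

```ruby
# Minimal HTML escaping for text that ends up inside the page.
ESCAPES = { '&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;' }.freeze

def h(text)
  text.to_s.gsub(/[&<>"]/, ESCAPES)
end

# Render one comment; the heredoc itself is the template.
def render_comment(author, subject, body)
  <<~HTML
    <div class="comment">
      <h3>#{h(subject)}</h3>
      <p class="byline">by #{h(author)}</p>
      <p>#{h(body)}</p>
    </div>
  HTML
end

puts render_comment('anonymous', 'Ruby & KDE', 'Korundum <3')
```

Because the template is ordinary Ruby, shared fragments are just methods that return strings, which is where the code sharing comes from.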
I am so glad to be getting away from the headache that is DTML. Dot's HTML code finally looks tractable, since there is much more sharing and consequently much less code.
dot.rb is fast on ext3, practically instantaneous on my localhost, and even faster on ReiserFS. I've run it on the same machine as the Dot, in parallel to the "production" site, and it was still fast while current Dot (and wiki) crawled.
dot.rb uses a tenth of the memory and half the disk space of present Dot at its worst (meaning several weeks without packing the DB and a few hours of uninterrupted memory leaking).
Of course, dot.rb hasn't been under anything like the load Zope is. I've tested dot.rb with the full KDE Dot News db and at one point I had 5 simultaneous recursive wgets pulling content and it was still fast... Incidentally, this is using Ruby's built-in webserver; I implemented the logic in a few lines.
But that's still not a realistic load and, of course, there are big gaps in the functionality which could easily close the performance divide come judgement day. Hopefully by the time I'm finished we'll have a 10GHz machine waiting and that won't be a concern.
The reason it's fast, of course, is that dot.rb is particularly optimised for typical usage patterns of the Dot. Right up front I cache the most recent 100 articles and accompanying comments in a nice little forest of objects. I've also got a dynamic cache table that starts out empty and keeps the most frequently accessed article trees around.
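The caching scheme can be sketched along these lines. This is an illustration under assumptions, not the dot.rb source: the `ArticleCache` class, the `dynamic_limit` parameter and the "replace the least-requested entry" rule are my guesses at one reasonable way to keep the most frequently accessed trees around.

```ruby
# A fixed set of recent articles loaded at startup, plus a bounded table
# of whatever else gets requested most often.
class ArticleCache
  def initialize(recent_ids, dynamic_limit: 50, &loader)
    @loader = loader                 # block that loads an article tree by id
    @recent = recent_ids.map { |id| [id, loader.call(id)] }.to_h
    @dynamic = {}                    # starts out empty
    @hits = Hash.new(0)              # request counts for non-recent articles
    @dynamic_limit = dynamic_limit
  end

  def fetch(id)
    return @recent[id] if @recent.key?(id)

    @hits[id] += 1
    return @dynamic[id] if @dynamic.key?(id)

    tree = @loader.call(id)
    admit(id, tree)
    tree
  end

  def cached?(id)
    @recent.key?(id) || @dynamic.key?(id)
  end

  private

  # Keep the new tree only if there is room, or if it has been requested
  # more often than the least-requested entry currently held.
  def admit(id, tree)
    if @dynamic.size < @dynamic_limit
      @dynamic[id] = tree
      return
    end
    coldest = @dynamic.keys.min_by { |key| @hits[key] }
    return unless @hits[id] > @hits[coldest]

    @dynamic.delete(coldest)
    @dynamic[id] = tree
  end
end

# Usage: the block stands in for reading an article tree off the filesystem.
cache = ArticleCache.new([3, 2, 1], dynamic_limit: 1) { |id| "article tree #{id}" }
cache.fetch(3)    # recent: served from the startup set
cache.fetch(42)   # first request loads it and takes the free slot
cache.fetch(42)   # now served from the dynamic table
cache.fetch(99)   # requested once, less popular than 42, so not kept
```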
The HTML pages are completely dynamically generated, including Flat Forty and the All Articles list -- both of the latter tend to kill present Dot. I do however use a two-level string interpolation in Ruby and cache strings at the first level.
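One way to picture the two-level idea is the sketch below. To be clear about the assumptions: it swaps in `format` with `%{...}` placeholders for the second level, which may well differ from what dot.rb really does, and `PAGE_CACHE`, `page_template` and `render_page` are names invented for the example.

```ruby
PAGE_CACHE = {}

# Double literal % signs so they survive the second pass through format.
def keep_percent(text)
  text.gsub('%', '%%')
end

# Level one: interpolate everything that depends only on the article and
# cache the result. Per-request slots stay behind as %{...} placeholders.
def page_template(article)
  PAGE_CACHE[article[:id]] ||= <<~HTML
    <html>
      <head><title>#{keep_percent(article[:title])}</title></head>
      <body>
        <p class="greeting">Hello, %{visitor}</p>
        <h1>#{keep_percent(article[:title])}</h1>
        <div>#{keep_percent(article[:body])}</div>
        <p class="footer">Generated at %{time}</p>
      </body>
    </html>
  HTML
end

# Level two: fill in the per-request values on every hit.
def render_page(article, visitor:, time:)
  format(page_template(article), visitor: visitor, time: time)
end

article = { id: 1, title: '100% Ruby', body: 'No DTML was harmed.' }
puts render_page(article, visitor: 'alice', time: 'noon')
```

The expensive part (walking the article and its comments) happens once per article, while each request only pays for one cheap substitution pass.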
I lightly process all the site articles up front, compute interesting stuff like previous and next links (present Dot only computes those for the 10 most recent articles, dot.rb does it for all articles) and keep a table of the skeletons around. If you think that would kill startup time, it doesn't really. dot.rb still loads in a fraction of the time that it takes Zope to boot up. If you think it kills memory, nope, doing OK.
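Computing the previous and next links for every article is a single pass over the list, which is part of why it doesn't hurt startup. A minimal sketch, assuming articles arrive oldest first as hashes with `:id` and `:title` keys; the `Skeleton` struct and its fields are hypothetical.

```ruby
# A skeleton holds just enough to link to and list an article.
Skeleton = Struct.new(:id, :title, :prev_id, :next_id)

# Build the skeleton table in one pass over all articles, oldest first.
def build_skeletons(articles)
  table = {}
  articles.each_with_index do |article, i|
    prev_article = i.zero? ? nil : articles[i - 1]
    next_article = articles[i + 1]   # nil past the end of the array
    table[article[:id]] = Skeleton.new(
      article[:id],
      article[:title],
      prev_article && prev_article[:id],
      next_article && next_article[:id]
    )
  end
  table
end

articles = [
  { id: 10, title: 'First post' },
  { id: 11, title: 'Second post' },
  { id: 12, title: 'Third post' }
]
skeletons = build_skeletons(articles)
puts skeletons[11].prev_id   # prints "10"
puts skeletons[11].next_id   # prints "12"
```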
To make a long story short, I'm basically making an educated guess about typical Dot reading patterns and optimising for that. This kind of optimisation isn't really feasible in Squishdot since it sits on top of the huge layers of abstraction that are Zope.
Of course, I haven't solved the problem of searching. Stuff like searching on Authors, Titles and Categories should be quite easy and will already be quite useful since this is a function Google cannot readily do for us. What still bothers me is searching the full article and comment bodies for content. I have some ideas, and no, my hierarchical filesystem isn't likely to pan out for this particular case... I will probably need to build an index of some kind or else I could tap Google search for hints and home in on the search.
Not that searching is working very well for present Dot anyway. Basic searches like the aforementioned ones work somewhat erratically; others kill the server. Having that Search box on the site-wide footer of the Dot right now is all but meaningless when the thing doesn't work. I guess it's comforting to have it there and it does help maintain the illusion that we have search.
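For what "an index of some kind" could mean in its simplest form, here is a sketch of a plain inverted index. Nothing like this exists in dot.rb yet; `build_index` and `search` are hypothetical names, the tokenising is deliberately naive (lowercase ASCII words only), and a real version would need to handle updates and persistence.

```ruby
require 'set'

# Map each lowercase word to the set of document ids that contain it.
def build_index(documents)
  index = Hash.new { |hash, word| hash[word] = Set.new }
  documents.each do |id, text|
    text.downcase.scan(/[a-z0-9]+/).each { |word| index[word] << id }
  end
  index
end

# Return the ids of documents that contain every word in the query.
def search(index, query)
  words = query.downcase.scan(/[a-z0-9]+/)
  return [] if words.empty?

  words.map { |word| index.fetch(word, Set.new) }.reduce(:&).to_a.sort
end

documents = {
  1 => 'Ruby is fun',
  2 => 'Zope and DTML',
  3 => 'Ruby replaces Zope'
}
index = build_index(documents)
p search(index, 'ruby')        # prints [1, 3]
p search(index, 'Ruby Zope')   # prints [3]
```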
Phase 1 of dot.rb, which is rendering and viewing, is basically done minus search. Phase 2 will be to implement actual posting, which I anticipate to be fairly easy given all that's already been done. Phase 3 will be to implement some sort of management interface for the editors and will probably be slightly tricky... some of those premature optimisations might just come back and bite me.
Sadly, all of this is going to have to wait. Ruby is way too addictive and I need to spend a month or three away in detox, for my own good. Also, I need to get away from some of those crazy people. Hopefully these issues will be addressed in Ruby 2.0.
Did I mention Korundum?
