New web content engine, part 8

A brief recap

I’m realizing that I need to get Rm together and released while the iron’s hot. See, the idea of Rm predates CouchDB, but if I release it too long after CouchDB gets buzz, everybody will think I just stole some ideas from that.

The basic idea behind Rm comes from my longstanding experience with Lotus Notes (I’ve had Notes on my resume since 2001, although I prefer to minimize that knowledge because I don’t want to be asked to support other people’s hacked-up Notes applications as a day job). I’ve always thought that Notes had a fairly interesting core, with layers of antiquated crap wrapped around it. So my basic idea was that a document-centric database was a useful alternative for cases where SQL wasn’t quite right. This rolled around in the back of my head for a while. I had read up on PostgreSQL several times and privately pronounced it a cool idea, but eventually my Windows-based hosting company died, so I decided to use PostgreSQL when I set up my Linux server.

I read through the documentation and realized that there were absolutely nifty features in PostgreSQL, but that nobody was really using them. And I started to realize that I could make these features do interesting things that you just can’t do in a normal database. So I could actually write a real document-centric object-oriented-ish database like Zope or Notes without actually writing a complete database.

Now, I wrote this with the goal of solving problems in progressively more complicated order. Problem number one is that I wanted a better blogging framework for my site, one that would let me update things easily.

I worked on this for a stretch in 2005 and then put the code away for a while. When I got back to it in 2007, after having spent a bunch of time on Flickr (which I got on after a conversation about tagging with a friend), I realized that, while most of the model was sound, there were a few things that history was starting to ignore, so I refactored those things and then concentrated on getting it up and running as fast as I could.


User interface design is often defined not by what’s there, but what isn’t.

If there’s an option for comments, depending on how you write your articles, you may or may not attract people’s comments… but because the option is always there, you are allowing yourself to potentially attract them, even if you don’t actively seek comments out. On the other hand, if you don’t have comments enabled, you are guaranteed not to get any, which changes the feel of a site. For example, historically I have not bothered adding comments to my blog because I wanted to avoid making things look like LiveJournal.

Also, having a powerful commenting system set up for a site that gets very few comments makes the site look rather lame.

Eventually I decided that I needed comments, so I wrote a quick one-off commenting system. I’ve cut up those pieces of code and some of the thinking and added the functionality here.

Getting your fundamentals really right can be quite useful. I sat down with my new commenting system and prodded it for HTML, XSS, and SQL injection bugs and realized that it was all handled already.
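To give a feel for what “handled already” means on the HTML side, here’s a minimal sketch of output escaping. This is not Rm’s actual code: the `escape_html` name and the `HTML_ESCAPES` table are made up for illustration. The point is only that if every piece of user-supplied text goes through one function like this on its way out, a comment containing a script tag shows up as text instead of running.

```ruby
# Hypothetical helper, not Rm's real code: escape the five characters
# that matter in HTML text and attribute values.
HTML_ESCAPES = {
  "&" => "&amp;",
  "<" => "&lt;",
  ">" => "&gt;",
  '"' => "&quot;",
  "'" => "&#39;"
}.freeze

def escape_html(text)
  # One gsub pass over the input, so a freshly inserted "&amp;"
  # is never escaped a second time.
  text.to_s.gsub(/[&<>"']/, HTML_ESCAPES)
end

comment = %q{<script>alert("hi")</script>}
puts escape_html(comment)
```

The SQL side is the same idea in a different place: user input gets passed as bound parameters rather than pasted into the query string.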

Syndication win

So, one of the advantages of splitting the formatting side and the internal logic is that you can go outside of your own tool easily. I wanted to better syndicate my other content on the web, so I pulled in Atom feeds and used the same formatting logic for those feeds as I use for bunches of internal content.

I still need to look at a feed normalizer that can turn a wide variety of weird nonstandard feeds into a well-formed Atom feed, however.

Some thoughts on what’s next

The codebase has been growing of late and I’d like to shrink it. There are a lot of quick hacks in there just so that I could get something working, which now means that I’m going to have to rework some of that to be more uniform. I realized, after playing with Rails, that there’s some helpful scaffolding I can write to eliminate a lot of duplicated code.

There’s a lot of helpful scaffolding right now, of course. I spent some time writing a setup tool that will build the database for you based on how you have configured things and have moved most of the configuration parameters into YAML files.
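As a rough sketch of what YAML-driven configuration can look like, here’s a tiny loader that reads settings and fills in defaults. Everything in it is an assumption for illustration: the keys (`database`, `cache`), the default values, and the `load_config` name are not Rm’s real settings or API.

```ruby
require "yaml"

# Hypothetical configuration: none of these keys are Rm's real settings.
CONFIG_TEXT = <<~CONFIG
  database:
    name: rm_blog
    host: localhost
  cache:
    enabled: true
CONFIG

# Hypothetical defaults for anything the file leaves out.
DEFAULTS = { "database" => { "host" => "localhost", "port" => 5432 } }.freeze

def load_config(text)
  # An empty document parses to nil or false, so fall back to an empty hash.
  settings = YAML.safe_load(text) || {}
  # Values from the file win over the defaults in the "database" section.
  db = DEFAULTS["database"].merge(settings.fetch("database", {}))
  settings.merge("database" => db)
end

config = load_config(CONFIG_TEXT)
puts config["database"]["name"]
puts config["database"]["port"]
```

A setup tool can then walk a hash like this to decide what to create, instead of having the values scattered through the code.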

I need to get rid of REXML. Like I said earlier, it’s very slow. I’ve managed to push off most performance problems by adding a cache for the fully generated pages, which is a very useful thing, but eventually I’m going to run out of ways to use that.
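For the curious, the shape of a whole-page cache is roughly this. It’s a minimal in-memory sketch, assuming pages are keyed by path; the `PageCache` class and its method names are hypothetical, and Rm’s actual cache may work quite differently.

```ruby
# Hypothetical page cache: names are illustrative, not Rm's actual API.
class PageCache
  def initialize
    @pages = {}
  end

  # Return the cached page for +key+; on a miss, render it with the
  # given block and remember the result.
  def fetch(key)
    @pages.fetch(key) { @pages[key] = yield }
  end

  # Drop a page so the next request renders it again,
  # e.g. after the underlying document changes.
  def expire(key)
    @pages.delete(key)
  end
end

cache = PageCache.new
renders = 0
2.times do
  # The slow XML formatting work would live inside this block.
  cache.fetch("/articles/1") { renders += 1; "<html>article 1</html>" }
end
puts renders
```

The slow rendering only runs on a miss, which is why a cache like this hides the XML parsing cost right up until something has to be rendered fresh.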

I’m fairly happy with the development flexibility that I get by using the XML formatting pipeline the way it is, so I’m not really too interested in rewriting it.

There are also a few ideas that I really want to implement sooner rather than later. So the other goal is to finish these features up, migrate all of my web content over to using Rm, and start finding more users who are interested in being “seeded” with a fairly early version of Rm.