Happiness is a stack of new interesting computer books!

If you've been on the net for a while, chances are your development history roughly follows mine! I've had people asking about my Perl script that I've lovingly referred to here over the years, so here's something :).

Ah the innocence of youth…

Before we go any further, yes that was my bedroom back in Singapore. I still have that lava lamp, that MacBook Pro and my beloved O'Reilly books :)

After graduating from GeoCities and the like to my own web server, I wrote a fairly simple Perl CGI script to run my site before finally moving over to dedicated CMSs including RapidWeaver, MediaWiki, WordPress and TextPattern. Since then, I've dabbled in Ruby/eRuby/eRubis, Ruby on Rails, Django and PHP, though admittedly my biggest strengths still lie in the back end with running and building VPSs or uni servers with FreeBSD, CentOS, Solaris, Apache, lighttpd, *SQL and the like.

While not the most high-performing or scalable system, my Perl CGI script was a lot of fun! In a nutshell, it worked like this:

  1. Each of my blog posts was stored as a text file, with the name of the file corresponding to its stub, such as “shimapan.txt”. The files themselves were simply formatted as post name, date, category and content, with newlines separating each. No fancy markdown, just HTML sans paragraph tags!

  2. When a page was requested, the script would attempt to open its corresponding file. If this operation failed, a 404.txt file was returned in its place.

  3. The script would pull the text, wrap each paragraph in <p> tags, open a rudimentary “template” text file and insert it between two HTML comments.

  4. For example, when using the teal Hatsune Miku theme, going to //rubenerd.com/?get=shimapan would put shimapan.txt into the mikutheme.txt file, then display a result any otaku would be proud of.
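
The steps above can be sketched roughly as follows. This is Python rather than the original Perl, and purely an illustration: the function names, the two marker comments, the blank-line paragraph rule and a 404.txt laid out like a normal post are all assumptions, since the real script isn't shown here.

```python
from pathlib import Path

# Hypothetical marker comments; the post only says "two HTML comments".
START = "<!-- content-start -->"
END = "<!-- content-end -->"


def parse_post(text):
    """Split a post into name, date, category and body (first three lines, then the rest)."""
    name, date, category, body = text.split("\n", 3)
    return {"name": name, "date": date, "category": category, "body": body}


def wrap_paragraphs(body):
    """Wrap each blank-line-separated chunk in <p> tags (assumed paragraph rule)."""
    chunks = [chunk.strip() for chunk in body.split("\n\n")]
    return "\n".join(f"<p>{chunk}</p>" for chunk in chunks if chunk)


def render(stub, theme, root=Path(".")):
    """Return the themed HTML for a stub, falling back to 404.txt if it can't be opened.

    Assumes the stub has already been checked for unsafe characters.
    """
    try:
        text = (root / f"{stub}.txt").read_text(encoding="utf-8")
    except OSError:
        text = (root / "404.txt").read_text(encoding="utf-8")
    post = parse_post(text)
    template = (root / f"{theme}.txt").read_text(encoding="utf-8")
    # Keep everything before the first marker and after the second one,
    # and drop whatever placeholder text sat between them.
    head, _, rest = template.partition(START)
    _, _, tail = rest.partition(END)
    return head + START + "\n" + wrap_paragraphs(post["body"]) + "\n" + END + tail
```

With these assumptions, `render("shimapan", "mikutheme")` would read shimapan.txt and mikutheme.txt from the current directory and return the finished page as a string.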

It just occurred to me this was also coded before Miku's time. Or even Akiyama Mio. I should have made a joke about Haruhi Suzumiya and bunny girls instead. You live and learn.

Processing and such fun

I eschewed (gesundheit) the CGI.pm module, mostly because I found it easy enough to create the standard "Content-type: text/html" and other headers myself. I was (and still am) a fan of minimalism, and my C training at the time made me averse to using third-party modules and APIs, especially when it was easy enough to do something myself.
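
For the curious, writing those headers by hand really is a small job: a CGI script prints its header lines, then a blank line, then the page. A minimal sketch in Python (not the original Perl; `cgi_response` is a made-up helper name):

```python
import sys


def cgi_response(body, content_type="text/html"):
    """Build a CGI response: the header line, a blank line, then the body."""
    return f"Content-type: {content_type}\r\n\r\n{body}"


if __name__ == "__main__":
    # The web server reads whatever the script writes to standard output.
    sys.stdout.write(cgi_response("<p>Hello from a hand-rolled CGI script</p>"))
```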

The theme files and the posts themselves were uploaded to the server using SVN; they were "committed". To this day, I still think that was a really nifty way to do it. I've seen some people doing that with Git now.

Not showing any more than this, sheesh!

Learning from bad example?

I learned a lot from that site. Shortly after going live, I noticed thousands of hits to the 404.txt file. Obviously, bots had found my URL, and were filling random junk into it. I wrote in some rules that a stub could only be up to 24 characters long, and only contain alphanumeric characters. Any requests that didn't meet those criteria were automatically dropped.
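
A check like that fits in a couple of lines. Here is a sketch in Python rather than the original Perl; the names are hypothetical, and treating "alphanumeric" as ASCII letters and digits only is an assumption.

```python
import re

# One to 24 characters, letters and digits only (ASCII-only is an assumption).
STUB_RE = re.compile(r"[A-Za-z0-9]{1,24}")


def is_valid_stub(stub):
    """Return True only if the whole stub matches the rule above."""
    return STUB_RE.fullmatch(stub) is not None
```

Because dots and slashes never pass, a rule like this also stops requests such as `?get=../../etc/passwd` from being turned into a file path.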

I also learned about FastCGI. Unbeknownst to me in my fresh-out-of-high-school innocence, each time someone accessed my poor web server, Apache was spawning an entirely new CGI process to handle their request. Admittedly my site only started to get traction long after I ditched the script and moved onto dedicated CMSs, but had I kept using it I'm sure I would have been in for a nasty surprise!

There were also other performance issues. I never quantified how much of a hit my server was taking by having to open and close theme and data files each time, though I suppose I could have written a simple SQLite backend and compared it.

I also wasn't doing any caching; ideally once a post was completed and the theme set in stone, the script could have generated pages upon the post being uploaded, and served those instead. IIRC, this is how Movable Type operated back in the mean old days. There weren't any dynamic parts to the site such as comment systems, tags and so on, so the generated pages would only need to be rebuilt if the theme changed.
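
That idea can be sketched as a small build step that only regenerates a page when its post or the theme is newer than the existing output. Again this is Python, not anything the original script did; the directory layout, the function names and the `render_page` callback are all assumptions.

```python
from pathlib import Path


def needs_rebuild(output, sources):
    """True if the output is missing or older than any of its source files."""
    if not output.exists():
        return True
    built = output.stat().st_mtime
    return any(src.stat().st_mtime > built for src in sources)


def build_site(posts_dir, theme_file, out_dir, render_page):
    """Regenerate stale pages and return the stubs that were rebuilt.

    render_page(post_text, theme_text) is a caller-supplied function
    that returns the finished HTML for one post.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    theme_text = theme_file.read_text(encoding="utf-8")
    rebuilt = []
    for post in sorted(posts_dir.glob("*.txt")):
        output = out_dir / f"{post.stem}.html"
        if needs_rebuild(output, [post, theme_file]):
            html = render_page(post.read_text(encoding="utf-8"), theme_text)
            output.write_text(html, encoding="utf-8")
            rebuilt.append(post.stem)
    return rebuilt
```

Run after each upload, this would leave the web server with plain HTML files to hand out, and a theme change would simply mark every page as stale.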


Why am I posting about this, all these years later? Because after putting together a fairly basic Ruby on Rails install, playing with Django for a university assignment last semester, and my obsessive Wikipedia editing, I've decided to trial running Rubenerd.com off TextPattern! But as all infuriating entries end: that's for another post.