The Titanic

I've been blogging for long enough to have suffered a couple of disastrous data losses, but only once have I irreversibly lost stuff. Consequently I back up my blogs daily, but on a hunch early this morning I decided to test my backups on a local installation of WordPress. I choked!

My server is configured with a couple of cron jobs that produce these backups every evening, each of which then gets the bzip2 treatment:

  • An SQL query
  • A WordPress RSS/WXR file
  • A grilled cheese sandwich with avocado and gherkins
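
For the curious, the bzip2 step plus a quick sanity check can be sketched in a few lines of Python. This is only a rough illustration, not the actual cron job: the function names and file paths are made up, and it assumes the dump has already been written to disk.

```python
import bz2
import shutil


def compress_backup(src_path, dest_path):
    """Write a bzip2-compressed copy of src_path to dest_path."""
    with open(src_path, "rb") as src, bz2.open(dest_path, "wb") as dest:
        # Stream in chunks so a large dump never has to fit in memory.
        shutil.copyfileobj(src, dest)


def verify_backup(src_path, bz2_path):
    """Return True if the archive decompresses to exactly the original bytes."""
    # Reads both files fully into memory, which is fine for blog-sized dumps.
    with open(src_path, "rb") as src, bz2.open(bz2_path, "rb") as packed:
        return src.read() == packed.read()
```

Calling compress_backup("blog.sql", "blog.sql.bz2") and then verify_backup on the same pair (hypothetical file names) only proves the archive is readable. It says nothing about whether the contents will actually reimport, which is the part that bit me.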

The SQL backup is fairly vanilla stuff, and always works, as one would expect. WordPress's automatically generated RSS/WXR files are much easier to work with, but repeated painful experience over many years has taught me they're unreliable as heck. Perhaps I should phrase that to say "easier to work with… when they work"!

You’re telling us this… why?

I belabor all this because when I tested the WXR files my server has been exporting lately, they don't include categories when you reimport them into vanilla WordPress installs. None. Nada. Zippo. This despite setting the PHP memory ceiling higher as I talked about before and splitting up the WXR files as recommended by various folks.
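
A quick way to tell whether the categories are missing from the export itself or getting lost on import is to count them in the file. Here's a minimal sketch with Python's standard XML parser. It assumes the usual WXR layout as I understand it: category definitions as category elements in the WordPress export namespace, and per-post assignments as plain category elements with domain="category". The function name is made up, and the layout may differ between WordPress versions.

```python
import xml.etree.ElementTree as ET


def count_wxr_categories(xml_text):
    """Return (declared, assigned) category counts for a WXR document.

    declared: category elements in the WordPress export namespace
    assigned: un-namespaced category elements on posts whose domain is
              "category" (or missing, as in older exports)
    """
    root = ET.fromstring(xml_text)
    declared = 0
    assigned = 0
    for elem in root.iter():
        tag = elem.tag
        # Namespaced tags look like "{namespace-uri}localname" in ElementTree.
        if tag.startswith("{") and "wordpress.org/export" in tag and tag.endswith("}category"):
            declared += 1
        elif tag == "category" and elem.get("domain", "category") == "category":
            assigned += 1
    return declared, assigned
```

If this reports (0, 0) for an export of a blog that definitely has categories, the export is the culprit. If the counts look sane, the importer is dropping them.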

Despite being introduced years ago, WXR is still horribly broken. As far as I can tell from trudging through the source, WordPress doesn't even use an XML parser when importing these files. I suppose that's another reason why I don't use sites like WordPress.com: when things like this mess up, I can always access the database directly. It's also why on smaller projects I use SQLite3 databases, which you can cheat on and back up by just copying over one file! Ah, and I'm nostalgic for sqlplus already ;).
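
One caveat on the copy-one-file trick: a plain copy can catch the database mid-write if something happens to be using it at that moment. A slightly safer variant, sketched here with Python's built-in sqlite3 module (the function name and paths are made up for illustration), goes through SQLite's own backup mechanism instead:

```python
import sqlite3


def backup_sqlite(src_path, dest_path):
    """Copy a SQLite database to dest_path using SQLite's online backup API."""
    src = sqlite3.connect(src_path)
    dest = sqlite3.connect(dest_path)
    try:
        # Connection.backup (Python 3.7+) copies every page of the source
        # database into the destination connection.
        src.backup(dest)
    finally:
        dest.close()
        src.close()
```

The result is still a single ordinary database file, so restoring is the same "just copy it back" affair.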

What do you guys do for data backups for your online stuff?