Defending my Internet Archive uploading methods


I have now received four rude emails and two comments with explicit language (which I've deleted), along with a friendly email and a comment here, all asking why I decided NOT to use an automated script, written in my beloved Ruby or Perl for example, to do my Internet Archive uploading. There are several reasons, but the primary one stems from this simple fact:

You cannot automate a system to pick up and transfer data that doesn't exist.

The problem is, up until the beginning of last year I didn't tag any episodes. The Internet Archive requests tags, and I figure that because they're providing the space and bandwidth gratis, I should provide them. There really is no automated way to generate tags; only humans can tell what the subject matter is and provide appropriate tags. If computers could accurately generate tags, the Google Image Labeler project wouldn't exist!

I could probably use the Advanced Contribution Engine system and upload episodes en masse, but that simply doesn't address the underlying problem. I could spend my time writing scripts to write XML files and mass upload them, but until I've manually written the tags for each episode and posted them on my own site first, none could be uploaded. Once I've created the tags and have all the data I need, the effort it takes to upload files individually is minimal. I may as well do the two operations in tandem rather than wait until I've done everything on one end, then execute a script that could potentially cause cascading problems… for 262 uploaded media files, each with its own associated files!
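To make the point concrete, here is a minimal sketch in Python of what a metadata-writing script would run into. Everything in it is an assumption for illustration: the episode dictionaries, the function names, and the XML element names are hypothetical, not the Internet Archive's actual format or anything I've written. It only shows that a script can generate XML for episodes that already have tags, and has nothing to work with for the rest.

```python
import xml.etree.ElementTree as ET


def build_metadata_xml(episode):
    """Return metadata XML for one episode, or None if it has no tags yet.

    The element names (metadata, title, subject) are illustrative only.
    """
    tags = episode.get("tags") or []
    if not tags:
        # No human has written tags for this episode, so there is
        # nothing for the script to generate.
        return None

    root = ET.Element("metadata")
    ET.SubElement(root, "title").text = episode["title"]
    for tag in tags:
        ET.SubElement(root, "subject").text = tag
    return ET.tostring(root, encoding="unicode")


def split_by_readiness(episodes):
    """Separate episodes that could be uploaded from those still blocked."""
    ready, blocked = [], []
    for episode in episodes:
        xml = build_metadata_xml(episode)
        if xml is None:
            blocked.append(episode["title"])
        else:
            ready.append((episode["title"], xml))
    return ready, blocked


# Hypothetical sample data: one untagged episode, one tagged episode.
EPISODES = [
    {"title": "Episode 001", "tags": []},
    {"title": "Episode 262", "tags": ["podcast", "technology"]},
]

if __name__ == "__main__":
    ready, blocked = split_by_readiness(EPISODES)
    print("Ready to upload:", [title for title, _ in ready])
    print("Still waiting on tags:", blocked)
```

However clever the upload half of such a script is, the untagged episodes stay in the blocked list until someone sits down and writes their tags by hand.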

ASIDE: Jim Kloss, the scripting wizard behind Whole Wheat Radio, once said on air that throwing technology at a problem doesn’t automatically make it faster or easier to solve; in many cases you just make it more complicated. This is one of those cases.

I'm not pretending that this won't take a bit of extra time, but I feel this is the best way to do it. I just wish I had been able to spend my time uploading shows rather than defending what I'm doing, sheesh ;-).

Author bio and support


Ruben Schade is a technical writer and infrastructure architect in Sydney, Australia who refers to himself in the third person. Hi!
