Supporting (very!) legacy systems
(I’m AFK today, so I tidied up a half-baked blog post from my perennial drafts folder. It’s still not really fleshed out the way I’d usually like, but maybe there’s something there.)
When a tired IT worker suggests someone turns it off, then on again, it’s because it’s shocking how often this works. Any functional programmer worth her salt will tell you how bad state is, and sometimes things enter a weird, unknown, or unplanned state that a simple reboot can clear and fix.
The same can be said for servers. A machine acting weird can be interrogated and troubleshot, but if you’ve architected your infrastructure properly with redundancy, you can kick one box to clear an error state without taking (much) stuff offline, at worst briefly reducing redundancy or performance. It literally comes down to a cost/benefit analysis half the time.
Except, we all know this isn’t always tenable. Machines may be sufficiently borked (the technical term) to require a rebuild or complete replacement. But this is where the consummate IT professional runs headfirst into the squishy real world.
My favourite adage from high school economics was “limited resources, unlimited wants”. It’s not just about allocation of limited finances, but of time, energy, and attention. A sysadmin’s want list is likely to differ significantly from what a manager wants, or the NOC wants, or a client wants. This struggle is worth a post in itself, and based on my audience, I’ll bet you have your own stories!
But nowhere is this more acutely felt than with the legacy machine, which strikes fear in the hearts of sysadmins everywhere. These are devices with years of cruft, hacks, tape, and hot fixes that have been carried forward long after their expected service life has passed, or long after the sysadmin would have preferred them replaced. Their vendor might have even placed them under extended support, or finally abandoned them. How was it built? Who built it? When? Who knows. Just don’t question its importance; it must be kept online, damn it!
Some systems are designed to truck along for years, or happily defy the odds and do so with little input. I worked on an old process control system in the mid-2000s that was running the same embedded version of DOS it shipped with in the late 1980s. There are still COBOL systems running many of the world’s largest banks, with reconciled text files shared over plain old FTP.
But I’m more interested in the unglamorous old boxes sitting in cupboards, or in a lonely rack somewhere. The people who are forced (or otherwise compelled!) to keep these things online are nothing short of superheroes. I’ll bet every fibre of their being is screaming to replace them, perhaps with something newer or simpler. Yet there they are, answering the page at 03:00 that it’s gone down.
I’m not sure what percentage of the modern world rests atop these machines, and the shoulders of those who maintain them, but I’d wager it’s a lot. All the more reason to hug your sysadmin.