The Five Whys in IT, and finding Rachelbythebay


Have you ever found yourself with a blown lightbulb, but:

  • you need a ladder to replace it
  • but it’s buried under boxes
  • but to move the boxes you need to clear space
  • but you can’t because you haven’t vacuumed
  • but you can’t because the vacuum cleaner bag is full
  • …and you don’t have replacements?

Eventually your partner asks why you haven’t replaced the lightbulb, and there you are under your computer desk crimping a new Ethernet cable, saying yes, that’s what I’m doing!

This happens a shocking amount of the time in my life, professionally and personally. You notice a loose thread on your jumper, and pretty soon you’ve unravelled the entire thing in a mess of cotton and regret. It was a shame too, because you really liked that jumper, and it’s suddenly very cold in here.

🌲 🌲 🌲

Industry actually has a name for this style of thought process: the Five Whys. It was drilled into me in my SCADA days, but Rachelbythebay reminded me of it in this brilliant post from February.

More than five whys and “layer eight” problems

The Five Whys is a process for tracing a problem to its root cause by asking why a prescribed number of times. Some advocates say it’s only useful if you limit yourself to a strict number like five, but it can often take many more. Not to get all Malcolm Gladwell on you, but it turns out that definitive number can be hard to pin down.
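If it helps to see the mechanics, here’s a rough sketch in Python using the lightbulb saga from earlier. The names trace_whys, CAUSES, and START are made up for illustration, and it assumes each problem has exactly one recorded cause, which real incidents rarely oblige.

```python
# Hypothetical cause map: each problem points at the answer to "why?".
# The entries are the lightbulb example from this post, nothing more.
CAUSES = {
    "the lightbulb has not been replaced": "the ladder is buried under boxes",
    "the ladder is buried under boxes": "there is no clear space to move the boxes into",
    "there is no clear space to move the boxes into": "the floor has not been vacuumed",
    "the floor has not been vacuumed": "the vacuum cleaner bag is full",
    "the vacuum cleaner bag is full": "there are no replacement vacuum bags",
}

START = "the lightbulb has not been replaced"


def trace_whys(problem, causes, limit=None):
    """Ask "why?" repeatedly, returning the chain from problem to deepest cause.

    Stops when no further cause is recorded, when a cause repeats (a loop),
    or after `limit` whys if a limit is given.
    """
    chain = [problem]
    seen = {problem}
    while limit is None or len(chain) - 1 < limit:
        cause = causes.get(chain[-1])
        if cause is None or cause in seen:
            break
        chain.append(cause)
        seen.add(cause)
    return chain


if __name__ == "__main__":
    for depth, step in enumerate(trace_whys(START, CAUSES)):
        prefix = "Problem:" if depth == 0 else f"Why #{depth}:"
        print(prefix, step)
```

Calling trace_whys(START, CAUSES, limit=3) gives up at the unvacuumed floor, which is the trouble with a strict number: the chain ends wherever you stop asking, not necessarily at the root.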

But here’s the galaxy-brain realisation: in my experience, if you ask enough times, almost everything ends up being caused by a whom, not a what. This is an entirely different kettle of fish, with its own set of challenges. A question about why a package wasn’t updated may eventually lead to management rejecting a funding proposal, which is outside the purview of an employee to address. This sucks, because guess who gets ascribed blame! Rachel touches on scapegoating in her post too.

On the other side, you may have people blinkered by the technical issues who refuse to acknowledge the human element at all, or don’t even consider it. Once you recognise this antipattern, you see it everywhere in forums, technical Q&A sites, and on social media. An example might be: it was leaked, because it was sent using unsecured email, because they weren’t using PGP, because the person was… stupid for not knowing how to use it? Yes that’s it, solved.

Latacora: The PGP problem

Rachel’s whole post is worth a read, but she nails it here, emphasis added:

… trying to roll back through the series of actions (or lack of actions) to see how things happened, and then trying to do something about it. The problem is that if you do this long enough, eventually the problems start leaving the tech realm and enter the squishy human realm …

In some situations, you come to realize that a whole bunch of bad things happen due to non-technical causes, and they are some of the hardest things that you might ever need to remove from an organization.

She concludes:

I guess this is my way of warning anyone who fancies themselves a troubleshooter and who really, truly, wants to get to the bottom of things. If you do this long enough, expect to start discovering truly unsatisfying situations that cannot be resolved.

I remember the physics teacher at our school half-joking that chemistry and biology were easier than his subject, because they dealt with the “real world”. This is backwards in IT. Life would be easier if every problem could be reduced to a malformed SQL query, or a forgotten semicolon in a header file.

Meatspace has misunderstandings, malevolence, motives, and money.

Author bio and support


Ruben Schade is a technical writer and infrastructure architect in Sydney, Australia who refers to himself in the third person. Hi!
