When CDNs and DDoS vendors go offline

Everyone notices when a content delivery network or distributed denial of service protection vendor goes offline, because they take half the modern web with them. Much of the world’s Internet traffic is transmitted and delivered by just a handful of these vendors.

For a global network originally designed by the US military for resiliency, our current situation seems ridiculous. Why would everyone put their trust in just a few players like this? Is it ignorance? Penny pinching? Bad design?

Content distribution

Prior to large streaming platforms, I’d argue BitTorrent was the most widespread, reliable, and cost effective way for most people to get video and audio online. The protocol meant that no one system shouldered the burden or responsibility of distributing content to every user; provided a full copy could be assembled, the network was resilient to outages. It’s an elegant, robust solution that worked for millions of people.
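As a rough sketch of that "provided a full copy could be assembled" condition, here is a toy model in Python. It isn't how a real BitTorrent client tracks pieces; the peer names, piece counts, and the `missing_pieces` helper are all made up for illustration.

```python
# Hypothetical swarm: each peer holds some subset of a file's pieces.
def missing_pieces(total_pieces, peers):
    """Return the piece indexes that no online peer can supply."""
    available = set()
    for pieces in peers.values():
        available |= pieces
    return set(range(total_pieces)) - available

swarm = {
    "peer_a": {0, 1, 2},
    "peer_b": {2, 3},
    "peer_c": {3, 4, 5},
}

# Every piece is held by someone, so a full copy can be assembled.
assert missing_pieces(6, swarm) == set()

# peer_b going offline changes nothing: pieces 2 and 3 live elsewhere too.
without_b = {name: p for name, p in swarm.items() if name != "peer_b"}
assert missing_pieces(6, without_b) == set()

# Losing peer_c is different: nobody else holds pieces 4 and 5.
without_c = {name: p for name, p in swarm.items() if name != "peer_c"}
assert missing_pieces(6, without_c) == {4, 5}
```

The more peers that hold overlapping pieces, the fewer single outages matter, which is the resilience the client-server model gives up.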

Media companies don’t like this. They want to retain:

  • ultimate control of the source files, so that local copies can’t (as easily) be pirated or redistributed, and so that ads can be injected; and

  • control over how people access them, including the ability to limit and revoke access if they need or want to.

Whether it was required contractually for licensing deals, or because they were spooked by the rampant piracy BitTorrent facilitated, streaming gave them a solution that satisfied those two criteria.

It’s a classic example of a meatspace limitation being imposed on digital architectures. Without the distributed advantage of protocols like BitTorrent, the client-server streaming model relies on massive servers and pipes to work, which few providers can deliver or maintain at that scale. The ultimate irony from an architecture perspective is that these media companies now routinely sneakernet drives to large ISPs in different countries to help them locally cache and deliver content.

CDNs apply that principle to every site, but for economic reasons. With a CDN, you don’t need your own physical point-of-presence on every continent to deliver your assets. Performance is a key metric people use to validate your service, and modern web users have a low tolerance for latency and lag. Large data centre providers and cloud platforms will then “peer” with these CDNs, so that traffic can flow directly between them. It’s a reality that if you operate outside a CDN (as I do for all my stuff), the perception of your site’s performance will likely suffer depending on where your visitors are.
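To put a rough number on why distance matters, here is a back-of-the-envelope calculation in Python. Light in optical fibre travels at roughly two thirds of its speed in a vacuum; the two distances are hypothetical, and real routes add routing, queuing, and server time on top of this floor.

```python
# Light in optical fibre covers about 200,000 km per second,
# roughly two thirds of its speed in a vacuum.
FIBRE_KM_PER_SECOND = 200_000

def min_round_trip_ms(distance_km):
    """Lower bound on round-trip time over fibre, ignoring routing and queuing."""
    return 2 * distance_km / FIBRE_KM_PER_SECOND * 1000

# Hypothetical distances: a server on another continent versus a nearby cache.
far_origin_ms = min_round_trip_ms(15_000)
nearby_cache_ms = min_round_trip_ms(50)
print(f"far origin: {far_origin_ms:.1f} ms, nearby cache: {nearby_cache_ms:.1f} ms")
```

That works out to about 150 ms per round trip for the distant server against about half a millisecond for the nearby cache, and a page load usually needs several round trips before anything appears.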

The turnkey, low maintenance, and relatively cost effective solution then locks platforms in, which means they’ll probably continue using CDNs even once they’re at a scale where they could roll their own global distribution system. If you’re a beancounter allocating resources between reinventing the wheel and adding features, which would you choose?

The Internet was designed to be a robust network of peers that can route around damage. This model doesn’t break OSI, but its lopsided nature introduces brittleness, which is all too often on public display.

What about DDoS protection vendors?

More of the general public now know names like Cloudflare and Fastly from their proxied forwarding pages (“this site is protected by XYZ”), and from when tired NOC engineers go to social media to explain that their site is offline because of such services.

Unfortunately, the architecture of the modern web makes their use all but necessary.

The Internet was designed with an implicit level of trust between nodes. It assumed people wouldn’t spoof their IP headers, read your cleartext communications, or perform too much mischief. Amplification attacks and rentable botnets now make DDoS attacks a regular fact of modern sysadmin life, and depressingly easy and affordable to perform.

There are no effective mitigations. You can’t feasibly block the source of a distributed botnet attack. Paying protection money, or giving in to ransom demands by DDoS attackers, only emboldens them, and may be illegal. In layman’s terms, the only way to survive is to hope you have a bigger pipe than the attacker can saturate. And again, there aren’t many choices at that scale.
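The "bigger pipe" arithmetic can be sketched in a few lines of Python. Every figure here is invented for illustration (the botnet size, per-device bandwidth, amplification factor, and pipe sizes), and the helper names are hypothetical; it only shows how quickly amplification moves the goalposts.

```python
def attack_gbps(bots, mbps_per_bot, amplification=1.0):
    """Rough attack volume in Gbit/s; every input is a made-up figure."""
    return bots * mbps_per_bot * amplification / 1000

def pipe_survives(pipe_gbps, bots, mbps_per_bot, amplification=1.0):
    """True if the pipe has more capacity than the attack can fill."""
    return pipe_gbps > attack_gbps(bots, mbps_per_bot, amplification)

# Hypothetical botnet: 10,000 devices each sending 5 Mbit/s.
direct = attack_gbps(10_000, 5)
# The same botnet bouncing requests off reflectors that reply with 20x the bytes.
amplified = attack_gbps(10_000, 5, 20)

print(f"direct: {direct:.0f} Gbit/s, amplified: {amplified:.0f} Gbit/s")
print("10 Gbit/s link survives direct attack:", pipe_survives(10, 10_000, 5))
print("100 Gbit/s link survives amplified attack:", pipe_survives(100, 10_000, 5, 20))
```

Under these assumptions a 100 Gbit/s link shrugs off the direct attack but not the amplified one, which is why so few operators can absorb this on their own.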

Like CDNs, the concern is that sysadmins and their managers may have become complacent in their use. It’s tempting to throw a DDoS protection provider in front of your server and call it a day, but you’ve arguably substituted one problem (attacks) with another (potential brittleness).

Where we go from here

That’s the open question!

I’d love to see a reversal of the consolidation we’ve seen, which will only happen if people appreciate that placing eggs in fewer baskets is a problem. I work at a small provider that’s trying to do this, with local “clouds” and kit in people’s own facilities rather than centralised. People like Jason Tubnor maintain and operate a fleet of bespoke servers without the need for any cloud services; check out his awesome blog and BSD talks. I know other people who have feet in both cloud and bare metal to spread risk while still having burstable capacity.

We need diversity again, but there’s big money behind convincing people that it’s undesirable, expensive, or impractical.

The legislative cat is out of the bag for DDoS attacks; cryptocurrency has made anonymous threats and payments feasible while they boil the planet for their shitcoins (pardon the French). The only way I can see us bringing them under control, at least with our current protocol suite, is doubling down on endpoint security. It was entirely preventable and completely ridiculous that so-called Internet of Things devices weren’t designed with security in mind, and we’re now paying the price.

It’s a controversial position, but I’m coming around to thinking ISPs should be more involved in scanning for vulnerable devices and notifying customers. Diffusion of responsibility is a real problem that will require the collective effort of everyone on the Internet.

The Internet and WWW have survived far longer than even their designers expected, but I think we’re at a junction now. I hope there are enough of us who still care about this stuff.

Author bio and support

Ruben Schade is a technical writer and infrastructure architect in Sydney, Australia who refers to himself in the third person. Hi!

The site is powered by Hugo, FreeBSD, and OpenZFS on OrionVM, everyone’s favourite bespoke cloud infrastructure provider.
