CrowdStrike Incident: Lessons on Digital Resilience

Written by Brian Fox | July 19, 2024

This morning's CrowdStrike incident, where a routine update caused a cascading failure across thousands of critical systems worldwide, is a stark reminder of the fragile interconnectedness of our digital world. While this incident was a misstep, not malice, it exposes the vulnerability of our essential services.

The widespread outages should serve as a wake-up call. If an error can cause this much chaos, imagine the havoc a deliberate attack could unleash. This scenario isn't just theoretical. The potential for bad actors to exploit these weaknesses is a chilling reality.

Simultaneous attacks on critical services, like 911 dispatch and hospital systems during a mass casualty event, could have catastrophic consequences. Time will tell what today's disruption has ultimately affected.

How Uncommon Is This Type of Incident?

This type of incident is, in itself, fairly common: it was caused by a botched update to CrowdStrike, a popular and well-regarded cybersecurity product. Botched updates happen from time to time. Who among us hasn't had a new bug appear after updating software overnight? What makes this incident unique is how far it spread and how dire the consequences have been.

The faulty update triggered a Blue Screen of Death (BSOD) loop on Windows computers, rendering them unusable. This was compounded by the automatic installation of the update on countless machines overnight. To make matters worse, once a computer is stuck in a crash loop like this, the only option is for technicians to manually apply the fix, computer by computer. This will take days in the best of cases, causing disruption far and wide.

Thousands of companies globally were crippled, and the recovery process will be painstakingly slow.

The Ripple Effect of a Single Update (or Single Anything)

This incident underscores the interconnectedness of modern businesses and their reliance on complex software ecosystems.

The software supply chain at a typical company can span well over 10,000 pieces of software and vendors. A single update, like a domino, can topple countless systems. With businesses relying on thousands of software components and vendors, the sheer volume necessitates automated updates. And, as we've seen, automation isn't foolproof.

While bad updates are inevitable, the consequences don't have to be. Organizations need to plan for these disruptions and prioritize business continuity. This incident should also prompt us to question the level of trust we place in our vendors and the robustness of our interconnected systems.

A Blueprint for Disaster

At Sonatype, we often see the potential for this type of issue, especially with the spread of malicious open source on the rise. If you think a botched update is bad, consider that this incident could become a blueprint for far more insidious software supply chain attacks. Imagine a scenario where malicious code is intentionally distributed under the guise of an update, wreaking havoc on a massive scale that couldn't be fixed with just an "update."

The stakes are too high to ignore. After we get everything sorted out, we must use this CrowdStrike incident as a catalyst to fortify our digital infrastructure against both errors and malicious intent. This should make us all stop and consider how interconnected our modern businesses are and what level of trust and reliability should be expected from our vendors.

We'll continue to keep this blog updated as we learn more about this incident and its ramifications. This event has not affected Sonatype's systems.
