Monday, November 25, 2024

Can Auto Updates for Critical Infrastructure Be Trusted?

COMMENTARY

In July, the industry witnessed one of the largest technology outages in recent history, with estimates of $5.4 billion in damages. When CrowdStrike distributed a Rapid Response Content Channel Update with an exception-handling logic flaw, it opened the door for constructive conversations about automatic updates — when to use them, when not to use them, whether they make us more or less secure. It’s time to reflect and ask: What is the cost of our relentless pursuit of innovation, software currency, and speed to market? How can we reprioritize to reestablish the balance in the C-I-A triad?

IT and security teams are under enormous pressure to stay ahead of threats. However, teams must not sacrifice the right checks and balances for speed. The CrowdStrike incident serves as a reminder to the industry that even the most secure and trusted systems can fail, and it’s time to revisit how teams test and deploy critical updates.  

The C-I-A Triad: Rebalancing Priorities

The C-I-A triad is a foundational pillar of cybersecurity, representing the Confidentiality (security), Integrity (accuracy), and Availability of technology platforms. For too long, the cybersecurity community, vendors and customers alike, has fixated on the C in this triad, even though the triad is supposed to represent the full scope of a cybersecurity program. With the main focus on privacy and data security, the industry overemphasized security and, in doing so, added speed to the equation. Teams are now responding faster and deploying updates more quickly to stay ahead of emerging threats and day-to-day attacks, but that’s leading to mistakes and improper testing.

Meanwhile, the I and A were relegated to secondary status — even outsourced to other technology teams. Integrity — the accuracy, completeness, and consistency of the ecosystem and underlying data — was compromised in the name of speed. Availability also suffered as the focus shifted to rapid recovery rather than ensuring uptime and reliability, all for the sake of rapid innovation and response to perceived threats.  

If the CrowdStrike event has taught us anything, it is that vendors and customers alike must recommit to rebalancing all three pillars of the C-I-A triad. In doing so, teams can build more resilient systems.

The Shift From Software to Critical Infrastructure

Leaders need to undertake three key shifts to restore the checks and balances inherent to the C-I-A triad.

1. Transparency: Vendors must be more transparent with their product updates and give customers more control over how updates are applied. Customers should be able to manually update, deploy updates in stages, and remain on a prior stable version as a matter of policy.  

In the case of the CrowdStrike event, a multistep update caused the outage. First, the team deployed a configuration file in February. Later, in July, it deployed a Rapid Response Content Update. As part of that update, a configuration content validator, using the prior configuration file, attempted to apply the update, but due to the “logic bug” in the exception-handling routines, the staggered update resulted in the infamous “blue screen of death” for many Windows servers and workstations. These channel updates are often a series of staged updates, all occurring at once. How many of CrowdStrike’s customers understood this nuance of the update strategy? It’s unclear, but they had limited control over the update and were unable to stage it so it could be certified and tested before it affected the entire enterprise.

2. Reevaluate vendor testing: Platforms such as CrowdStrike have transformed to become a core component of critical infrastructure. Security vendors frequently push automatic updates to improve security, but this can also mean speeding through the “trust but verify; walk before you run; test test test” cycles. While speed matters, this incident should force teams to take a closer look at how they deploy updates, ensure integrity and availability, and maintain business resiliency.  

IT and security teams must reevaluate overreliance on vendor testing and automatic updates. Even small teams can have the flexibility to choose when to update without incurring substantial overhead. The update is automatic, but the time and place to update can be chosen. Leaders should consider implementing staggered updates, using staging and testing environments to certify and assess the viability and stability of the update. Teams should weigh the value of updating immediately against the value of waiting long enough to confirm that the update won’t compromise integrity and availability.
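As a rough illustration of what a staggered update policy could look like, the sketch below gates each group of machines on the previous group having run the update cleanly for a minimum soak period. The ring names, soak times, and the eligible_rings helper are hypothetical and chosen only for illustration; they do not describe CrowdStrike's or any other vendor's actual update controls.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Ring:
    name: str
    soak: timedelta  # time the update must run cleanly here before promotion


# Hypothetical rings, ordered from lowest to highest blast radius.
RINGS = [
    Ring("staging-lab", timedelta(hours=4)),
    Ring("canary-hosts", timedelta(hours=24)),
    Ring("all-production", timedelta(0)),
]


def eligible_rings(deployed_at, healthy, now):
    """Return the names of rings allowed to receive the update at `now`.

    deployed_at: ring name -> datetime the update was installed there
    healthy:     ring name -> True if no failures were observed there
    """
    allowed = [RINGS[0].name]  # the first ring may always take the update
    for prev, ring in zip(RINGS, RINGS[1:]):
        started = deployed_at.get(prev.name)
        soaked = started is not None and now - started >= prev.soak
        if not (soaked and healthy.get(prev.name, False)):
            break  # stop promotion at the first ring that has not passed
        allowed.append(ring.name)
    return allowed
```

In this sketch, a failure or an incomplete soak period in the staging ring keeps the update out of every later ring, which is the kind of time-and-place control the paragraph above argues for.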

3. Improve testing environments: Companies must ensure that cybersecurity teams have adequate testing environments available for certifying and testing security updates and implementations. The same diligence given to IT and development teams must be applied to cybersecurity.  

Security is no longer software; it’s a foundational component of critical infrastructure. As seen with the CrowdStrike event, banks, transportation, manufacturing, and financial markets can all be devastated by a failure of the security ecosystem. As the industry continues to see convergence of solutions to a few vendors, it’s important to make these platforms more resilient.   

The true measure of our cybersecurity prowess lies in our capacity to endure. Teams should embrace those proven patterns of change management that have served us well in the past, but also evolve and expand in scope to accommodate new technology and new potential threats. Vendors must empower customers with greater control and flexibility in how and why they deploy our solutions and updates. Technology and security practitioners, in turn, must use this moment as a clarion call to rethink priorities and recommit to balancing and counterbalancing the security, integrity, and availability drivers that empower our security tools.   

This creates a durable security future, regains and rebuilds essential fiduciary trust, and ensures that teams can rise to every threat while never again falling into complacency, valuing speed and ease at the expense of everything else. 
