15 Years of Cloud Outages: A Stroll Through the InformationWeek Archives

 

The cloud is growing, but cloud outages are nothing new. And neither are we. InformationWeek was founded in 1985, and our online archives go back to 1998. Here are just a few lowlights from the cloud’s worst moments, dug up from our archives.

Apr. 17, 2007 / In Web 2.0 Keynote, Jeff Bezos Touts Amazon’s On-Demand Services (As a reminder, folks, here in 2022, AWS is now worth a trillion dollars.)

Aug. 12, 2008 / Sorry, the Internets are Broken Today

Oct. 17, 2008 / Google Gmail Outage Brings Out Cloud Computing Naysayers

June 11, 2010 / The Cloud’s Five Biggest Weaknesses [In 2022, a cloud SLA can accomplish basically nothing at all. As Richard Pallardy and Carrie Pallardy wrote this week, “Industry standard service level agreements are remarkably restrictive, with most companies assuming little if any liability.”]

April 21, 2011 / Amazon EC2 Outage Hobbles Websites: “The new architecture works great when only one disk or server fails, a predictable event when running tens of thousands of devices. But the solution itself doesn’t work if it thinks hundreds of servers or thousands of disks have failed all at once, taking valuable data with them. That’s an unanticipated event in cloud architecture because it isn’t supposed to happen. Nor did it happen last week. But the governing cloud software thought it had, and triggered a massive recovery effort. That effort in turn froze EBS and Relational Database Service in place. Server instances continued running in U.S. East-1, but they couldn’t access anything, more servers couldn’t be initiated and the cloud ceased functioning in one of its availability zones for all practical purposes for over 12 hours.”

Aug. 9, 2011 / Amazon Cloud Outage: What Can Be Learned?

July 2, 2012 / Amazon Outage Hits Netflix, Heroku, Pinterest, Instagram

July 26, 2012 / Google Talk, Twitter, Microsoft Outages: Bad Cloud Day

Oct. 23, 2012 / Amazon Outage: Multiple Zones a Smart Strategy. Okta’s director of technical operations told Babcock that they use all five zones to hedge against outages. “If there’s a sixth zone tomorrow, you can bet we’ll be in it within a few days.”

Jan. 4, 2013 / Amazon’s Dec. 24 Outage: A Closer Look

Nov. 15, 2013 / Microsoft Pins Azure Slowdown on Software Fault

May 23, 2014 / Rackspace Addresses Cloud Storage Outage

July 20, 2014 / Microsoft Explains Exchange Outage

Aug. 15, 2014 / Practice Fusion EHR Caught in Internet Brownout

Sept. 26, 2014 / Amazon Reboots Cloud Servers, Xen Bug Blamed

Dec. 22, 2014 / Microsoft Azure Outage Blamed on Bad Code

Jan. 28, 2015 / When Facebook’s Down, Thousands Slow Down

Aug. 20, 2015 / Google Loses Data: Who Says Lightning Never Strikes Twice?

Sept. 22, 2015 / Amazon Disruption Produces Cloud Outage Spiral

May 12, 2016 / Salesforce Outage: Can Customers Trust the Cloud?

March 7, 2017 / Is Amazon’s Growth Running a Little Out of Control? Writes Babcock: “Given the fact that the outage started with a data entry error, much reporting on the incident has described the event as explainable as a human error. The human error involved was so predictable and common that this is an inadequate description of what’s gone wrong. It took only a minor human error to trigger AWS’ operational systems to start working against themselves. It’s the runaway automated nature of the failure that’s unsettling. Automated systems operating in an inevitably self-defeating manner is the mark of an immature architecture.”

Fast Forward to Today

As Sal Salamone detailed neatly this week in his piece about lessons learned from recent major outages, Cloudflare, Fastly, Akamai, Facebook, AWS, Azure, Google, and IBM have all had similar calamities in 2021-22. Human errors, software bugs, power surges, and automated responses with unexpected consequences have all caused havoc.

What will we be writing 15 years from now about cloud outages?

Maybe more of the same. But you might not be able to read it if there’s lightning in Virginia. 

What to Read Next:

Lessons Learned from Recent Major Outages

Can You Recover Losses Sustained During an Outage?

Special Report: How Fragile is the Cloud, Really?

 
