{"id":1952,"date":"2024-07-23T21:30:38","date_gmt":"2024-07-23T20:30:38","guid":{"rendered":"https:\/\/www.milanoventures.com\/?p=1952"},"modified":"2024-07-23T21:30:43","modified_gmt":"2024-07-23T20:30:43","slug":"tech-outage-cloud-application-resilience","status":"publish","type":"post","link":"https:\/\/www.milanoventures.com\/tech-outage-cloud-application-resilience\/","title":{"rendered":"Global Tech Outage: Ensuring Cloud and Application Resilience"},"content":{"rendered":"
[et_pb_section fb_built=”1″ admin_label=”section” _builder_version=”4.16″ global_colors_info=”{}”][et_pb_row admin_label=”row” _builder_version=”4.16″ background_size=”initial” background_position=”top_left” background_repeat=”repeat” global_colors_info=”{}”][et_pb_column type=”4_4″ _builder_version=”4.16″ custom_padding=”|||” global_colors_info=”{}” custom_padding__hover=”|||”][et_pb_dmb_breadcrumbs _builder_version=”4.22.2″ _module_preset=”default” global_colors_info=”{}”][\/et_pb_dmb_breadcrumbs][et_pb_blurb title=”Global Tech Outages: The Need for Resilience for Internet Connectivity” alt=”WorldDirector globally distributed cloud and CDN” _builder_version=”4.27.0″ header_level=”h1″ background_size=”initial” background_position=”top_left” background_repeat=”repeat” max_width_tablet=”50px” hover_enabled=”0″ global_colors_info=”{}” sticky_enabled=”0″ admin_label=”Global Tech Outage: Ensuring Cloud and Application Resilience”]<\/p>\n
\nIn today’s hyper-connected world, reliable internet connectivity<\/strong> is the backbone of almost every sector. From businesses and healthcare to education and entertainment, uninterrupted internet service<\/strong> is crucial. However, recent global tech outages, such as the one that occurred in July 2024, have underscored the pressing need for enhanced resilience in our internet infrastructure.<\/p>\n<\/blockquote>\n
On July 19, 2024, a major outage was triggered by a faulty software update from CrowdStrike, affecting an estimated 8.5 million Windows devices worldwide by Microsoft’s count. The issue stemmed from a defect in a “content update” for CrowdStrike’s Falcon Sensor, which led to widespread Windows Blue Screen of Death (BSOD)<\/a> errors. The disruption hit airlines, banks, media outlets, and public services, grounding flights, interrupting financial transactions, and hindering news broadcasts.<\/p>\n
CrowdStrike CEO George Kurtz confirmed that the outage was due to a technical fault and not a cyberattack. A fix was quickly deployed, but the resolution required manual intervention on affected systems, adding to the complexity and duration of the disruption. This incident<\/a> underscores the critical importance of rigorous testing and validation procedures for software updates to prevent such large-scale impacts in the future.<\/p>\n
This article explains what resilience in internet connectivity<\/strong> means, how to build a robust and resilient connectivity framework, and how to recognize the symptoms of a failing internet infrastructure during outages.<\/p>\n
Why did the July 2024 CrowdStrike incident affect only Windows computers?<\/h2>\n
The CrowdStrike incident affected only Windows systems because the defective update was delivered only to Windows hosts; Mac and Linux hosts did not receive it. On Windows, the Falcon sensor runs a kernel-mode driver that loads early in the boot process, so the fault crashed the operating system itself and often did so again on every restart. The result was widespread Blue Screen of Death (BSOD) errors, the Windows-specific form of a kernel crash.<\/p>\n
In contrast, Linux systems offer several mechanisms that can make recovery from a boot-time kernel failure easier. First, most distributions keep multiple kernel versions installed, so if a newly updated kernel fails to boot, an earlier working version can be started instead. Second, Linux bootloaders like GRUB (GRand Unified Bootloader) let users select a different kernel or a recovery mode from the boot menu. Third, Linux kernel modules can be loaded and unloaded dynamically without a reboot, so a problematic module can often be disabled once the system is running. These mechanisms ease recovery but do not make Linux immune: a faulty module running in kernel mode can still cause a kernel panic.<\/p>\n
What Is a Global Tech Outage and What Is Its Impact?<\/h2>\n
Understanding Global Tech Outages<\/strong><\/h3>\n
A global tech outage<\/strong> refers to a widespread disruption in internet services, affecting multiple platforms and services simultaneously. These outages can have far-reaching effects, crippling business operations, halting financial transactions, disrupting communication channels, and posing risks to public safety. For instance, the July 2024 outage disrupted airlines, banks, media outlets, and public services at the same time, highlighting how heavily these sectors depend on a small number of shared platforms.<\/p>\n
Major Outages and Their Impacts<\/h2>\n
List of Significant Tech Outages:<\/strong><\/h3>\n
\n
- CrowdStrike Software Outage (July 2024):<\/strong>\n
\n
- The worldwide outage caused by a faulty CrowdStrike software update disrupted business and government operations across many sectors.<\/li>\n
- Source: The New York Times<\/a><\/li>\n<\/ul>\n<\/li>\n
- Libero Mail Service Outage (January 2023):<\/strong>\n
\n
- In January 2023, Libero Mail (Italiaonline), a major email service in Italy, suffered an extended outage that left millions of users without access to their email for several days.<\/li>\n
- Source: Euronews<\/a><\/li>\n<\/ul>\n<\/li>\n
- Akamai Outage (July 2021):<\/strong>\n
\n
- A widespread outage at Akamai Technologies disrupted several major websites and online services, including those of financial institutions and airlines.<\/li>\n
- Source: Reuters<\/a><\/li>\n<\/ul>\n<\/li>\n
- AWS Outage (December 2021):<\/strong>\n
\n
- Amazon Web Services (AWS) experienced a major outage in its US-EAST-1 (Northern Virginia) region, disrupting numerous businesses that rely on AWS for cloud services.<\/li>\n
- Source: CNBC<\/a><\/li>\n<\/ul>\n<\/li>\n
- Microsoft Azure Outage (October 2020):<\/strong>\n