Failure to follow procedures caused US-wide AT&T outage (2024)

An AT&T cellular outage lasting more than 12 hours that prevented subscribers from accessing services including 911 was caused by misconfigured hardware and a failure to follow standard procedures when deploying.

Or so says the US Federal Communications Commission (FCC) of the incident on February 22, which affected AT&T wireless customers across all American states, plus Puerto Rico and the US Virgin Islands. The outage cut off access to voice and 5G data services nationwide, as The Register reported at the time. It took AT&T at least 12 hours to fully restore service, during which more than 25,000 attempted 911 calls could not get through.

Based on its investigations, the FCC's Public Safety and Homeland Security Bureau today referred the matter to its Enforcement Bureau for potential violations of FCC rules, which may result in a fine for America's second largest wireless carrier.

In its published findings [PDF] on the disruption, the FCC says it was caused by an AT&T Mobility employee adding a "misconfigured network element" to the production network, intended to expand capacity, during a routine nighttime maintenance window.

The process did not follow AT&T's established install procedures, which require peer review, and this led to the misconfiguration not being detected before the network element was introduced into the infrastructure.

As a result, an automated response was triggered that shut down all connections to prevent traffic from the misconfigured device propagating further into the network. This shutdown isolated all voice and 5G data processing elements from the wireless towers and switching tech, according to the FCC report.

The outcome was that the AT&T Mobility network disconnected all devices from voice services and 5G data, starting at 2:45 AM Central Standard Time, just three minutes after the misconfigured network element was added, causing a nationwide outage of its wireless service.

This not only affected consumers, but also any devices of the First Responder Network Authority (FirstNet) – in other words, the emergency services – that were registered to AT&T Mobility's network.

"When you sign up for wireless service, you expect it will be available when you need it – especially for emergencies," FCC chairwoman Jessica Rosenworcel said in a statement.

She added that the agency is taking this failure seriously and is working to provide accountability for the lapse in service and prevent similar outages in the future.

To address the interruption, AT&T's Network Operations performed a rollback that removed the misconfigured network element and then began the process to restore the network to normal operations, the FCC report states.

However, while most of AT&T's mobile subscribers were reconnected by early morning, the FCC notes that traffic congestion from so many mobile device registrations prevented some from getting back on the network, although these issues were mostly resolved by midday. FirstNet devices and infrastructure were given priority over commercial and residential users such that FirstNet service was restored by 5:00 AM.

The FCC also notes that AT&T notified FirstNet users of the outage starting at 5:53 AM, which was more than three hours after the outage began and approximately 53 minutes after the FirstNet infrastructure had been restored.

It wasn't until 7:05 AM that the company issued a public statement about the dodgy service, followed by additional updates throughout the morning, and a statement released at 2:10 PM indicated that wireless service had finally been restored to all affected customers.
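The intervals quoted above can be checked with a few lines of arithmetic. This is a minimal sketch that assumes every clock time in the FCC timeline is in the same time zone as the 2:45 AM outage start; the variable and function names are illustrative only.

```python
from datetime import datetime

DAY = "2024-02-22"  # date of the outage, per the FCC report


def at(hhmm: str) -> datetime:
    """Build a timestamp on the outage day from a 24-hour HH:MM string."""
    return datetime.strptime(f"{DAY} {hhmm}", "%Y-%m-%d %H:%M")


# Assumption: all times share one time zone.
outage_start = at("02:45")
firstnet_restored = at("05:00")
firstnet_notified = at("05:53")
public_statement = at("07:05")


def minutes_between(earlier: datetime, later: datetime) -> int:
    """Whole minutes elapsed between two timestamps."""
    return int((later - earlier).total_seconds() // 60)


print(minutes_between(outage_start, firstnet_notified))      # 188, i.e. 3 h 8 min
print(minutes_between(firstnet_restored, firstnet_notified))  # 53
print(minutes_between(outage_start, public_statement))       # 260, i.e. 4 h 20 min
```
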

The report by the FCC Public Safety and Homeland Security Bureau concludes that the outage was the result of multiple factors, all attributable to AT&T Mobility.

As well as the configuration error, these include a lack of adherence to internal procedures, a lack of peer review, a failure to adequately test after installation, inadequate laboratory testing, insufficient safeguards and controls covering approval of changes affecting the core network, a lack of controls to mitigate the effects of the outage once it began, and a number of system issues that prolonged the outage once the configuration error had been remedied.

While the direct cause of the outage was the employee who misconfigured a single network element, adequate peer review should have prevented the change from being approved, the FCC says.

The agency adds that post-installation testing should have ensured that network changes were implemented properly, but "to the extent that testing was performed when the misconfigured network element was placed into the production network on February 22, 2024, they were inadequate and failed to identify the incorrect behavior of the network element."

The report states that AT&T Mobility either lacks sufficient oversight and controls to ensure these test processes are followed or, if such controls do exist, they are themselves insufficient.

A further criticism is that despite configuring its network to enter Protection Mode to prevent propagating errors to other parts of the network, AT&T failed to put in place adequate preparations for the congestion that would result as every device tried to re-register with the network upon restoration of service.
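To illustrate the congestion problem, here is a minimal sketch of one generic way to spread out a flood of simultaneous retries: exponential backoff with random jitter. It is not drawn from the FCC report and is not a description of how AT&T's network or any cellular standard handles re-registration; the function name and the `base_s` and `cap_s` parameters are hypothetical.

```python
import random


def reregistration_delay(attempt, base_s=1.0, cap_s=300.0, rng=None):
    """Return how many seconds a device waits before retry number `attempt`.

    Hypothetical helper: the retry window doubles with each failed attempt,
    is capped at `cap_s`, and the actual wait is picked uniformly at random
    inside the window so that devices do not all retry at the same instant.
    """
    if rng is None:
        rng = random
    window = min(cap_s, base_s * (2 ** attempt))
    return rng.uniform(0.0, window)


# Example: delays for the first five attempts of one simulated device.
device_rng = random.Random(42)  # seeded only to make the example repeatable
delays = [reregistration_delay(n, rng=device_rng) for n in range(5)]
```

The randomization is the point of the design: if every device waited exactly the same interval, the retries would arrive in synchronized waves rather than being spread across the window.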

In a statement, AT&T told us: "We have implemented changes to prevent what happened in February from occurring again. We fell short of the standards that we hold ourselves to, and we regret that we failed to meet the expectations of our customers and the public safety community."

The FCC report confirms that AT&T has taken steps to avoid a repeat of the calamity, including scanning the network for any network elements lacking the controls that would have prevented it. The company has also now adopted procedures to ensure that maintenance work cannot take place without confirmation that required peer reviews have been completed.
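A gate of that kind can be sketched in a few lines. This is an illustration under stated assumptions, not AT&T's actual tooling: the `ChangeRequest` record, the `authorize_maintenance` function, and the rule that an author cannot approve their own change are all hypothetical.

```python
from dataclasses import dataclass, field


class PeerReviewMissing(Exception):
    """Raised when a change lacks the required independent approvals."""


@dataclass
class ChangeRequest:
    # Hypothetical record of a planned network change.
    change_id: str
    author: str
    approvals: set = field(default_factory=set)


def independent_approvals(change: ChangeRequest) -> set:
    """Approvals from anyone other than the change's own author."""
    return change.approvals - {change.author}


def authorize_maintenance(change: ChangeRequest, required: int = 1) -> bool:
    """Allow work to proceed only if enough independent reviewers signed off."""
    if len(independent_approvals(change)) < required:
        raise PeerReviewMissing(
            f"{change.change_id}: needs {required} independent approval(s)"
        )
    return True


# Example: a self-approved change is blocked, a peer-approved one proceeds.
self_approved = ChangeRequest("CHG-0001", author="alice", approvals={"alice"})
peer_approved = ChangeRequest("CHG-0002", author="alice", approvals={"bob"})
```

Calling `authorize_maintenance(peer_approved)` returns `True`, while `authorize_maintenance(self_approved)` raises `PeerReviewMissing` before any work begins.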

In its recommendations, the FCC report says the blackout highlights the need for carriers to adhere to best practices, implement adequate controls in their networks to mitigate risks, and be capable of responding quickly to restore service when an outage occurs.

The agency says it plans to release a Public Notice, based on its analysis of this and other recent outages, reminding service providers of the importance of implementing relevant industry-accepted best practices, including those recommended by its Communications Security, Reliability, and Interoperability Council (CSRIC). ®
