Rumors were recently swirling around cellphone carrier T-Mobile. Fears ran rampant that a massive DDoS attack might be underway on the heels of a major outage.
The rumors spread like wildfire through the internet. At one point, even a US Senator got involved, retweeting a rumor that the attack was supposedly based out of China.
The company’s IT staff duly sprang into action, but as they investigated, it quickly became apparent that something else was going on. That ‘something else’ turned out to be a problem with one of the company’s leased fiber circuits. Apparently, the company was in the process of changing the way network traffic was routed. Things began to go badly, leading to a series of cascading failures that ultimately caused a widespread outage.
Neville Ray is T-Mobile’s President of Technology. Once the company had a firm handle on what was going on, he tweeted the following in a series of messages in an attempt to reassure concerned customers.
“Our engineers are working to resolve a voice and data issue that has been affecting customers around the country. We’re sorry for the inconvenience and hope to have this fixed shortly.”
A follow-up message read:
“Teams continue to work as quickly as possible to fix the voice and messaging problems some are seeing. Data services are now available and some calls are completing. Alternate services like WhatsApp, Signal, iMessage, Facetime etc. are available. Thanks for your patience.”
A few hours after that, Mr. Ray sounded the all clear, stating that the issue had been resolved and apologizing again for the inconvenience.
More complete information recently published on T-Mobile’s website reads, in part, as follows:
“This is something that happens on every mobile network, so we’ve worked with our vendors to build redundancy and resiliency to make sure that these types of circuit failures don’t affect customers…
This redundancy failed us and resulted in an overload situation that was then compounded by other factors. This overload resulted in an IP traffic storm that spread from the Southeast to create significant capacity issues across the IMS (IP multimedia Subsystem) core network that supports VoLTE calls.”
Despite the rapid response, many of the company’s customers took to Twitter to express their frustrations. That frustration is understandable, given how much more heavily so many people are leaning on technology during the pandemic. Sadly, we will almost certainly encounter more problems like this before things begin to return to something closer to normal.