Microsoft Outlook Suffers Major Outage, June 5, 2023

Written by ENow Software | Jun 5, 2023 10:00:00 PM

On Monday, June 5th, 2023, at approximately 10:50 am (ET), Microsoft tweeted from its @MSFT365Status account that it was investigating a service incident involving access issues for Outlook on the web.

For system admins and IT professionals with access to the Microsoft 365 Admin Center, the service incident number initially provided by Microsoft was EX571516.

We're investigating an issue with accessing Outlook on the web. Further details can be found under EX571516 in the admin center.
— Microsoft 365 Status (@MSFT365Status) June 5, 2023

There were hundreds of responses to @MSFT365Status from its Twitter community, which is usually a good indicator of how broad the impact of a service incident is. Many responses alleged that the issue was not confined to Outlook on the web and that several other Microsoft services were behaving irregularly. Several users confirmed seeing multiple Exchange Online Service Health notifications, and many asked whether Microsoft's services were being subjected to DDoS attacks.

In its second tweet, 30 minutes later, Microsoft had still not identified the root cause of the service incident.

We’re reviewing our networking systems and recent updates in an effort to identify the underlying root cause of the issue. Additional information can be found in the admin center under EX571516.
— Microsoft 365 Status (@MSFT365Status) June 5, 2023

By 12:00 pm (ET), Microsoft was confirming a much broader impact to its services, but it still disclosed no root cause information. It did provide a second service incident number, MO571683, for those with access to the Microsoft 365 Admin Center.

We’ve identified downstream impact for Microsoft Teams, SharePoint Online and OneDrive for Business. We’re providing full impact details and updates for those services via MO571683. Updates pertaining to Exchange and Outlook on the web will continue to be provided via EX571516.
— Microsoft 365 Status (@MSFT365Status) June 5, 2023

Microsoft then halted an ongoing deployment and reverted an update, steps that, according to Microsoft, improved service performance.

We’ve halted an ongoing deployment and are monitoring services to see if that provides relief to the environment. Further information can be found in the admin center under EX571516 and MO571683.
— Microsoft 365 Status (@MSFT365Status) June 5, 2023

We've reverted the update and telemetry shows service improvement. We're continuing to monitor the service to ensure recovery and performing actions to address any residual impact. Further information can be found under EX571516 and MO571683 in the admin center.
— Microsoft 365 Status (@MSFT365Status) June 5, 2023

Feedback from the Microsoft community during this time was painting a different picture. Most responses on social media were alleging impacts to Microsoft services much broader than Microsoft was communicating, and the claim that the Outlook outage was due to a series of DDoS attacks was now more than just rumor.

Nevertheless, by 1:26 pm (ET), Microsoft was announcing publicly that they "confirmed recovery" of all impacted Microsoft 365 services. Microsoft provided no information whatsoever as to what exactly was the root cause and they did not address the claims of a possible DDoS attack. For Microsoft, the issue was resolved at this time.

We’ve confirmed recovery for affected Microsoft 365 services and will continue to monitor the services to ensure performance stability. Further information can be found under EX571516 and MO571683 in the admin center.
— Microsoft 365 Status (@MSFT365Status) June 5, 2023

Or so it seemed...

Several hours after its last tweet, which had indicated recovery for all impacted services, Microsoft sent out a new public message at 4:15 pm (ET) announcing that the impact had recurred. Microsoft also provided a new service incident number, MO572252, under which anyone with access to the Microsoft 365 Admin Center could purportedly find more details.

We've determined that impact associated with MO571683 and EX571516 has reoccurred and are investigating the cause. We'll be providing updates related to this event under MO572252 in the admin center.
— Microsoft 365 Status (@MSFT365Status) June 5, 2023

By 7:00 pm (ET), Microsoft was still dealing with an unresolved issue. Its public communications at this point offered few details or specifics, mentioning only that it was actively investigating "the source of the issue" and "implementing changes" in an effort to restore services for everyone impacted.

We're seeing some service improvement, and we remain focused to investigate and address the source of the issue. Further details can be found under MO572252 in the admin center.
— Microsoft 365 Status (@MSFT365Status) June 5, 2023

At approximately 8:30 pm (ET), Microsoft tweeted from @MSFT365Status that services were returning to "healthy levels". It again disclosed nothing about the cause of the outage, saying only that it was still investigating.

We've confirmed that service availability has returned to healthy levels. We'll continue to monitor service health while we analyze system logs to determine the cause of the problem. More details under MO572252 in the admin center.
— Microsoft 365 Status (@MSFT365Status) June 6, 2023

The day-long disruption of Microsoft services was over... or so it seemed.

What began on June 5th carried over into June 6th as a significant service incident. Microsoft's next tweet reported a "recurrence of the issue" and a drop in service availability. Microsoft was still applying mitigations and, more importantly, still investigating the root cause.

We're seeing a recurrence of the issue and a drop in service availability, so we're applying mitigations to provide relief for the affected users, while we continue to investigate the root cause. We'll be providing updates related to this event under MO572252 in the admin center.
— Microsoft 365 Status (@MSFT365Status) June 6, 2023

And then, at approximately 7:00 am (ET), as it had done twice before in the previous 24 hours, Microsoft reported that service availability was returning to "healthy levels".

We're seeing availability return to healthy levels. We're monitoring the environment while we investigate the underlying cause, increase monitoring, and develop further fixes to provide relief. Updates related to this event can be found under MO572252 in the admin center.
— Microsoft 365 Status (@MSFT365Status) June 6, 2023

In Groundhog Day fashion, just two hours later Microsoft sent out its next tweet announcing that the issue had resurfaced. However, to the extent its disclosures were accurate, Microsoft stated that the impact this time was much smaller than the previous day's, thanks in part to the mitigations applied so far.

We’ve identified that the impact has started again, and we’re applying further mitigation. Telemetry indicates a reduction in impact relative to earlier iterations due to previously applied mitigations. Further details about the workstreams are in the admin center via MO572252.
— Microsoft 365 Status (@MSFT365Status) June 6, 2023

It would not be until approximately 3:00 pm (ET) on Thursday, June 8th, three days after the first reports of the issue surfaced, that Microsoft was able to send a final tweet about service incident MO572252.

We’ve completed an extended monitoring period without observing any further interruptions to our Microsoft 365 services related to this event. We’ll continue working to finalize all outstanding mitigation efforts. Further details can be found in the admin center under MO572252.
— Microsoft 365 Status (@MSFT365Status) June 8, 2023

As of the morning of June 9th, 2023, there had been no further tweets from @MSFT365Status about service incidents MO571683, EX571516, and MO572252.

However, less than 24 hours later, Microsoft announced that it was dealing with a significant new problem: an Azure Portal outage on June 9th.

The Importance of Microsoft 365 Monitoring

In a cloud world, outages are bound to happen. While Microsoft is responsible for restoring service during outages, IT needs to take ownership of its own environment and user experience. It is crucial to have visibility into the business impact of a service outage the moment it happens.

ENow’s Microsoft 365 Monitoring and Reporting solution enables IT Pros to pinpoint the exact services affected and the root cause of the issues an organization is experiencing during a service outage by providing:

The ability to monitor networks and entire environments in one place with ENow’s OneLook dashboard, which makes identifying a problem fast and easy without having to scramble through Twitter and the Service Health Dashboard looking for answers.
A full picture of all services and subsets of services affected during an outage with ENow’s remote probes, which cover several Microsoft 365 apps and other cloud-based collaboration services.
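
To make the idea of scoping an outage from probe data concrete, here is a minimal sketch in Python. It is not ENow's implementation or API; the `ProbeResult` shape, the `summarize_impact` helper, and the sample data are all hypothetical and only illustrate how independent checks from several locations could be rolled up into a per-service view of impact.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ProbeResult:
    """One synthetic check against a service from one location (hypothetical shape)."""
    service: str    # e.g. "Outlook on the web"
    location: str   # where the probe ran, e.g. "New York"
    ok: bool        # True if the check succeeded


def summarize_impact(results):
    """Return {service: share of failed probes} for services with at least one failure."""
    totals = defaultdict(int)
    failures = defaultdict(int)
    for r in results:
        totals[r.service] += 1
        if not r.ok:
            failures[r.service] += 1
    return {svc: failures[svc] / totals[svc] for svc in failures}


# Illustrative data only; not real measurements from the June 2023 incident.
sample = [
    ProbeResult("Outlook on the web", "New York", False),
    ProbeResult("Outlook on the web", "London", False),
    ProbeResult("Microsoft Teams", "New York", False),
    ProbeResult("Microsoft Teams", "London", True),
    ProbeResult("SharePoint Online", "New York", True),
]

impact = summarize_impact(sample)
for service, failed_share in sorted(impact.items()):
    print(f"{service}: {failed_share:.0%} of probes failing")
```

A roll-up like this separates a service that is failing everywhere from one that is failing only from some locations, which is the kind of scope question the status tweets above left unanswered for hours.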

Identify the scope of Microsoft 365 service outage impacts and restore workplace productivity with ENow’s Microsoft 365 Monitoring and Reporting solution. Access your free 14-day trial today!
