Office 365 Monitoring - July Outage 2019

Nathan O'Bryan MCSM

As IT Pros, a major part of our responsibility is to keep our organizations’ IT services up and running. Historically this was a pretty straightforward job. It’s never been an easy job, but running your software on your servers connected to your network keeps everything straightforward. Moving services to Office 365 makes things much more complicated. How do you manage an outage for a cloud service? Is there any point to monitoring a cloud service when you can’t do anything to fix an outage?

In this blog post I’m going to look at a recent Office 365 outage and talk about what we as IT Pros should be doing to ensure that we’re helping the organizations we work for get the most out of their Office 365 subscription.

Recent Office 365 outages

As I’m writing this on the morning of July 8, if your organization is set up using Exchange Hybrid with some mailboxes on-premises on Exchange before 2013 SP1, you may be experiencing a Free/Busy “outage”. On Friday 7/5, Microsoft switched the certificates for the Federation Gateway, leaving some organizations with older on-premises deployments of Exchange without working Free/Busy.

The solution for this outage is to simply run the following PowerShell cmdlet.

Get-FederationTrust | Set-FederationTrust -RefreshMetadata

Not a huge problem, but certainly something that could be avoided with some careful monitoring of your Office 365 deployment.

On July 2nd, just before 1 PM Pacific time, the Microsoft Twitter account @MSFT365Status tweeted about an outage caused by a network device within Microsoft’s infrastructure. Microsoft later provided more information about this outage under MO184196 at status.office.com.

On July 3rd, Office 365 had an outage in SharePoint Online. The incident for that outage was SP184328, and more information about that outage can also be found on status.office.com.

That’s three examples of problems within Office 365 that may have affected your users. If you’re in a situation where knowing about these outages before your end-users started calling your helpdesk would have helped, maybe it’s time to start looking into a third-party monitoring solution.

How do I know if “the cloud” is down?

The move from on-premises IT services into cloud services can be a tough transition for organizations and IT departments alike. This transition is the largest change in how IT works that I have seen in my nearly 30-year career. Even those of us who are accustomed to working in dynamic environments may find this transition overwhelming.

When your organization moves to Office 365 and other cloud services, it can become much more difficult to identify an outage. It’s highly unlikely that Office 365 will ever be completely down. Outages do happen, but they are generally limited to a part of the service.
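
To make the "part of the service" point concrete, here is a minimal Python sketch of how a monitoring script might summarize which workloads currently have open incidents. The incident IDs are the ones mentioned above. The record layout, workload names, and status values are my own assumptions for illustration, not the format of any Microsoft feed.

```python
from collections import defaultdict

# Hypothetical incident records. The IDs come from the outages described
# above; the "workload" and "status" values are illustrative assumptions.
INCIDENTS = [
    {"id": "MO184196", "workload": "Microsoft 365 suite", "status": "resolved"},
    {"id": "SP184328", "workload": "SharePoint Online", "status": "investigating"},
]


def affected_workloads(incidents):
    """Map each workload to the IDs of its incidents that are not resolved."""
    open_by_workload = defaultdict(list)
    for incident in incidents:
        if incident["status"] != "resolved":
            open_by_workload[incident["workload"]].append(incident["id"])
    return dict(open_by_workload)


if __name__ == "__main__":
    # Only workloads with an open incident appear in the summary.
    print(affected_workloads(INCIDENTS))
```

A summary like this tells you which part of the service is affected, which is usually the first question your users will ask.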

Microsoft does try to provide some information about outages to administrators via the Office 365 portal and the Office 365 Admin app. Both tools will give you some information about some Office 365 outages, but keep in mind that Microsoft isn’t going to go out of its way to point out flaws in its service. I’m not saying that Microsoft will “hide” information about outages, only that it is unlikely to highlight outages that it thinks would otherwise go unnoticed.

If there isn’t anything I can do about an outage, why do I need to know about it?

If there is an outage in Office 365, there typically isn’t much you can do to fix it. If your organization is small with a limited IT budget, maybe you don’t really need to be the first to know about Office 365 outages.

However, many larger organizations will have an internal customer base that is used to being notified of outages, instead of being the ones to report outages to the IT department. If your users need a higher quality of IT service that does not depend on them reporting problems to you, it may be appropriate for you to set up a third-party monitoring system that can notify you of Office 365 outages before your end-users report them to you.

Beyond just notifying your end-users of an outage, there might be something you can do to get your user community up and working again. Some organizations deploy third-party solutions for backing up Office 365 data, or even third-party solutions that fill in an “outage gap” to bring minimal features back to the users. Often these solutions will require some administrator intervention to activate their services. If that’s your situation, you need to know about outages before you can activate your contingency plans.
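
As a rough illustration of "knowing before you can act", the Python sketch below compares the incident IDs seen on two successive polls and flags whether any new incident touches a workload you have a contingency plan for. The function names, the polling idea, and the use of the ID prefix (such as SP for the SharePoint Online incident above) to identify a workload are assumptions for this example, not part of any specific monitoring product.

```python
def new_incidents(previous_ids, current_ids):
    """Return incident IDs present in the current poll but not the previous one."""
    return sorted(set(current_ids) - set(previous_ids))


def contingency_needed(new_ids, watched_prefixes):
    """True if any new incident ID starts with a prefix we have a plan for.

    Assumption: the leading letters of an incident ID identify the workload.
    """
    return any(incident_id.startswith(tuple(watched_prefixes)) for incident_id in new_ids)


if __name__ == "__main__":
    last_poll = ["MO184196"]              # IDs seen on the previous check
    this_poll = ["MO184196", "SP184328"]  # IDs seen on the current check

    fresh = new_incidents(last_poll, this_poll)
    print(fresh)
    print(contingency_needed(fresh, watched_prefixes=("SP",)))
```

In practice the two lists would come from whatever source you poll, and a positive result would page an administrator rather than print to the console.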

Monitoring Office 365 Outages

The savvy readers among you have probably put together that this blog post on ENow’s website aligns well with the service that ENow sells. Of course, the good folks at ENow would love for you to sign up for their Office 365 monitoring service, but I’m not going to turn this blog post into an ad.

The fact is some Office 365 customers could use an additional layer of monitoring to ensure that they are aware of Office 365 outages and able to respond appropriately before their end-users are affected. If that is the situation you find your organization in, I think it’s a good idea for you to talk to the people at ENow about their monitoring solution.

If you think your IT department can provide better service to your user community with the help of an improved Office 365 monitoring solution, check out the link below to find out more about ENow’s solution.


The Importance of Office 365 Monitoring

In a cloud world, outages are bound to happen. While Microsoft is responsible for restoring service during outages, IT needs to take ownership of its environment and user experience. It is crucial to have visibility into the business impact of a service outage the moment it happens.

ENow’s Office 365 Monitoring and Reporting solution enables IT Pros to pinpoint the exact services affected and the root cause of the issues an organization is experiencing during a service outage by providing:

  • The ability to monitor entire environments in one place with ENow’s OneLook dashboard, which makes identifying a problem fast and easy without having to scramble through Twitter and the Service Health Dashboard looking for answers.
  • A full picture of all services and subsets of services affected during an outage with ENow’s remote probes, which cover several Office 365 apps and other cloud-based collaboration services.

Identify the scope of Office 365 service outage impacts and restore workplace productivity with ENow’s Office 365 Monitoring and Reporting solution. Access your free 14-day trial today!

