Exchange Monitoring: Managed Availability Part 2
How to check, recover, and maintain your Exchange organization
Now that you’ve finished Part I and Part II of my three-part Managed Availability blog series, this part covers local .xml monitoring files and Managed Availability overrides.
Some HealthSets, such as the FEP HealthSet, are defined in local .xml files. FEP is the Forefront Endpoint Protection service, so if you do not run it, you may want to disable this HealthSet on your servers because it serves no purpose there.
Browse to %ExchangeInstallationPath%\Microsoft\Exchange\V15\Bin\Monitoring\Config, search for FEPActiveMonitoringContext.xml, and open the file with an editor such as Notepad. On line 12, replace Enabled = True with Enabled = False. Then restart the Microsoft Exchange Health Manager service on the server where you modified the .xml file.
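The restart step can also be done from an elevated PowerShell session instead of the Services console. This is a minimal sketch; it assumes the health service mentioned above is registered under the short service name MSExchangeHM, so check the name on your server before running it.

```powershell
# Assumption: MSExchangeHM is the short name of the Exchange health service on this server.
# Confirm the service exists and check its current status first.
Get-Service -Name MSExchangeHM

# Restart it so the modified .xml file is read again.
Restart-Service -Name MSExchangeHM

# Confirm that it is running again.
Get-Service -Name MSExchangeHM | Format-Table Name, Status -AutoSize
```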
With overrides, you can change the Managed Availability monitoring thresholds and define your own settings for when Managed Availability should take action in case of errors.
There are two kinds of overrides:
Local overrides: used to customize a component on a specific server, or components that aren’t globally available. For example, if you run multiple datacenters, you may want to change monitoring only for the servers at one location. Local overrides are managed with the *-ServerMonitoringOverride set of cmdlets. They are stored in the registry under HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\ExchangeServer\v15\ActiveMonitoring\Overrides\. The Microsoft Exchange Health Manager service reads this registry path and picks up changes automatically every 10 minutes.
Global overrides: used to customize a component for a whole Exchange organization. They are managed with the *-GlobalMonitoringOverride set of cmdlets. Global overrides are stored in Active Directory, in this example under: CN=Overrides,CN=Monitoring Settings,CN=FM,CN=Microsoft Exchange,CN=Services,CN=Configuration,DC=Xiopia,DC=local
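To see which overrides of either kind are currently in place, the corresponding Get- cmdlets can be run from the Exchange Management Shell. This is a short sketch; the server name EX01 is a hypothetical placeholder.

```powershell
# Hypothetical server name; replace with one of your Exchange servers.
$Server = 'EX01'

# Local overrides configured for a single server.
Get-ServerMonitoringOverride -Server $Server | Format-List

# Global overrides that apply to the whole organization.
Get-GlobalMonitoringOverride | Format-List
```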
You can bind an override to a specific Exchange version, such as CU6 with version “15.0.995.29”, using the ApplyVersion parameter. The override then stays in effect until the Exchange version changes.
The other method is to set an override for a specific timeframe using the Duration parameter. With Exchange 2013 CU6, you can set overrides for a maximum of 365 days.
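The two scoping methods might look like this for a local override. This is a sketch under assumptions: the server name EX01 is hypothetical, the responder identity is the one used later in this article, and the 30-day duration is only an illustrative value. Use one of the two options, not both, for the same override.

```powershell
# Hypothetical server name; the identity is the responder discussed later in this article.
$Server   = 'EX01'
$Identity = 'Exchange\ServiceHealthMSExchangeReplForceReboot'

# Option 1: version-bound override, effective until the Exchange version changes.
Add-ServerMonitoringOverride -Server $Server -Identity $Identity `
    -ItemType Responder -PropertyName Enabled -PropertyValue 0 `
    -ApplyVersion '15.0.995.29'

# Option 2 (use instead of option 1): time-bound override, here 30 days.
# Add-ServerMonitoringOverride -Server $Server -Identity $Identity `
#     -ItemType Responder -PropertyName Enabled -PropertyValue 0 `
#     -Duration '30.00:00:00'
```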
Responders execute only when a monitor is marked Unhealthy, and they try to recover the affected component. Managed Availability provides multi-stage recovery actions:
Restart the application pool
Restart the service
Restart the server
Take the server offline so that it no longer accepts traffic
There are several types of responders available: Restart Responder, Reset AppPool Responder, Failover Responder, Bugcheck Responder, Offline Responder, Escalate Responder, and Specialized Component Responders. In this article, I will primarily discuss Restart Responders.
Restart responders are subject to throttling policies. The responder definition contains a "ThrottlePolicyXml" section, which can be overridden if desired. As an example, we use the "StoreServiceKillServer" responder. To view its definition, run the following command in the Exchange Management Shell (EMS):
(Get-WinEvent -LogName Microsoft-Exchange-ActiveMonitoring/ResponderDefinition | % {[XML]$_.toXml()}).event.userData.eventXml | ?{$_.Name -like "StoreServiceKillServer"}
There are many parameters, such as ServiceName, CreatedTime, Enabled, MaxRetryAttempts, AlertMask, and so on. The following section from the restart responder definition is important:
ThrottlePolicyXml :
<ThrottleEntries>
<ForceReboot ResourceName="<SERVERNAME>">
<ThrottleConfig Enabled="True" LocalMinimumMinutesBetweenAttempts="720"
LocalMaximumAllowedAttemptsInOneHour="-1" LocalMaximumAllowedAttemptsInADay="1" GroupMinimumMinutesBetweenAttempts="600" GroupMaximumAllowedAttemptsInADay="4" />
</ForceReboot>
</ThrottleEntries>
The thresholds are self-explanatory. The only distinction is between "Local" and "Group": Local limits apply to a single Exchange server, while Group limits apply when there is more than one Exchange server in your organization. Check and configure these settings based on your needs.
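To compare the Local and Group limits side by side, the ThrottlePolicyXml value can be parsed with PowerShell's built-in XML support. The sketch below uses the sample from above, with the hypothetical name EX01 standing in for the server name placeholder; the variable names are illustrative only.

```powershell
# Sample ThrottlePolicyXml from the responder definition above.
# "EX01" is a hypothetical stand-in for the server name placeholder.
$throttleXml = @'
<ThrottleEntries>
  <ForceReboot ResourceName="EX01">
    <ThrottleConfig Enabled="True" LocalMinimumMinutesBetweenAttempts="720"
      LocalMaximumAllowedAttemptsInOneHour="-1" LocalMaximumAllowedAttemptsInADay="1"
      GroupMinimumMinutesBetweenAttempts="600" GroupMaximumAllowedAttemptsInADay="4" />
  </ForceReboot>
</ThrottleEntries>
'@

# Navigate to the ThrottleConfig element.
$config = ([xml]$throttleXml).ThrottleEntries.ForceReboot.ThrottleConfig

# Split each attribute name into its scope (Local or Group) and the limit it sets.
# The Enabled attribute has no scope prefix and is skipped.
$limits = foreach ($attr in $config.Attributes) {
    if ($attr.Name -match '^(Local|Group)(.+)$') {
        [pscustomobject]@{
            Scope = $Matches[1]
            Limit = $Matches[2]
            Value = [int]$attr.Value
        }
    }
}

$limits | Sort-Object Scope, Limit | Format-Table -AutoSize
```

On a live server, the same parsing could be applied to the ThrottlePolicyXml property returned by the Get-WinEvent query shown earlier.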
To prevent a reboot, create a local or global override for the responder that triggers it.
To identify that responder, I searched the Managed Availability logs for "*ForceReboot*" actions and found the following requester:
(Get-WinEvent -LogName Microsoft-Exchange-ManagedAvailability/* | % {[XML]$_.toXml()}).event.userData.eventXml| ?{$_.ActionID -like "*ForceReboot*"} | ft RequesterName
ServiceHealthMSExchangeReplForceReboot
The following global override disables that responder for 90 days:
Add-GlobalMonitoringOverride -Identity Exchange\ServiceHealthMSExchangeReplForceReboot -ItemType Responder -PropertyName Enabled -PropertyValue 0 -Duration 90:00:00:00
To check the configuration changes, use the following cmdlet:
(Get-WinEvent -LogName Microsoft-Exchange-ActiveMonitoring/ResponderDefinition | % {[XML]$_.toXml()}).event.userData.eventXml | ?{$_.Name -like "ServiceHealthMSExchangeReplForceReboot"} | ft name,enabled
This prevents a forced reboot of the server in case of errors with the “ServiceHealthMSExchangeRepl” health set. In the output, Enabled must be "0" (instead of 1).
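Once the override is no longer needed, it can be removed before its duration expires. This is a brief sketch using the same identity, item type, and property name as the override created above.

```powershell
# Remove the global override so the responder uses its default definition again.
Remove-GlobalMonitoringOverride -Identity 'Exchange\ServiceHealthMSExchangeReplForceReboot' `
    -ItemType Responder -PropertyName Enabled

# List the remaining global overrides to confirm the removal.
Get-GlobalMonitoringOverride | Format-List
```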
Inform Managed Availability about the repairing process
To inform Managed Availability (and your Exchange monitoring software) that a repair is in progress, use the following cmdlet and specify the appropriate monitor with the Name parameter:
Set-ServerMonitor -Server <SERVERNAME> -Name Maintenance -Repairing $true
After repairing:
Set-ServerMonitor -Server <SERVERNAME> -Name Maintenance -Repairing $false
To avoid automatic recovery actions, set the RecoveryActionsEnabled component to Inactive using Set-ServerComponentState:
Set-ServerComponentState -Component RecoveryActionsEnabled -Identity <SERVERNAME> -State Inactive -Requester Functional
After the repair is finished, re-enable the RecoveryActionsEnabled component with the following cmdlet:
Set-ServerComponentState -Component RecoveryActionsEnabled -Identity <SERVERNAME> -State Active -Requester Functional
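To confirm that the component state changed as intended, the state can be read back with Get-ServerComponentState. This is a minimal sketch; the server name EX01 is a hypothetical placeholder.

```powershell
# Hypothetical server name; replace with the server you are repairing.
$Server = 'EX01'

# Shows Inactive while recovery actions are suspended and Active after re-enabling.
Get-ServerComponentState -Identity $Server -Component RecoveryActionsEnabled |
    Format-Table Component, State -AutoSize
```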
Dominik is a Microsoft MVP primarily specializing in Microsoft Exchange, Exchange Online, and Office 365. Dominik currently works for a German consulting company, atwork, where he focuses on designing and building messaging infrastructures and cloud technologies. Dominik has worked in IT since 2004, primarily with Exchange Server, but also has experience with Windows Server, Active Directory, Azure, Office 365, Unified Messaging, and various third-party products.