Exchange 2010 –

Posted: November 27, 2016 at 9:50 am

In my experience, running Exchange 2010 in a volatile environment exposes you to cluster behavior that is unexpected, or to outright failure. If you're running in DAC mode, then to gain its benefits you have to manually fail and fix sites to prevent services from staying down or, worse, going split-brain.
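As a rough illustration of why a partitioned cluster either stays down or risks split-brain, here is a minimal Python sketch of majority-quorum voting. It is a simplified model, not Exchange or Failover Clustering code; the function names, site names, and vote counts are made up for the example.

```python
def has_quorum(reachable_votes, total_votes):
    """A partition keeps quorum only with a strict majority of all votes."""
    return reachable_votes > total_votes // 2


def partition_status(site_votes):
    """Map each isolated site to whether it keeps quorum on its own.

    site_votes: hypothetical dict of site name -> votes reachable inside
    that site once the link between sites is down.
    """
    total = sum(site_votes.values())
    return {site: has_quorum(votes, total) for site, votes in site_votes.items()}


# Assumed layout: two voters plus a witness in the primary site, two in the secondary.
print(partition_status({"primary": 3, "secondary": 2}))

# Assumed layout with no witness: an even split leaves neither side with a majority.
print(partition_status({"primary": 2, "secondary": 2}))
```

In this model at most one partition can ever hold a strict majority, so the side without it stays down until someone intervenes, which matches the manual work described above.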

Because Exchange 2010 relies heavily on Microsoft FCS (Failover Clustering Service) and AD (Active Directory), there are many scenarios where these distributed decision-making functions can fail. When all the servers in the primary data center fail, the second data center takes over as it should. When the primary data center comes back online, services do not automatically fail back; this is by design (per Microsoft). I have found that to fail services back, you must do two crucial things:

The sites seem to recover after a few minutes, but the changes are not immediately apparent, and the databases take a few minutes to re-mount. The reasons for these commands were not readily obvious to me, but I've concluded that the following conditions must be considered:

Also, the Microsoft documentation is decent (not great) on this, and is definitely worth reading:

A bit about this environment:

More here:

Exchange 2010 -