Channel: SQL Server High Availability and Disaster Recovery forum

How is SQL AlwaysOn AG supported in Azure without 3 fault domains?


For a Windows cluster to maintain quorum through the loss of one node, it needs 3 voting members. Hence, all tutorials guide you to create 3 VMs in the same availability set. (See image here.)

However, Azure only guarantees that the first two VMs in an availability set end up in different fault domains. Any additional VM might end up in the same fault domain as one of the first two. (See here, and here under Partitioning.)

How then can Microsoft support SQL AlwaysOn AG on Azure VMs without true quorum?
------------------------------
Here's what happened:

We host a very large multi-tenant application in Azure IaaS, backed by a SQL cluster configured, according to best practices, as 3 VMs in one availability set. Yesterday the passive and witness nodes crashed simultaneously, and the active node promptly shut down the cluster because the majority vote was lost. When the nodes were restarted (within minutes), they showed that they had crashed, which leaves me with only one assumption: they are on the same hardware, and we do indeed get only two fault domains.
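To make the failure mode concrete, here is a minimal Python sketch of the majority arithmetic involved. It assumes a plain node-majority rule with one vote per member and ignores features such as dynamic quorum; the placement maps and function names are hypothetical and only illustrate the two layouts discussed in this post, not how Azure actually assigns fault domains.

```python
from collections import Counter


def has_quorum(total_votes: int, surviving_votes: int) -> bool:
    """Node-majority rule: strictly more than half of all votes must survive."""
    return surviving_votes > total_votes // 2


def survives_any_single_fd_failure(placement: dict[str, int]) -> bool:
    """Return True if losing any one fault domain still leaves a majority.

    `placement` maps each voting member to a fault domain index (hypothetical).
    """
    total = len(placement)
    votes_per_fd = Counter(placement.values())
    return all(has_quorum(total, total - lost) for lost in votes_per_fd.values())


# Hypothetical layouts: voter name -> fault domain index.
three_domains = {"active": 0, "passive": 1, "witness": 2}
two_domains = {"active": 0, "passive": 1, "witness": 1}

print(survives_any_single_fd_failure(three_domains))  # True
print(survives_any_single_fd_failure(two_domains))    # False
```

With three fault domains, any single hardware failure removes one vote and two remain. With two fault domains, the domain holding two voters takes the majority with it, which matches the outage described above.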

This happened 3 times. In one day. With a few hours between each outage.

Support confirmed there was indeed an outage on the hardware hosting those VMs.

So, back to the question above: without the proper hardware configuration, how can this even be supported for production?

We invested months of research and testing on different platforms before we chose our current configuration on Azure. However, it's impossible for us to continue hosting a production system on Azure with such a bug in its current clustering/HA support.


Regards, Shloma Baum FieldOne Systems http://www.FieldOne.com

