Hello all,
I have a 3-node AlwaysOn cluster (Windows Server 2008 R2 SP1 + SQL Server 2012 RTM) with Node Majority quorum; each node has a quorum vote of 1.
Today the AlwaysOn AG suddenly went down because the cluster service on node 1 stopped and can't be started.
The errors in the event log are:
The cluster database could not be loaded. The file may be missing or corrupt. Automatic repair might be attempted.
The Cluster Service service terminated unexpectedly. It has done this 2 time(s). The following corrective action will be taken in 120000 milliseconds: Restart the service.
The failover cluster database could not be unloaded. If restarting the cluster service does not fix the problem, please restart the machine.
The Cluster Service service terminated with service-specific error The system cannot find the file specified..
The errors in the cluster log are:
0000156c.000008f8::2012/09/05-08:09:36.057 INFO [DM] Key \Registry\Machine\Cluster.restored does not appear to be loaded (status STATUS_OBJECT_NAME_NOT_FOUND(c0000034))
0000156c.000008f8::2012/09/05-08:09:36.057 WARN [DM] Node 1: Failed to unload restored hive from the registry with error STATUS_INVALID_PARAMETER(c000000d)
0000156c.000008f8::2012/09/05-08:09:36.057 INFO [DM] Node 1: loading local hive
0000156c.000008f8::2012/09/05-08:09:36.057 ERR [DM] Node 1: failed to unload cluster hive, error 2.
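In case it helps anyone reading along, here is a rough Python sketch for pulling only the WARN/ERR lines out of a cluster log. The line layout is an assumption inferred from the four lines above (two hex IDs, which I understand to be process and thread IDs, then a timestamp, a level, a component in brackets, and the message). The names parse_line and problems are just illustrative helpers, not part of any cluster tooling.

```python
import re

# Assumed layout, inferred from the excerpt above:
#   <hex id>.<hex id>::<yyyy/mm/dd-hh:mm:ss.mmm> <LEVEL> [<component>] <message>
# The two hex IDs are labelled pid/tid here as an assumption.
LINE_RE = re.compile(
    r"^(?P<pid>[0-9a-fA-F]+)\.(?P<tid>[0-9a-fA-F]+)::"
    r"(?P<ts>\d{4}/\d{2}/\d{2}-\d{2}:\d{2}:\d{2}\.\d{3})\s+"
    r"(?P<level>[A-Z]+)\s+\[(?P<component>[^\]]+)\]\s+(?P<msg>.*)$"
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    return match.groupdict() if match else None


def problems(lines, levels=("WARN", "ERR")):
    """Keep only the parsed records whose level is in `levels`."""
    records = []
    for line in lines:
        record = parse_line(line)
        if record is not None and record["level"] in levels:
            records.append(record)
    return records


# The four lines quoted in this post, copied verbatim.
SAMPLE = [
    r"0000156c.000008f8::2012/09/05-08:09:36.057 INFO [DM] Key \Registry\Machine\Cluster.restored does not appear to be loaded (status STATUS_OBJECT_NAME_NOT_FOUND(c0000034))",
    r"0000156c.000008f8::2012/09/05-08:09:36.057 WARN [DM] Node 1: Failed to unload restored hive from the registry with error STATUS_INVALID_PARAMETER(c000000d)",
    r"0000156c.000008f8::2012/09/05-08:09:36.057 INFO [DM] Node 1: loading local hive",
    r"0000156c.000008f8::2012/09/05-08:09:36.057 ERR [DM] Node 1: failed to unload cluster hive, error 2.",
]

if __name__ == "__main__":
    for record in problems(SAMPLE):
        print(record["ts"], record["level"], record["msg"])
```

To run it against a real log, read the file into a list of lines and pass that to problems() instead of SAMPLE.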
Now the cluster service can't be started on node 1 (error code 2). It looks like the clusdb in C:\windows\cluster is missing or corrupted.
Could you kindly let me know how to restore the clusdb file, and how to prevent this from happening again?
PS: All nodes are fully patched, and all AlwaysOn and cluster related hotfixes are installed. The thread at http://social.msdn.microsoft.com/Forums/en-US/sqldisasterrecovery/thread/127bd81c-65cd-4e15-b561-ec11fc4f6d11 doesn't help :(
Thanks in advance!
starlee