Channel: SQL Server High Availability and Disaster Recovery forum

Primary replica can't come online when a clustered instance and AlwaysOn are used together

Hi All,
We have run into a strange and serious problem with AlwaysOn on SQL Server 2014 (and SQL Server 2012). The details are:

For example, we have a three-node cluster, WinclsCluster:
Two nodes are the active node (name: node-active) and the passive node (name: node-passive) for a clustered SQL Server instance named ClsSQL, and the third node (name: node-three) has a default, non-clustered SQL Server instance installed. The OS is Windows Server 2008 R2 Enterprise SP1. The two SQL Server instances are configured in an AlwaysOn availability group: AGtest.

ClsSQL is currently on node-active. We manually fail over ClsSQL from node-active to node-passive, and this succeeds, so ClsSQL and AGtest are both on node-passive.

But if I fail ClsSQL back from node-passive to node-active, the problem appears: the ClsSQL failover succeeds, but AGtest remains on node-passive. Strangely, AGtest still shows as online, but the database on ClsSQL is in a recovery pending state and can't be accessed for reads or writes. In this situation, if the SQL Server version is 2014, the database on node-three can still be accessed read-only.

The ClsSQL error log contains this entry:
The state of the local availability replica in availability group 'AGtest' has changed from 'NOT_AVAILABLE' to 'RESOLVING_NORMAL'. The replica state changed because of a startup, a failover, a communication issue, or a cluster error. For more information, see the availability group dashboard, SQL Server error log, Windows Server Failover Cluster management console or Windows Server Failover Cluster log. 
But the SQL Server error log doesn't say why the availability group can't change from RESOLVING_NORMAL to PRIMARY_PENDING, and the cluster log doesn't contain anything meaningful either.
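
Since the error log is not helpful here, the replica state can also be inspected directly from the AlwaysOn DMVs. The query below is only a sketch of how one might check it, assuming the availability group is named AGtest as above; it is run on ClsSQL after the failback. A replica that has not finished resolving its role should show RESOLVING in role_desc. Note that an instance that is not the primary may return only the row for its own local replica.

```sql
-- Run on ClsSQL after failing back to node-active.
-- Shows the role and health of each replica as seen from this instance.
SELECT ag.name                 AS availability_group,
       ar.replica_server_name,
       ars.is_local,
       ars.role_desc,                    -- PRIMARY, SECONDARY or RESOLVING
       ars.operational_state_desc,
       ars.connected_state_desc,
       ars.recovery_health_desc,
       ars.synchronization_health_desc
FROM sys.dm_hadr_availability_replica_states AS ars
JOIN sys.availability_groups   AS ag ON ag.group_id   = ars.group_id
JOIN sys.availability_replicas AS ar ON ar.replica_id = ars.replica_id
WHERE ag.name = N'AGtest';               -- AG name taken from this post
```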

To work around this problem, you must either run the availability group failover T-SQL command on ClsSQL, or fail the AGtest cluster resource:
-- on ClsSQL
ALTER AVAILABILITY GROUP [AGtest] FAILOVER;

REM at a command prompt
cluster.exe res "AGtest" /fail

This problem can be reproduced easily, and it greatly reduces availability.

Many thanks, and please forgive my poor English.
