Hi,
We have one AlwaysOn availability group with only one replica (the primary) and one dummy database. We created a listener for it, and everything worked fine until yesterday, when we found an IP conflict: a virtual machine had been assigned the same IP address as the listener. Now the replica is in the Resolving state, and the database is in Recovery Pending (Not Synchronizing).
I removed the AlwaysOn availability group and tried to recreate it.
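For reference, here is a minimal sketch of how the replica and database states described above can be read from the instance. It assumes SQL Server 2012 or later, a login with VIEW SERVER STATE permission, and that the availability group still exists when the query runs:

```sql
-- Role and health of the local replica (for example RESOLVING).
SELECT ag.name AS ag_name,
       ars.role_desc,
       ars.operational_state_desc,
       ars.connected_state_desc,
       ars.synchronization_health_desc
FROM sys.availability_groups AS ag
JOIN sys.dm_hadr_availability_replica_states AS ars
    ON ars.group_id = ag.group_id
WHERE ars.is_local = 1;

-- State of the database itself (for example RECOVERY_PENDING).
SELECT name, state_desc
FROM sys.databases
WHERE name = N'AGDB1A2';
```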
1:
DROP AVAILABILITY GROUP [AGDB1A2];
GO
2:
USE [master];
GO
-- Bring the database back online after it was removed from the group
RESTORE DATABASE [AGDB1A2] WITH RECOVERY;
GO
CREATE AVAILABILITY GROUP [AGDB1A2]
WITH (AUTOMATED_BACKUP_PREFERENCE = SECONDARY)
FOR DATABASE [AGDB1A2]
REPLICA ON N'SYDCO-SSQL-1A\INSTANCE2' WITH (
    ENDPOINT_URL = N'TCP://test.alz.com:23',
    FAILOVER_MODE = MANUAL,
    AVAILABILITY_MODE = ASYNCHRONOUS_COMMIT,
    BACKUP_PRIORITY = 50,
    SECONDARY_ROLE(ALLOW_CONNECTIONS = NO)
);
GO
Everything works up to this step. However, the following T-SQL statement to create the listener fails:
USE [master]
GO
ALTER AVAILABILITY GROUP [AGDB1A2]
ADD LISTENER N'AGDB1A2' (
    WITH IP ((N'10.61.198.20', N'10.255.248.0')),
    PORT = 1433
);
GO
These are the logs, listed by event ID:
1044:
Encountered a failure when attempting to create a new NetBIOS interface while bringing resource 'AGDB1A2_10.61.198.20 online (error code '1450'). The maximum number of NetBIOS names may have been exceeded.
1069:
Cluster resource 'AGDB1A2_10.61.198.20 of type 'IP Address' in clustered role 'AGDB1A2' failed.
Based on the failure policies for the resource and role, the cluster service may try to bring the resource online on this node or move the group to another node of the cluster and then restart it. Check the resource and group state using Failover Cluster Manager or the Get-ClusterResource Windows PowerShell cmdlet.
1205:
The Cluster service failed to bring clustered role 'AGDB1A2' completely online or offline. One or more resources may be in a failed state. This may impact the availability of the clustered role.
1254:
Clustered role 'AGDB1A2' has exceeded its failover threshold. It has exhausted the configured number of failover attempts within the failover period of time allotted to it and will be left in a failed state. No additional attempts will be made to bring the role online or fail it over to another node in the cluster. Please check the events associated with the failure. After the issues causing the failure are resolved the role can be brought online manually or the cluster may attempt to bring it online again after the restart delay period.
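In case it helps, this is a rough sketch of how the listener's IP configuration can be checked from the SQL Server side, assuming the standard catalog views sys.availability_group_listeners and sys.availability_group_listener_ip_addresses. It may return no rows if the failed ADD LISTENER attempt left nothing registered in the instance:

```sql
-- Listener name, port, and the state of each IP address resource
-- as reported to SQL Server by the cluster.
SELECT agl.dns_name,
       agl.port,
       ip.ip_address,
       ip.ip_subnet_mask,
       ip.state_desc
FROM sys.availability_group_listeners AS agl
JOIN sys.availability_group_listener_ip_addresses AS ip
    ON ip.listener_id = agl.listener_id;
```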
Does anyone know how to resolve it?
Thanks,
Albert