Difference between High Availability data synchronization across local replicas and DAG

July 27, 2018, 4:59 am

I see a significant difference when synchronizing data between Primary - Secondary and between Primary HA and Secondary HA as part of Distributed Availability Groups.

The difference is in the amount of data sent across the wire. When making a large number of changes on the Primary, including changes to tables with change tracking enabled, we see a huge jump in the amount of data in the Redo Queue on all local replicas (synchronous with compression disabled, and asynchronous with compression enabled). Across the DAG, however, the redo queue is much smaller and is processed much faster, without any significant backlog.

I want to understand on a deeper technical level the difference between local synchronization and DAG.

The specific questions:
- What is the difference, in terms of the amount of data sent, between synchronizing from Primary to Secondary and synchronizing across the DAG?
- Can we make local synchronization work like the synchronization across the DAG?
- What factors affect the amount of data sent across the wire, besides the obvious ones (activity on the DB and HA compression enabled/disabled)?
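One way to put numbers on the backlog described above is to read the send and redo queues per replica from the availability group DMVs. This is only a minimal sketch for measuring the symptom, not an explanation of the difference. It assumes it is run on the primary replica and that the sizes and rates are reported in KB and KB/sec, as documented for `sys.dm_hadr_database_replica_states`.

```sql
-- Run on the primary replica, which reports a row for every replica.
SELECT
    ar.replica_server_name,
    DB_NAME(drs.database_id)       AS database_name,
    drs.synchronization_state_desc,
    drs.log_send_queue_size,       -- KB of log not yet sent to the secondary
    drs.log_send_rate,             -- KB/sec
    drs.redo_queue_size,           -- KB of log hardened but not yet redone
    drs.redo_rate                  -- KB/sec
FROM sys.dm_hadr_database_replica_states AS drs
JOIN sys.availability_replicas AS ar
    ON ar.replica_id = drs.replica_id
ORDER BY database_name, ar.replica_server_name;
```

Sampling this during the bulk change, once against the local availability group and once against the distributed one, would show whether the gap is in what gets sent or in how fast it is redone.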

↧

Always On or FCI

April 16, 2018, 1:26 pm

If I want to configure all the databases on a SQL Server instance for failover capability, should I just go for a SQL Server clustered instance (FCI) rather than setting up Always On? The additional investment I would have to make for FCI is using SAN disks instead of local disks. Am I correct in that assumption?

Second question: is the only advantage of using Always On over FCI the money saved on SAN storage? Are there any other considerations in going for the Always On feature?

↧

SQL setup in a failover cluster, virtual disk went offline after restart

July 27, 2018, 12:58 pm

Does anyone know how to bring a clustered virtual disk online? I can't do it from the GUI, and PowerShell hasn't worked so far.

FriendlyName HealthStatus OperationalStatus DetachedReason
------------ ------------ ----------------- --------------
VDisk02      Unknown      Detached          By Policy

↧

Failed to restore item XXXXXDB01 (Type: Database; Source: XXXXX-SQL1; Target: XXXXX-SQL1) Database restore failed: ExplorerManagementService: Failed to wait for OIB mounted. MountId: [ea3937d8-b768-4e7e-9395-155374f3252b], Ti

July 29, 2018, 5:16 am

Hello Team,

We are trying to restore a SQL database from a Veeam backup, but it fails with the error below.

7/29/2018 8:49:35 AM Error Failed to restore item XXXXXDB01 (Type: Database; Source: XXXXX-SQL1; Target: XXXXX-SQL1) Database restore failed: ExplorerManagementService: Failed to wait for OIB mounted. MountId: [ea3937d8-b768-4e7e-9395-155374f3252b], Timeout: [00:00:05] (sessionId = 'ebb1f666-0b1f-4eaa-a310-5a3732a5ad87')

Below are the system events we noticed:

Target failed to respond in time for a login request.

The initiator could not send an iSCSI PDU. Error status is given in the dump data.
"iSCSI discovery via SendTargets failed with error code 0xefff0012 to target portal *X.X.X.80 0003260 ROOT\ISCSIPRT\0000_0 ."

↧

Unable to manually fail over from primary to secondary replica

July 29, 2018, 10:09 am

When I try to manually fail over from primary to secondary, I get the error below. I ran Windows cluster validation and did not see any errors. Why am I unable to fail over? Please help with this.

Msg 41018, Level 16, State 0, Line 1

Failed to move a Windows Server Failover Clustering (WSFC) group to the local node (Error code 5016). The WSFC service may not be running or may not be accessible in its current state, or the specified cluster group or node handle is invalid. For information about this error code, see "System Error Codes" in the Windows Development documentation.

↧

error while creating AG | Failed to open the Windows Server Failover Clustering registry subkey 'HadrAgNameToIdMap' (Error code 2)

July 29, 2018, 5:16 pm

Please find below the error we get while creating an AG.

Are there any workarounds or ideas for fixing this error?

TITLE: Microsoft SQL Server Management Studio
------------------------------

Create failed for Availability Group 'xxxx'. (Microsoft.SqlServer.Management.HadrModel)

For help, click: http://go.microsoft.com/fwlink?ProdName=Microsoft+SQL+Server&ProdVer=14.0.17277.0+((SSMS_Rel_17_4).180625-0100)&EvtSrc=Microsoft.SqlServer.Management.Smo.ExceptionTemplates.FailedOperationExceptionText&EvtID=Create+AvailabilityGroup&LinkId=20476

------------------------------
ADDITIONAL INFORMATION:

An exception occurred while executing a Transact-SQL statement or batch. (Microsoft.SqlServer.ConnectionInfo)

------------------------------

Failed to open the Windows Server Failover Clustering registry subkey 'HadrAgNameToIdMap' (Error code 2). The parent key is the cluster root key. If this is a WSFC availability group, the WSFC service may not be running or may not be accessible in its current state, or the specified arguments are invalid. If the corresponding availability group has been dropped, this error is expected. Otherwise, contact your primary support provider. For information about this error code, see "System Error Codes" in the Windows Development documentation.
Failed to create availability group 'xxxxx'. The operation encountered SQL Server error 41030 and has been rolled back. Check the SQL Server error log for more details. When the cause of the error has been resolved, retry CREATE AVAILABILITY GROUP command. (Microsoft SQL Server, Error: 41030)

For help, click: http://go.microsoft.com/fwlink?ProdName=Microsoft%20SQL%20Server&ProdVer=14.00.1000&EvtSrc=MSSQLServer&EvtID=41030&LinkId=20476

------------------------------
BUTTONS:

OK
------------------------------

↧

Backup integrity checks failing for AlwaysON Availability Group database backups using maintenance plans on non-preferred replicas

July 29, 2018, 11:30 pm

We configure maintenance plans on all the replicas to take backups of the AlwaysON Availability Group databases. On the preferred replica the backups run, and on the rest of the replicas they don't. However, when we enable the backup integrity check in the maintenance plans, the job fails on the non-preferred replicas (saying that it is unable to find the backup file while verifying it). When we checked, it turned out to be an issue with how maintenance plans handle AlwaysON Availability Groups. The following is the code generated by SSMS when a maintenance plan is created for an AlwaysON Availability Group database (taking a backup with the backup integrity check enabled).

DECLARE @preferredReplica int

SET @preferredReplica = (SELECT [master].sys.fn_hadr_backup_is_preferred_replica('AG1'))

IF (@preferredReplica = 1)
BEGIN
BACKUP DATABASE [AG1] TO DISK = N'C:\Program Files\Microsoft SQL Server\MSSQL13.MSSQLSERVER\MSSQL\Backup\AG1_backup_2018_07_29_230055_7033159.bak' WITH NOFORMAT, NOINIT, NAME = N'AG1_backup_2018_07_29_230055_7033159', SKIP, REWIND, NOUNLOAD, STATS = 10
END
GO
declare @backupSetId as int
select @backupSetId = position from msdb..backupset where database_name=N'AG1' and backup_set_id=(select max(backup_set_id) from msdb..backupset where database_name=N'AG1' )
if @backupSetId is null begin raiserror(N'Verify failed. Backup information for database ''AG1'' not found.', 16, 1) end
RESTORE VERIFYONLY FROM DISK = N'C:\Program Files\Microsoft SQL Server\MSSQL13.MSSQLSERVER\MSSQL\Backup\AG1_backup_2018_07_29_230055_7033159.bak' WITH FILE = @backupSetId, NOUNLOAD, NOREWIND

On non-preferred replicas (i.e. where sys.fn_hadr_backup_is_preferred_replica returns 0), the backup doesn't happen, but the backup integrity check still runs, because the RESTORE VERIFYONLY command is outside the IF block that checks whether the replica is preferred. If RESTORE VERIFYONLY were also inside the IF block, it would be skipped as well.

Can someone let me know if there is a workaround for this issue within the maintenance plans? I know that we can avoid this issue if we configure our own jobs for backup and backup verification, but we use maintenance plans on all the servers in our environment.
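For comparison, the behaviour described above (verification skipped whenever the backup is skipped) can be sketched as a hand-written job step with both commands inside the same IF block. This is not what the maintenance plan generates and not a fix inside the maintenance plan itself. The database name 'AG1' is taken from the post, while the backup path is a hypothetical placeholder.

```sql
-- Hypothetical backup path; 'AG1' is the database name used in the post.
DECLARE @backupFile nvarchar(260) = N'C:\Backup\AG1_backup.bak';

IF sys.fn_hadr_backup_is_preferred_replica(N'AG1') = 1
BEGIN
    -- Note: if the preferred replica is a secondary, full backups there
    -- must be taken WITH COPY_ONLY.
    BACKUP DATABASE [AG1] TO DISK = @backupFile
        WITH INIT, NAME = N'AG1_backup', STATS = 10;

    -- Verify only when a backup was actually taken on this replica.
    RESTORE VERIFYONLY FROM DISK = @backupFile;
END
```

On a non-preferred replica the whole block is skipped, so the step succeeds without looking for a file that was never written.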

↧

Measuring Data latency on AAG

July 31, 2018, 4:33 am

I am trying to understand the data latency between the primary and secondary servers in an Always On Availability Group (AAG).

In the sys.dm_hadr_database_replica_states view, can I take a value in the secondary_lag_seconds column to mean that the secondary server is lagging? Or is there another way to find out the data latency between the primary and secondary servers?
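The two measurements mentioned here can be read side by side. The sketch below assumes SQL Server 2016 or later (where `secondary_lag_seconds` is documented) and that it is run on the primary replica. The `commit_gap_seconds` column is a name made up for this example: it is the difference between the last commit time on the primary and on each secondary, a common rough estimate rather than an official latency figure.

```sql
-- Run on the primary replica.
SELECT
    ar.replica_server_name,
    DB_NAME(s.database_id)   AS database_name,
    s.synchronization_state_desc,
    s.secondary_lag_seconds,                      -- lag as reported by SQL Server
    DATEDIFF(SECOND, s.last_commit_time,
                     p.last_commit_time) AS commit_gap_seconds  -- rough estimate
FROM sys.dm_hadr_database_replica_states AS s
JOIN sys.dm_hadr_database_replica_states AS p
    ON p.group_database_id = s.group_database_id
   AND p.is_primary_replica = 1
JOIN sys.availability_replicas AS ar
    ON ar.replica_id = s.replica_id
WHERE s.is_primary_replica = 0;
```

The commit-time gap only means something while the primary is committing transactions. On an idle database it simply reflects when the last commit happened.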

↧

Accidentally deleted listening group and now I can't recreate it

July 31, 2018, 2:25 pm

It seems I can't create it with the same IP? Is there a way around this?

Script I ran:

USE [master]
GO
ALTER AVAILABILITY GROUP [ABC]
ADD LISTENER N'ListenerNmae' (WITH IP ((N'xxx.xxx.xxx.xxx', N'255.255.252.0')), PORT = 1433);

Here's the error I'm getting

Msg 41066, Level 16, State 0, Line 3

Cannot bring the Windows Server Failover Clustering (WSFC) resource (ID 'ac46f9b1-41f1-4d40-b6fb-58f61b541fb9') online (Error code 5942). The WSFC service may not be running or may not be accessible in its current state, or the WSFC resource may not be in a state that could accept the request. For information about this error code, see "System Error Codes" in the Windows Development documentation.

Msg 19476, Level 16, State 4, Line 3

The attempt to create the network name and IP address for the listener failed. The WSFC service may not be running or may be inaccessible in its current state, or the values provided for the network name and IP address may be incorrect. Check the state of the WSFC cluster and validate the network name and IP address with the network administrator.

↧

What's the primary difference between Always On availability groups and Always On Basic availability groups

August 1, 2018, 8:00 am

We plan to use SQL Server 2016 Standard to configure Always On Basic availability groups, which this edition supports. However, we are not clear on the primary differences and limitations, in practice, of Always On Basic availability groups compared with the traditional (advanced) Always On availability groups in Enterprise Edition.

We intend to use Always On Basic availability groups to provide automatic SQL database failover that works with our Gentec Security center SQL Always On failover mode.

Thanks,

John

↧

Need advice to shutdown SQL Cluster Services

June 18, 2018, 9:08 pm

Hi all,

My company needs to relocate its office to another building and also move the datacenter to the new building.

We have a SQL cluster. If we need to shut down both servers, are there any steps we should follow? For example, should we stop the services first?

Has anyone here ever relocated a datacenter, especially one with a SQL Server cluster?

I am worried because this SQL Server hosts a business-critical application, so I need a plan to mitigate the impact.

Thanks in advance.

↧

HA Solution

August 2, 2018, 10:42 am

Hi,

I have a question regarding AlwaysOn solutions for SQL Server.

We have a SQL Server with 3 instances and about 10 databases on each instance (one application per instance and approx. 50-100GB per instance), and we would like to migrate that server to an HA solution with AlwaysOn Availability Groups. Would that be the most suitable solution for our scenario?

My next question is how best to install that solution.
I was thinking of 2 new SQL Servers: a primary server with 3 instances, each application dedicated to one of those instances and grouped into an AG, and an identical secondary server, synced from the primary, that is read-only. Would my idea be suitable for our scenario? How about the configuration of the listeners?

Any other ideas?

Thanks in advance!

↧

After using Disk Cleanup on a clustered SQL server I can no longer failover resources

August 2, 2018, 2:12 pm

I was running out of space on the system drive and had already deleted everything that I possibly could, so I used the Disk Cleanup utility. I selected the option to clean up system files, and after several hours I had cleared almost 20GB. I then tried failing over resources to this server, and that is when I determined that this Microsoft utility had damaged my SQL installation: I can no longer fail over to this server. Does anyone have any idea whether it is possible to fix this server, or am I out of luck? The server itself appears to be running just fine.

Steven Albrecht

University of Colorado Denver

↧

AG group nodes - Error: 17836, Severity: 20, State: 14. Length specified in network packet payload did not match number of bytes read... etc

August 3, 2018, 7:44 am

We have a pair of availability groups set up between two new servers, with each AG running as primary on the opposite node to the other.

SQL Server version is 12.0.5589.7

The AG groups are configured with a listener, along with read only routing for the secondaries.

On both servers, we are receiving repeated errors in the log every few seconds:

Error: 17836, Severity: 20, State: 14

Length specified in network packet payload did not match number of bytes read; the connection has been closed. Please contact the vendor of the client library. [CLIENT: 10.x.x.x]

Where the IP is the other node in the Availability Group.

Both AGs appear to be in a healthy state when viewing the dashboard, and all databases are synchronised with relatively low redo log queues. Both AGs are set to replicate in synchronous mode.

I'm wondering where I should be looking next to see where the issue might be coming from?

Cheers

Matthew

↧

One database, two locations. Mirror, or cluster, or....?

August 3, 2018, 10:38 am

I have a requirement that has two sites that are geographically separated. The sites are networked together using a high availability dual (redundant) link. Each site needs to have continuous access to a shared SQL Server database. I need to create a system that will recover if the inter-site link somehow fails - both sites should ideally be able to continue to work using a local copy of the database and then resync all changes once the inter-site link is restored.

How would this situation best be accommodated, please? (I have explored information about mirroring and clusters, and it seems neither quite satisfies my requirement, although it has to be said that I am a novice in this area.) Is there any merit in having 2 servers at each site (so 2 + 2) in order to provide further availability at each site?

Many thanks.

↧

DAG Between 2 clusters SQL 2012

August 6, 2018, 2:04 am

Hello all,

I have 2 sites with a SQL cluster on each of them.

I would like to know if I can use the DAG feature with SQL Server 2012 (I found that this option is available with SQL Server 2016, but I'm not sure about 2012).

I also need to understand how to configure it step by step.

Thanks

↧

SQL Design for DR Site with no AlwaysOn Availability Group

August 6, 2018, 5:48 am

Hello,

I am working on a DR configuration for SharePoint 2016 from a primary site to a secondary site where the link is 10Mbps (shared, not dedicated). The architecture in place includes 2 servers with all the roles installed on them (front end, SQL, search roles) and a 3rd server acting as a storage server with SQL Standard edition and a failover cluster in place.

Can you advise on the best-practice DR architecture for the database cluster across the 2 sites, and on how the data will be preserved in case of failure? Can you suggest what we should propose for SQL?

Thanks

↧

SQL Always-On multiple listeners issue

August 6, 2018, 7:30 am

Hello,

I have configured SQL Server Always On with three nodes on SQL Server 2016 Enterprise edition and a listener with 3 IPs. When testing the application using the listener name and port number, the application is not able to access the active primary replica. Also, I noticed that two of the listener's three IP addresses are in the offline state.

Do I need to configure additional settings for this?

Thanks

Mastanvali

mastanvali shaik

↧

Alwayson DBs restore to new prod from dev DBs server

August 7, 2018, 2:03 am

Hi, I am in the process of moving DBs to a new Always On cluster. I have already restored the DBs to new prod from old prod, and the Always On cluster is now working and in sync. Our architecture team now needs a fresh backup restored to new prod (Always On) from old prod again. Basically, these are the steps I need to follow:

First I need to stop DB sync and remove the DBs from sync, then restore the DBs to the new instance, restore the same DBs to the first replica node, and then start Always On again.

Q1. How do I safely remove DBs from the cluster?
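The sequence outlined above could look roughly like the following. It is a sketch under stated assumptions, not a tested runbook: `[MyAG]`, `[MyDb]` and the `\\share` paths are hypothetical placeholders, the database is assumed to use the full recovery model, and each numbered step runs on the server named in its comment rather than as one script. If file locations differ between old and new prod, the restores would also need `WITH MOVE` clauses.

```sql
-- 1. On the primary replica: remove the database from the availability group.
--    The primary copy stays online; the secondary copies go to RESTORING.
ALTER AVAILABILITY GROUP [MyAG] REMOVE DATABASE [MyDb];

-- 2. On the primary: overwrite the database with the fresh backup from old prod.
RESTORE DATABASE [MyDb] FROM DISK = N'\\share\oldprod_MyDb.bak'
    WITH REPLACE, RECOVERY;

-- 3. On the primary: take a full and a log backup to seed the secondaries.
BACKUP DATABASE [MyDb] TO DISK = N'\\share\MyDb_seed.bak' WITH INIT;
BACKUP LOG [MyDb] TO DISK = N'\\share\MyDb_seed.trn' WITH INIT;

-- 4. On each secondary: restore both backups without recovering the database.
RESTORE DATABASE [MyDb] FROM DISK = N'\\share\MyDb_seed.bak'
    WITH REPLACE, NORECOVERY;
RESTORE LOG [MyDb] FROM DISK = N'\\share\MyDb_seed.trn' WITH NORECOVERY;

-- 5. On the primary: add the database back to the availability group.
ALTER AVAILABILITY GROUP [MyAG] ADD DATABASE [MyDb];

-- 6. On each secondary: join the restored copy to the group.
ALTER DATABASE [MyDb] SET HADR AVAILABILITY GROUP = [MyAG];
```

Removing the database in step 1 does not drop it anywhere; it only ends synchronization, which is why the secondary copies have to be restored over before they can rejoin.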

Many thanks for update

↧

Availability Group failover issues

August 7, 2018, 4:51 am

Hi all,

I'm wondering if anyone would be kind enough to help me resolve a current issue I'm experiencing.

The issue exists when trying to failover to a secondary replica within a SQL server availability group.

When failing over using the SQL Server Management Studio failover wizard, it's successful.
However when I check the Windows Failover Cluster Manager Events for the Nodes, the following error messages are present for each node:
- Cluster resource 'xxx-xxx-xxx-xxx' of type 'IP Address' in clustered role 'xxx-xxx' failed.
Based on the failure policies for the resource and role, the cluster service may try to bring the resource online on this node or move the group to another node of the cluster and then restart it. Check the resource and group state using Failover Cluster Manager or the Get-ClusterResource Windows PowerShell cmdlet.
- Cluster resource 'xxx-xxx' of type 'SQL Server Availability Group' in clustered role 'xxx-xxx' failed.
Based on the failure policies for the resource and role, the cluster service may try to bring the resource online on this node or move the group to another node of the cluster and then restart it. Check the resource and group state using Failover Cluster Manager or the Get-ClusterResource Windows PowerShell cmdlet.

- The Cluster service failed to bring clustered role 'xxx-xxx' completely online or offline. One or more resources may be in a failed state. This may impact the availability of the clustered role.
Any thoughts/hints would be much appreciated.
Thanks.

↧