Channel: SQL Server High Availability and Disaster Recovery forum
Viewing all 4532 articles

SQL Always On - cluster issue

Hi,

I have a SQL Always On replication setup between an on-prem DB and an Azure SQL DB (IaaS). This morning my primary DB was not accessible because of a cluster issue. The Cluster service on one of the nodes had stopped because of a network connectivity problem with the file share witness (quorum).

My question is: if the cluster has a problem, due to the network or anything else, is that going to affect my primary database?

How can this be minimized or eliminated? Are there any config changes I need to make?

Please help.


Not able to connect to my secondary replica to test read-only

Hello everyone,

I have set up a two-node availability group running on WSFC with a file share quorum.

When I try to test using sqlcmd with the listener and -K ReadOnly, I get the error below:

C:\Windows\System32>sqlcmd -S APPLSNR -d Choice -K ReadOnly
Sqlcmd: Error: Microsoft ODBC Driver 13 for SQL Server : TCP Provider: No connection could be made because the target machine actively refused it.
.
Sqlcmd: Error: Microsoft ODBC Driver 13 for SQL Server : Login timeout expired.
Sqlcmd: Error: Microsoft ODBC Driver 13 for SQL Server : A network-related or instance-specific error has occurred while establishing a connection to SQL Server. Server is not found or not accessible. Check if instance name is correct and if SQL Server is configured to allow remote connections. For more information see SQL Server Books Online..

When I connect without -K ReadOnly, I am able to connect to the primary node:

C:\Windows\System32>sqlcmd -S APPLSNR -d Choice
1> select @@servername
2> go


--------------------------------------------------
------------------------------------------------
AEXVMSQLDB02\AEXSQLDB02


(1 rows affected)

I can ping the listener name (APPLSNR).

Please help!

Availability Replicas:

Server Instance   Role        Availability Mode   Failover Mode   Connections in Primary Role   Readable Secondary
Node 1            Secondary   sync                Manual          Allow all connections         Read-intent only
Node 2            Secondary   sync                Manual          Allow all connections         Read-intent only

SQL server 2016 Replication issue - The process could not execute 'sp_replcmds' - MSSQL_REPL20011 and MSSQL_REPL22037

Hi All,

I am facing an issue with transactional replication. There are no errors from Distributor to Subscriber; the issue seems to be between Publisher and Distributor (not sure). I can see the errors below in Replication Monitor:

The process could not execute 'sp_replcmds' on (Source: MSSQL_REPL, Error number: MSSQL_REPL20011)
Get help: http://help/MSSQL_REPL20011

The statement has been terminated. (Source: MSSQLServer, Error number: 3621)
Get help: http://help/3621

The process could not execute 'sp_replcmds' on . (Source: MSSQL_REPL, Error number: MSSQL_REPL22037)
Get help: http://help/MSSQL_REPL22037

The database owner is set to SA on both primary and secondary.

In addition, below is the error in the MSRepl_Errors table:

The process could not execute 'sp_replcmds' on 'PUBLISHER'.
Query timeout expired

Please assist with this issue.

Regards,

SJB,


Recover data from ldf file

How can I retrieve data from an ldf file? I have a full backup that is a day old. I could not attach the database because the mdf file is not working. Now I need to use the full backup and the ldf file to retrieve the data.

AlwaysOn

Hi,

Can anyone assist with this query?

If the Always On AG is in the Resolving state and, at the cluster level, the role is in error or missing, does this need to be solved at the SQL level or at the Windows cluster level first?

My understanding is that the cluster needs to be fixed first, as the AG sits on top of the cluster.

Thanks

Aslam

Maximum Allowed Network Link Latency between two data-centers for SQL Always-On

Hi, 

We are implementing SQL Always On between two data centers and are looking for the network latency requirements. What is the maximum allowed network latency for keeping an Always On availability group database in sync, with synchronous commit and also with asynchronous commit?

   

Always-on : recovery_lsn vs commit_lsn

recovery_lsn: I found that it has a different sequence (or format) from the other LSN columns:
recovery_lsn: 4294967295429496729500001   last_hardened_lsn: 38000001549400001

And what is the actual meaning of last_commit_time?

For sync mode: is the commit time always the same as on the primary replica?

For async mode: may there be a difference?
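
One observation on the two values above: both seem to fit the same decimal LSN layout, where the last 5 digits are the slot number, the 10 digits before that are the log block offset, and the leading digits are the VLF sequence number. The following is a minimal Python sketch under that assumption (the helper names `split_decimal_lsn` and `to_hex` are hypothetical, not part of any SQL Server API). Under that reading, the recovery_lsn value splits into two fields of 4294967295 (0xFFFFFFFF), which looks more like a maximum/placeholder value than a different format.

```python
from typing import NamedTuple


class Lsn(NamedTuple):
    vlf_sequence: int
    block_offset: int
    slot: int


def split_decimal_lsn(value: int) -> Lsn:
    """Split a decimal LSN, assuming the layout <vlf><10-digit block><5-digit slot>."""
    rest, slot = divmod(value, 10**5)
    vlf_sequence, block_offset = divmod(rest, 10**10)
    return Lsn(vlf_sequence, block_offset, slot)


def to_hex(lsn: Lsn) -> str:
    """Render the three parts in colon-separated hex."""
    return f"{lsn.vlf_sequence:08x}:{lsn.block_offset:08x}:{lsn.slot:04x}"


# The two values quoted in the post above.
recovery = split_decimal_lsn(4294967295429496729500001)
hardened = split_decimal_lsn(38000001549400001)
print(recovery, to_hex(recovery))
print(hardened, to_hex(hardened))
```

Splitting with `divmod` keeps everything in integer arithmetic, so the 25-digit value is handled exactly.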

How to monitor how much data lag there is between primary and secondary replica

What's the best way to estimate the data difference between the primary and a secondary replica (async)?
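
One possible starting point is the log_send_queue_size, log_send_rate, redo_queue_size and redo_rate columns of sys.dm_hadr_database_replica_states (queue sizes in KB, rates in KB/second). Below is a rough Python sketch of the arithmetic only, assuming those four values have already been read for one secondary database. The sample numbers and the helper name `seconds_to_drain` are hypothetical, and the results are estimates, not guarantees.

```python
from typing import Optional


def seconds_to_drain(queue_kb: int, rate_kb_per_sec: int) -> Optional[float]:
    """Estimate the time to drain a queue; None when no usable rate is available."""
    if rate_kb_per_sec <= 0:
        return None
    return queue_kb / rate_kb_per_sec


# Hypothetical sample values for one asynchronous secondary database.
log_send_queue_kb = 20480   # log generated on the primary but not yet sent
log_send_rate = 4096        # KB/second
redo_queue_kb = 51200       # log received by the secondary but not yet redone
redo_rate = 10240           # KB/second

send = seconds_to_drain(log_send_queue_kb, log_send_rate)
redo = seconds_to_drain(redo_queue_kb, redo_rate)
print(f"unsent log: {log_send_queue_kb} KB, about {send:.1f} s to send")
print(f"redo backlog: {redo_queue_kb} KB, about {redo:.1f} s to redo")
```

The unsent log size is the part that relates to potential data loss on an async replica, while the redo backlog relates to how long the secondary would need to catch up.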


secondary_lag_seconds

For the column secondary_lag_seconds of sys.dm_hadr_database_replica_states:

Should it be equal to the difference in last_commit_time between the primary and secondary replicas?

Ghost Cleanup : Low_water_mark

Ghost cleanup seems easy to understand, but low_water_mark_for_ghosts in sys.dm_hadr_database_replica_states is hard to understand.

What is the low water mark?

Connecting to DB through always-on listener IP is so slow

We have deployed SQL Always On consisting of two MS-SQL 2017 servers. When we connect through the Always On listener IP, the connection takes a long time and is pretty slow.

However, it connects fine using the listener DNS name.

Why is the connection slow when using the listener IP? Moreover, the connection sometimes times out when connecting through the listener IP.



Readonly routing to secondary replica

Hi,

I have a requirement that users should only connect to the secondary replica for read connections through the listener. I tried various options, such as disabling the login on the primary replica and denying connect on the primary replica, and they didn't work.

Does anyone have any suggestions?

Regards,


-kccrga


Alwayson Setup (Synchronous Commit) Between Production and DR Site

Hi,

The current setup between the production and DR sites is asynchronous commit. The two sites are more than 350 kilometers apart, but with no network latency and high bandwidth.

Can we make the third node at the DR site synchronous commit? What is the impact of making this change?

How do we measure whether it is working fine?


-kccrga




Replication in sql server

Dear all,

Kindly be informed that I want to achieve the following scenario:

I have two nodes (1 and 2) in Always On and another node outside Always On.

I want to set up replication between the primary database (in Always On) and the third machine (outside Always On), and vice versa.

In case of a failover, replication should occur between the secondary database (the new primary) and the third machine, and vice versa.

Many thanks.

Failsafe to avoid logins to connect to Primary in Read_Only Routing

Hi,

I don't have the servers to test this currently, so posting here. 

The future config plan is to have read-only routing to readable secondaries, with connection strings that have application intent set to read_only.

The above should normally suffice, but I am looking into a failsafe so that the app can't log in to the primary for reads, even by mistake.

I was under the impression that disabling those logins on the primary would be a simple workaround? Please let me know, thanks.


D


Readable Secondary yes option not available

In the HA group, the "Yes" option for Readable Secondary is not available.

Error While creating Availability Group (Error 19435, 41044)

Dear all,

I have a big issue with a new availability group installation/configuration. It throws an error and does not create the group...

It seems that the group goes online and is then killed by the failover cluster, but I don't see why. I have searched the web about my issue and have tried everything proposed:

1. Granted privileges to NT AUTHORITY\SYSTEM (Connect SQL, View server state, Alter any availability group)

2. Made the agent/engine service account a local admin on Windows and on the SQL database

3. Deleted my cluster and recreated it

4. Tried creating the group without the listener

5. Have exactly the same hardware configuration (HDD / RAM / CPU)

Here is the log from SQL Server (from SSMS):

08/13/2019 09:07:01,spid55,Unknown,Always On: WSFC AG integrity check failed for AG 'AG-SQLIPSN-DEV' with error 41044, severity 16, state 1.
08/13/2019 09:07:01,spid55,Unknown,Error: 19435, Severity: 16, State: 1.
08/13/2019 09:07:01,spid55,Unknown,The state of the local availability replica in availability group 'AG-SQLIPSN-DEV' has changed from 'RESOLVING_NORMAL' to 'NOT_AVAILABLE'.  The state changed because either the associated availability group has been deleted, or the local availability replica has been removed from another SQL Server instance.  For more information, see the SQL Server error log, Windows Server Failover Clustering (WSFC) management console, or WSFC log.
08/13/2019 09:06:01,spid55,Unknown,The state of the local availability replica in availability group 'AG-SQLIPSN-DEV' has changed from 'NOT_AVAILABLE' to 'RESOLVING_NORMAL'.  The state changed because the local availability replica is joining the availability group.  For more information, see the SQL Server error log, Windows Server Failover Clustering (WSFC) management console, or WSFC log.
08/13/2019 09:04:39,spid15s,Unknown,Always On: The availability replica manager is waiting for the instance of SQL Server to allow client connections. This is an informational message only. No user action is required.
08/13/2019 09:04:39,spid15s,Unknown,Always On Availability Groups: Local Windows Server Failover Clustering node is online. This is an informational message only. No user action is required.

Here are the logs from the cluster:

EVENT ID : 1254 Error - Clustered role 'AG-SQLIPSN-DEV' has exceeded its failover threshold. It has exhausted the configured number of failover attempts within the failover period of time allotted to it and will be left in a failed state. No additional attempts will be made to bring the role online or fail it over to another node in the cluster. Please check the events associated with the failure. After the issues causing the failure are resolved the role can be brought online manually or the cluster may attempt to bring it online again after the restart delay period.

EVENT ID : 1205 Error - The Cluster service failed to bring clustered role 'AG-SQLIPSN-DEV' completely online or offline. One or more resources may be in a failed state. This may impact the availability of the clustered role.

EVENT ID : 1069 Error - Cluster resource 'AG-SQLIPSN-DEV' of type 'SQL Server Availability Group' in clustered role 'AG-SQLIPSN-DEV' failed. Based on the failure policies for the resource and role, the cluster service may try to bring the resource online on this node or move the group to another node of the cluster and then restart it. Check the resource and group state using Failover Cluster Manager or the Get-ClusterResource Windows PowerShell cmdlet.

I am totally lost as to why it doesn't work. My previous AlwaysOn configuration went fine without any issue, and we did the same thing for this one...

The only thing I can think of is to begin the whole process again (deleting everything: DNS records, AD records, quorum share, cluster) and start over, but I am not sure it would work...

Hope someone can help,

Best Regards,

Jon

Types (CPU/RAM) of IaaS required for replicas and primary node in a SQL AG?

What types (CPU/RAM) of IaaS are required for the replicas and the primary node in a SQL AG?

For a Basic SQL AG on SQL 2016 Standard version, can the replica node be a less powerful IaaS (less CPU/RAM)?

For a SQL AG on SQL 2016 Data Center version, can the replica nodes be less powerful than the primary node within the same data center (zone)?

If we use a distributed AG with SQL 2016 Data Center version, can the replicas be less powerful than the forwarder node at the DR site? Can the other replicas, including those in the primary data center, be less powerful (less RAM/CPU) than the primary node?


dsk

SQL Server Upgrade from SQL Server 2012 to SQL 2017 Failed

Hello, 

I am doing a SQL Server upgrade from SQL 2012 to SQL 2017. It worked for the two passive nodes (3 and 4). Interestingly, the instance where I had set up "prevent failback" moved to the upgraded node, and when I tried to fail over, it would not fail over to node 1. Then I attempted the installation on node 1, but the installer itself would not initialize; it closed unexpectedly, saying SQL Server setup failed due to a Windows Update failure. In Summary.txt, I found "Error Microsoft.sqlserver.configuration.setupextension.rundiscoveryaction failed". I am half done now and need to see what actions are possible.

I am looking for a possible solution that does not involve the uninstall-and-add-node approach for the rest of the nodes. Kindly advise if you have come across this situation and how you fixed it.

Thanks 



Thank you... MOMEN

Question related to Alwayson Availability groups

Hi Experts,

I have a question about Always On availability groups. While I was trying to add (i.e. JOIN only) a database to an AG, it threw the error below.



I checked the backup history of that database. There were 10 log backups taken to a network share since the last FULL backup. The full backup was already restored with no recovery on the secondary.

Now, instead of copying those 10 log backups, I took a fresh local differential backup and restored it with the no recovery option. The restore was successful. However, when I tried to add the database to the AG from the primary replica, it threw the same error message again. I took another differential backup, restored it with no recovery, and tried adding it back to the AG. Still the same error message. Yes, I do agree there are some incoming connections to the database on the primary (even before adding it to the AG). Finally, I took a fresh log backup on the primary, restored it with no recovery on the secondary, and then performed the JOIN only to the AG. This way, I was able to add the database to the AG.

Now, my question is: is there any difference between restoring a differential backup and a log backup in this scenario?
I only went for the differential to avoid copying/restoring so many log backups. I have observed this more than twice in our environment, and that is the main reason I want to know more about it. The LOG BACKUP restore always worked.

Thank you
Sam
