Draft for deploying a highly available environment with WSFC and AlwaysOn

July 25, 2019, 2:19 am

Hello everybody,

For my bachelor's thesis I'm currently looking for solutions to create highly available environments for software applications.
In total, I assume that I have three applications and a database. I read the articles on WSFC and the AlwaysOn solutions and created the following draft. I would be glad if one of you could tell me whether this solution is feasible, or whether there are any obvious mistakes, because my subject is normally electrical engineering. This work goes a bit deeper into information technology, so I'm counting on your help.

Kind regards from Germany!

Edit: seems like I'm not able to upload pictures yet

↧

Problem running Update-ClusterFunctionalLevel on Server 2019

July 25, 2019, 3:36 am

I have in-place upgraded a 2-node SQL cluster (from Server 2016 Std. to Server 2019 Std.). The whole process worked as expected.

The SQL installation is not an Always On installation; it's just a regular active/passive installation.

Now I want to run Update-ClusterFunctionalLevel, but it is returning the following error:

Update-ClusterFunctionalLevel : Updating the cluster functional level failed.
    The system cannot find the file specified
At line:1 char:1
+ Update-ClusterFunctionalLevel
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : ObjectNotFound: (:) [Update-ClusterFunctionalLevel], ClusterCmdletException
    + FullyQualifiedErrorId : FileNotFound,Microsoft.FailoverClusters.PowerShell.UpdateClusterFunctionalLevelCommand

I see these lines in the cluster log when Update-ClusterFunctionalLevel fails. It seems to be related to a certificate issue, but I can't find the root cause.

0000112c.00001910::2019/07/25-10:09:30.858 INFO  Keyname not present for ClusterPKU2U, generating one
0000112c.00001910::2019/07/25-10:09:30.873 INFO  [Cert] Added new cert of type ClusterPKU2U to the store
0000112c.00001910::2019/07/25-10:09:30.887 WARN  mscs::GumAgent::ExecuteHandlerLocally: (-2147024894)' because of 'NCryptImportKey(certProv, 0, BCRYPT_RSAFULLPRIVATE_BLOB, &nameBufferDesc, certKey.Reference(), (PBYTE)c_data(keyData), (DWORD)keyData.size(), NCRYPT_OVERWRITE_KEY_FLAG | NCRYPT_DO_NOT_FINALIZE_FLAG | NCRYPT_SILENT_FLAG)'
0000112c.00001910::2019/07/25-10:09:30.887 WARN  [DM] Aborting group transaction 47:47:1719467+1
0000112c.00001910::2019/07/25-10:09:30.887 ERR   [CORE] mscs::ClusterCore::VersionUpgradePhaseTwo: (-2147024894)' because of 'Gum handler completed as failed'
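
The numeric code in these log lines can be decoded to see which Windows error it carries. This is a minimal sketch, assuming -2147024894 is a signed 32-bit HRESULT; the helper name decode_hresult is made up for illustration.

```python
def decode_hresult(value: int) -> dict:
    """Split a signed 32-bit HRESULT into its standard bit fields."""
    unsigned = value & 0xFFFFFFFF              # reinterpret as unsigned 32-bit
    return {
        "hex": f"0x{unsigned:08X}",
        "failure": bool(unsigned >> 31),       # severity bit: 1 means failure
        "facility": (unsigned >> 16) & 0x7FF,  # 11-bit facility field
        "code": unsigned & 0xFFFF,             # low 16 bits: the error code
    }

# Value taken from the WARN and ERR lines in the cluster log
result = decode_hresult(-2147024894)
print(result)
```

Facility 7 is FACILITY_WIN32 and code 2 is ERROR_FILE_NOT_FOUND, which matches the "The system cannot find the file specified" text returned by the cmdlet. The decoding alone does not say which file or key is missing.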

I have posted the same question in the "High Availability (Clustering)" group, but they asked me to post it here as well.

↧

Read Scale Availability Groups and Listener

July 26, 2019, 10:16 am

Is it possible to use a listener with a read scale availability group?

I have read on some sites that you cannot use one, but on others that you can.

If so, what IP address do you use for it? I have tried assigning a new IP in the subnet and also using the IP of the VM hosting the primary replica. I can create the listener but cannot connect to it.

I am interested in using the listener only to connect to the primary replica; I am not planning to use read-only routing. The idea of using a read-scale AG is to provide a limited degree of DR. I understand the listener would have to be redeployed if the AG was failed over.

↧

How to set up SQL AG from one region to another region?

July 27, 2019, 2:41 pm

How do we set up/configure a SQL AG from one region to another region?

We have a SQL AG in one region and we want to replicate to another SQL AG in another region.

I know that you cannot stretch a subnet across regions; therefore you cannot place the SQL AG located at the DR site behind the same internal LB (basic SKU). If we have a SQL AG or a single SQL node, how do we add the DR nodes to the same AG as the primary site's SQL AG?

Can we use a basic LB across regions or do we use a standard LB?

dsk

↧

Always-on related wait type

July 28, 2019, 7:26 pm

Are there any good links describing the Always On-related wait types below?

•HADR_SYNC_COMMIT

•HADR_CLUSAPI_CALL

•HADR_LOGCAPTURE_WAIT

•HADR_SYNCHRONIZING_THROTTLE

•WRITELOG

↧

Always-on : Log Cache, Log pool, Log capture

July 28, 2019, 8:33 pm

I am getting a bit confused by the terms log cache, log pool, and log capture.

1. Do these reside in SQL buffer memory?

2. The log cache, as far as I know, should be 60 KB (fixed size). What about the log pool and log capture?

3. If the second replica is down, will the log pool and log capture grow? And will the log send queue performance counter also increase?

↧

EXEC sp_server_diagnostics into better format

July 28, 2019, 8:29 pm

How can I get the XML output of sp_server_diagnostics in a better format?

↧

moving tempdb of FCI to local disk

July 26, 2019, 3:22 pm

I am trying to move the tempdb of a SQL FCI to a local disk, and it fails with the message below. I have done this multiple times before, and it has always worked for FCIs on both physical and virtual machines.

This one is a VMware VM running Windows Server 2016 and SQL Server 2017. The local disk is defined on the VM host and presented to the VM as a local disk.

So the question is: why does the engine need this disk to be shared, or rather, where does it get this information, since the disk is indeed a local disk, which is the whole point of using it for tempdb in this case?

Msg 5184, Level 16, State 1, Line 2
Cannot use file 'T:\TempDB\Inst1\tempdb01.mdf' for clustered server. Only formatted files on which the cluster resource of the server has a dependency can be used. Either the disk resource containing the file is not present in the cluster group or the cluster resource of the Sql Server does not have a dependency on it.

Thanks; I appreciate any help or leads on this one.

↧

AAG FAILOVER

July 28, 2019, 1:22 am

Hi,

If a failover happens in an AAG from primary to secondary, will the secondary remain read-only until we manually change it to read-write?

Thanks

↧

Distributed Availability Group built on AG with one replica ??? Transaction Log Space Reuse Issue ?

July 30, 2019, 3:54 am

Hello

We are thinking of creating a distributed availability group (DAG) that will have AG1 with 2 replicas on one side and AG2 with only one replica (the forwarder) on the other side.

But as I understand it, if an availability group has only one primary/forwarder replica (in our case AG2), the configuration will not allow transaction log truncation and, as a result, reuse of the log space in the log files.
Do I understand this correctly?
If yes, is it possible to avoid the problem without changing the availability group configuration (replica counts)?

Thank You

↧

Problem with Enable Availability Group in Secondary Database (Clustering)

July 30, 2019, 4:31 am

I have an existing cluster with an existing primary server.

I need to add one new secondary server to the existing cluster. I have added the node to the existing cluster.

I have created the mirroring endpoint and modified the availability group to include the secondary server details with the endpoint URL.

I took a full backup of the database on the existing primary server, then a log backup.

I restored the full backup on the secondary server, then restored the log backup on the secondary server.

For some of the databases, I was able to enable the availability group and sync the database from primary to secondary.

But the restore of a very large database gives the issue below.

ALTER DATABASE [WWPGT] SET HADR AVAILABILITY GROUP = RAG01;
GO

Error

The remote copy of database "WWPGT" has not been rolled forward to a point in time that is encompassed in the local copy of the database log.

Sometimes I also get an error while restoring the transaction log backup.

Error

Please help me rectify the error.

The primary and secondary servers run the same edition of SQL Server 2016 with Service Pack 2, but there are minor differences in the build.

Secondary server

================

Microsoft SQL Server 2016 (SP2) (KB4052908) - 13.0.5026.0 (X64)
Mar 18 2018 09:11:49
Copyright (c) Microsoft Corporation
Enterprise Edition (64-bit) on Windows Server 2016 Datacenter 10.0 <X64> (Build 14393: ) (Hypervisor)

Primary server

==============

Microsoft SQL Server 2016 (SP2-CU3) (KB4458871) - 13.0.5216.0 (X64)
Sep 13 2018 22:16:01
Copyright (c) Microsoft Corporation
Enterprise Edition: Core-based Licensing (64-bit) on Windows Server 2016 Datacenter 10.0 <X64> (Build 14393: ) (Hypervisor)
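
The two version banners above can be compared by build number. This is a minimal sketch, assuming the build is the four-part dotted number that follows the dash in the banner; the helper name parse_build is made up for illustration.

```python
import re

def parse_build(banner: str) -> tuple:
    """Extract the four-part build number from a version banner line."""
    match = re.search(r"- (\d+)\.(\d+)\.(\d+)\.(\d+)", banner)
    if match is None:
        raise ValueError("no build number found in banner")
    return tuple(int(part) for part in match.groups())

# First line of each banner, copied from the post
secondary = "Microsoft SQL Server 2016 (SP2) (KB4052908) - 13.0.5026.0 (X64)"
primary = "Microsoft SQL Server 2016 (SP2-CU3) (KB4458871) - 13.0.5216.0 (X64)"

secondary_build = parse_build(secondary)
primary_build = parse_build(primary)

# Tuples compare element by element, so this orders the builds correctly
print(primary_build, secondary_build, primary_build > secondary_build)
```

Here the primary is on the newer build. Whether that difference has anything to do with the restore error is not established by this comparison.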

↧

Always ON uses windows cluster IP to connect DB !

July 30, 2019, 7:08 am

Hi Experts,

We have Always ON configured between two DB servers which are located in two different sites.

Since we don't have common network segments across the WAN, we are using a multi-subnet cluster here. Up to this point, everything is clear to us.

But the application team is establishing connections to the DB using the Windows cluster name, and we are not clear how this is possible. We have checked the DNS alias, SQL alias, and SQL TCP/IP protocol configuration. All seems to be normal.

So kindly advise: in which cases can an application establish a connection to the DB using the Windows cluster name? Just FYI, we manually checked from the SQL side, and SQL does connect using the Windows cluster name.

Regards,

Naren poosa

↧

Automatic Seeding Failures

August 29, 2018, 5:32 am

We are running a two-server Always On HADR system with automatic seeding enabled. Both SQL Servers are Microsoft SQL Server Enterprise (64-bit), build 14.0.3029.16, and both run on Windows Server 2016 Standard (10.0). Assigned memory is 4096 MB to 432128 MB. Disk space is sufficient on both servers, and performance tuning options like "Lock pages in memory" are set.

From time to time the automatic seeding process will not start. In the DMV sys.dm_hadr_automatic_seeding the current_state is CHECK_IF_SEEDING_NEEDED, and after a couple of seconds it changes to FAILED. The failure_state_desc then shows "Seeding Check Message Timeout".

The last database with which we had this problem was just 8 GB. The problem is easily solved by removing the DB from the availability group and adding it again; then it works without problems.

Where can I increase this timeout?

What could delay the check for whether seeding is needed?

Do you need more information?

Thanks for the help.

Kind Regards

Dominic

↧

SQL server 2016 Replication issue - The process could not execute 'sp_replcmds' - MSSQL_REPL20011 and MSSQL_REPL22037

July 30, 2019, 11:58 pm

Hi All,

I am facing an issue with transactional replication. There are no errors from Distributor to Subscriber; the issue seems to be from Publisher to Distributor (not sure). I can see the errors below in Replication Monitor:

The process could not execute 'sp_replcmds' on (Source: MSSQL_REPL, Error number: MSSQL_REPL20011)
Get help: http://help/MSSQL_REPL20011

The statement has been terminated. (Source: MSSQLServer, Error number: 3621)
Get help: http://help/3621

The process could not execute 'sp_replcmds' on . (Source: MSSQL_REPL, Error number: MSSQL_REPL22037)
Get help: http://help/MSSQL_REPL22037

In addition, below is the error in the MSRepl_Errors table:

The process could not execute 'sp_replcmds' on 'PUBLISHER'.
Query timeout expired

Please assist with this issue.

Regards,

SJB,

↧

Do we have any official documentation of newly introduced dpt_entry_lock (in SQL server 2016)

August 4, 2019, 1:14 am

dpt_entry_lock affects the overall synchronization process of Always on AG in SQL 2016.

Regards, Ashif Shaikh

↧

Automating Windows Patching with AlwaysOn Manual Failover w/h Asynchronous

November 1, 2017, 11:53 am

We have an AlwaysOn High Availability Group set for asynchronous/manual failover (it's required by the client). It is SQL Server 2016 running on Windows Server 2012 R2, set up as a 2-node Windows failover cluster. We are required to apply Windows Update patches monthly. As you know, manual failover requires human intervention when following Microsoft's rolling update procedure. We'd like to automate those failover steps with PowerShell, code, batch files, or some combination of these. Do you think this is feasible? Or is it best just to do it manually (using SSMS and Failover Cluster Manager) because there are too many "gotchas"? I don't want to start down the road of writing a program to do this if someone has already been down this road and run into too many problems. Thanks!

↧

SQL database restore got faster, but why?

August 5, 2019, 4:30 pm

I ran into a performance issue with restoring a very large database over the weekend.

A colleague suggested I add the service account (the domain user running SQL Server) to the security policy as shown below.

After a restart of the SQL service, the restore performance improved by more than 5 times.

The question I want to ask is: why did that happen?

Regards, RayPak

↧

Is Always On Failover Cluster Instance considered legacy?

August 6, 2019, 3:17 pm

Is Always On FCI considered legacy technology to be avoided in favor of AG? Or is it still a robust option for HA that will be supported by MS well into the future (unlike the now deprecated mirroring feature)? I realize that the selection of an HADR technology depends on our particular needs, and FCI would suit part of our needs. But we don't want to implement technology that might be deprecated in a few years. We're considering a hybrid approach to HADR -> 2-node FCI (with AG primary replica) + AG secondary replica in Azure. That seems like a pretty conventional approach that will get us all the HADR features we need. The biggest benefit there is the shared storage of the FCI, which avoids the cost of dedicated storage for multiple on-prem AG secondaries.

↧

Change SCCM database to High AG SQL Listener

September 14, 2018, 8:52 pm

I have created an Always On availability group with SQL1 as primary and SQL2 as secondary, in asynchronous mode. SQLLST is the virtual name for the listener.

I tried to change the SCCM DB from SQL1 to SQLLST, but it failed.

Is it because the listener has only one primary replica, on SQL1? I mean, is it possible to point to the listener in my case?

↧

Listener IP does not get updated on all DNS Servers post failover/failback of Multi-Subnet Availability Group

August 6, 2019, 11:53 pm

We have a SQL 2014 2-node multi-subnet AlwaysOn Availability Group with automatic failover configured in our environment. We have set the cluster parameters for the AAG to RegisterAllProvidersIP=0 and HostRecordTTL=5. We have applications and users in both data centers, and also in a third location, connecting to SQL via the listener.

The issue we are experiencing is that after an AAG failover, the listener IP gets updated on the DNS server at the site it failed over to, but the DNS servers at the other sites do not get updated. This means the users at these other sites are unable to connect to the listener. We ran nslookup at the other 2 sites after failover, and the listener name was still pointing to the old IP.

I have checked with our domain admin, and he advised that we have separate DNS servers at each of the 3 sites. The DNS service runs on the domain controllers, which are synced every 15 minutes as part of AD sync. He said the 15-minute AD sync schedule is the standard recommended by MS.
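
The numbers above give a rough upper bound on how long a remote site could keep resolving the old listener IP. This is a back-of-the-envelope sketch, assuming the remote DNS servers only learn of the change through the 15-minute AD sync, that HostRecordTTL is expressed in seconds, and that clients cache the record for its full TTL; the names are illustrative.

```python
AD_SYNC_INTERVAL_MIN = 15   # AD sync interval quoted by the domain admin
HOST_RECORD_TTL_SEC = 5     # HostRecordTTL set on the listener resource

def worst_case_stale_seconds(sync_interval_min: int, ttl_sec: int) -> int:
    """Upper bound: one full sync interval, then one more TTL of client caching."""
    return sync_interval_min * 60 + ttl_sec

stale = worst_case_stale_seconds(AD_SYNC_INTERVAL_MIN, HOST_RECORD_TTL_SEC)
print(f"worst case: {stale} s (about {stale / 60:.1f} min)")
```

Under these assumptions the short TTL contributes almost nothing; the sync interval dominates the delay seen at the other sites.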

Has anyone experienced this issue, and is there any additional configuration/setup we need to do? I read in an article or on some forum that someone recommended creating a job that executes a script to manually update DNS with the required IP on all DNS servers. There were no instructions on how to achieve it.

Any feedback or advice on how to address this issue will be really helpful.

↧