SQL Server Agent can't start
Weird C# error in clustering
I am using SQL Server 2012 SE clustering on Hyper-V servers (virtual machines). SQL Server 2012 and earlier Standard Editions support only 2-node clustering. I noticed a strange issue on one node: after a failover to it, the applications throw a weird error, and after failing back they work fine.
The error on the web application page after performing a failover is:
The client and server cannot communicate, because they do not possess a common algorithm.
I am able to connect to SQL Server on this node from SSMS and run T-SQL statements, and when I run Profiler I see connections from the web servers as expected. I tried matching the TLS settings with the node where it never errors out, but still no luck. This happens only on this node. The other node, which is currently active, doesn't have any issues.
I looked into the SQL Server error log and the cluster event log and noticed nothing unusual.
So we want to evict that node, create a new one, and add it to the cluster. Before proceeding further, I would like to know if anyone has faced a similar issue?
Thanks
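One thing that could be compared between the two nodes is how the web server connections actually arrive. This is only a minimal sketch, assuming it is run on whichever node currently hosts the instance. It reads the standard sys.dm_exec_connections and sys.dm_exec_sessions views and does not show the negotiated TLS version, since the enabled protocols are a Windows SCHANNEL setting rather than something SQL Server stores:

```sql
-- List user connections with their transport, encryption and authentication details.
-- Run once while each node is active and compare the rows for the web servers.
SELECT c.session_id,
       s.host_name,
       s.program_name,
       c.client_net_address,
       c.net_transport,
       c.protocol_type,
       c.encrypt_option,
       c.auth_scheme
FROM sys.dm_exec_connections AS c
JOIN sys.dm_exec_sessions AS s
     ON s.session_id = c.session_id
WHERE s.is_user_process = 1
ORDER BY s.host_name, c.session_id;
```

If encrypt_option differs between the two nodes for the same web server, that would point at the encryption configuration on the problem node rather than at the cluster itself.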
Cannot remove database from secondary replica
Hi,
We had a 3-node SQL Server AlwaysOn availability group; all nodes are on version 12.0.5532.0.
A: Primary
B: Secondary -- Async
C: Secondary --Async
Recently two big databases went into suspended status on server B (which was in sync mode). One database (database2) has accumulated about 2 TB of transaction log; resuming it would take days and it seems it would never catch up. We decided to remove the database from the AG, shrink the log, and then add it back.
I ran the following on node A:
ALTER AVAILABILITY GROUP [AG]
REMOVE DATABASE [database2];
Nodes A and C look good: the database was removed from the AG, database2 on node C is in restoring status, and on node A it is in normal status.
However, on node B it is in "not syncing/in recovery" status. I checked the SQL Server availability group, the error log, and the availability group health events, and there is no entry regarding this database. I also checked sys.databases, which shows that the replica_id of database2 is not null, and Task Manager shows heavy reads/writes going on.
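The replica_id check described above could look roughly like the following. It is a sketch only, assuming it is run on node B and that the database name is database2 as in the post; all columns come from the standard sys.databases and sys.dm_hadr_database_replica_states views:

```sql
-- Run on the secondary that still shows the database (node B in this post).
-- A non-NULL replica_id means the instance still treats the database as joined to an AG.
SELECT d.name,
       d.state_desc,
       d.replica_id,
       d.group_database_id,
       drs.synchronization_state_desc,
       drs.is_suspended,
       drs.suspend_reason_desc,
       drs.redo_queue_size,   -- KB of log not yet redone on this replica
       drs.redo_rate          -- KB/sec
FROM sys.databases AS d
LEFT JOIN sys.dm_hadr_database_replica_states AS drs
       ON drs.database_id = d.database_id
      AND drs.is_local = 1
WHERE d.name = N'database2';
```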
I tried to run:
ALTER DATABASE [database2] SET HADR OFF;
GO
but I know it won't help. I tried to set it to single-user mode, but could not get a lock on that database.
Then I restarted the instance on node B; database2 went into recovery status, which would still take days to complete.
Lastly, I had to kill sqlserver.exe, remove database2's data and log files, restart the SQL Server instance, and drop the database. That gets my work done, and I will add the database back to the AG later...
My question is: I had suspended the database or removed it from the availability group, so why does it still stick on one instance and keep doing its own work?
I also found that after restarting the SQL Server instance, I cannot expand "Databases" in SSMS, which gives me the following message:
"
Skipping the default startup of database 'database1' because the database belongs to an availability group (Group ID: 65539). The database will be started by the availability group. This is an informational message only. No user action is required.
"
That is fine; I am not worried about node B now, but I am wondering how an AG behaves in such a situation.
How to understand HADR_AR_CRITICAL_SECTION_ENTRY
Hi,
I noticed many times today that if two large databases are in recovery status (trying to catch up in an availability group), the other databases that need recovery will wait.
The full story is that I am fixing a broken availability group node. Initially one large database was doing recovery. I fixed several other small databases (removed them from the AG, applied log backups, and added them back to the AG), then I started on another large database, so in total there were two large databases doing recovery. I then tried to start the third one, a small database, with "alter database db3 set hadr off", and it just hung there, suspended on wait type HADR_AR_CRITICAL_SECTION_ENTRY:
"
Occurs when an Always On DDL statement or Windows Server Failover Clustering command is waiting for exclusive read/write access to the runtime state of the local replica of the associated availability group.
Applies to: SQL Server 2012 through SQL Server 2016.
"
I am not sure how the logic works in the background. Can anyone shed some light? Thanks!
Here are the server/version details:
Microsoft SQL Server 2014 (SP2) (KB3171021) - 12.0.5000.0 (X64)
50GB memory and 12 cores assigned to this instance.
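To see what the hung statement is waiting behind, the currently running requests could be inspected along these lines. This is a rough sketch using only sys.dm_exec_requests. It lists every request waiting on a HADR wait type, not just HADR_AR_CRITICAL_SECTION_ENTRY, on the assumption that the other waits help show what the replica is busy with:

```sql
-- Show requests currently waiting on any HADR_* wait type,
-- longest wait first, together with the session blocking them (if any).
SELECT r.session_id,
       r.command,
       r.status,
       r.wait_type,
       r.wait_time AS wait_time_ms,
       r.blocking_session_id,
       DB_NAME(r.database_id) AS database_name
FROM sys.dm_exec_requests AS r
WHERE r.wait_type LIKE N'HADR[_]%'
ORDER BY r.wait_time DESC;
```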
Windows Server 2016 hyper-converged architecture and SQL HA?
Hi,
With Windows Server 2016, Microsoft promotes the hyper-converged architecture.
I want to use this approach for the tiny datacenter we have to set up in our office. I want to have 2 physical servers in hyper-converged mode, because we do not need a bigger architecture, but I do want a redundant one.
What about SQL in this case?
Do I have to set up 2 VMs with synchronous AlwaysOn replication, or is the hyper-converged option enough to offer the redundancy (so the VM itself is clustered, with no need for a special SQL setup)?
I'm not able to find information about this setup for SQL: whether it's recommended and supported, and what the limitations are.
For disaster recovery, I plan to use the Azure Site Recovery option. Here too, do I have to make the VM itself available, or do I have to create an asynchronous AlwaysOn setup?
Always On latency problem
Hi, I have two databases and one of them does not sync (high latency) on a specific server. All servers have the same disk, memory, and so on, and all are on the internal network. What can I do to identify why this occurs in this specific case? (It has no workload at present.) It should not be the network, because the first DB has much more data than the one that is lagging. Thank you very much.
(I tried to send a screenshot, but it did not work.)
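A possible starting point for narrowing this down is to compare the send and redo queues of the two databases. The query below is a sketch, assuming it is run on the primary replica so that rows for all replicas are visible; it uses only the standard availability group catalog views and DMVs:

```sql
-- Compare log send and redo backlog per database and replica.
-- A large log_send_queue_size suggests the delay is in shipping log to the secondary;
-- a large redo_queue_size suggests the secondary received the log but is slow to apply it.
SELECT ag.name                  AS ag_name,
       ar.replica_server_name,
       DB_NAME(drs.database_id) AS database_name,
       drs.synchronization_state_desc,
       drs.log_send_queue_size,  -- KB
       drs.log_send_rate,        -- KB/sec
       drs.redo_queue_size,      -- KB
       drs.redo_rate,            -- KB/sec
       drs.last_commit_time
FROM sys.dm_hadr_database_replica_states AS drs
JOIN sys.availability_replicas AS ar
     ON ar.replica_id = drs.replica_id
JOIN sys.availability_groups AS ag
     ON ag.group_id = drs.group_id
ORDER BY database_name, ar.replica_server_name;
```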
Availability Group Question
Currently have a total of six nodes in a single Windows cluster across 3 sites:
- Site 1 has 2 nodes with AG1 configured between them
- Site 2 has 2 nodes with AG2 configured between them
- Site 3 has 2 nodes with AG3 configured between them
They want to add a node in Azure to the Windows cluster and configure three SQL Server instances on it:
- Instance 1 would host a replica for AG1
- Instance 2 would host a replica for AG2
- Instance 3 would host a replica for AG3
Would this configuration work?
thanks
Peter
SQL Server 2016 w/ Windows Server 2016 Hyper-V Replica
SQL 2014 STD - DB mirroring across domains - Error Msg 1418 - network address can not be reached or does not exist
I am trying to set up SQL mirroring between domains on SQL Server 2014 Standard (which we just bought).
I have tried the mirroring wizard, then these guides:
https://technet.microsoft.com/en-us/library/ms191140.aspx
http://www.kendalvandyke.com/2011/12/database-mirroring-in-windows-workgroup.html
(and I tested by dropping the firewall on both servers, to ensure it's not a firewall issue)
But everything ends with the error
Msg 1418, Level 16, State 1, Line 1
The server network address "TCP://XXX" can not be reached or does not exist. Check the network address name and that the ports for the local and remote endpoints are operational.
during the last step of the wizard ("Start Mirroring") or when running the code:
/* Execute this against the Principal Instance. */
ALTER DATABASE MirrorDB SET PARTNER = 'TCP://<<your mirror server name here>>:5023'
GO
I also tried running the SQL services as the local accounts, as used for the mirror services
Please advise
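Since error 1418 concerns the endpoints, it may help to compare the mirroring endpoint definition on both instances. The following is a sketch, to be run on the principal and on the mirror, using the standard sys.database_mirroring_endpoints and sys.tcp_endpoints views. Whether the two domains trust each other is an assumption the query cannot check, but connection_auth_desc shows whether the endpoint expects Windows or certificate authentication:

```sql
-- Run on both partners and compare: the endpoint must be STARTED,
-- listen on the port used in SET PARTNER, and use matching authentication and encryption.
SELECT e.name,
       e.state_desc,
       e.role_desc,
       e.connection_auth_desc,
       e.is_encryption_enabled,
       e.encryption_algorithm_desc,
       t.port,
       t.ip_address
FROM sys.database_mirroring_endpoints AS e
JOIN sys.tcp_endpoints AS t
     ON t.endpoint_id = e.endpoint_id;
```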
.tuf file in log shipping
Why is the .tuf file created only when the secondary server database is in standby mode,
and why is it not created when the secondary server database is in restoring mode?
May I know the specific reason?
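The difference can be seen in the two restore options themselves. This is a sketch with hypothetical database and file names (LogShipDB and the D:\LSCopy paths are placeholders, not taken from the post). With NORECOVERY no undo phase runs, so nothing needs to be saved. With STANDBY, uncommitted transactions are rolled back to make the database readable, and that undo information is written to the standby file so it can be reapplied before the next log restore:

```sql
-- Restoring mode: the database stays unreadable, no undo file is written.
RESTORE LOG [LogShipDB]
FROM DISK = N'D:\LSCopy\LogShipDB_1.trn'      -- hypothetical path
WITH NORECOVERY;

-- Standby mode: the database becomes read-only, and the rolled-back work
-- is kept in the undo file named here (log shipping uses a .tuf file for this).
RESTORE LOG [LogShipDB]
FROM DISK = N'D:\LSCopy\LogShipDB_2.trn'      -- hypothetical path
WITH STANDBY = N'D:\LSCopy\LogShipDB.tuf';    -- hypothetical path
```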
Move SQL Server Failover nodes with min down time
Hi, we have a few SQL Server 2012 failover cluster nodes that need to be moved to another data center. What would be the best way to do this without bringing down the system for too long?
One of the setups is:
-VMware servers
-Windows 2012
-Windows clustering
-2 nodes (active/passive)
-SQL Server 2012 ENT
-Used for SharePoint (150+ databases)
-EMC storage
SQL Server 2016: Always On feature does not sync user logins and jobs, and shows both primary and secondary databases
Hi guys,
I want to ask you a few questions about the SQL Server 2016 Always On feature. First of all, let me introduce my environment.
1. VM Node A
- OS : Windows Server 2016
- SQL Server 2016 Enterprise
- IP : 10.xxx.xxx.141
2. VM Node B
- OS : Windows Server 2016
- SQL Server 2016 Enterprise
- IP : 10.xxx.xxx.142
Let's say I have 20 databases and I want something like load balancing: for example, the first 10 databases are primary on Node A and secondary on Node B, BUT the last 10 databases are primary on Node B and secondary on Node A.
The easy way is to have all 20 databases primary on Node A, which would need just one availability group and one listener. BUT because all databases in one availability group must have their primary on the same node (Node A or Node B), we can't have 10 databases in that group primary on Node A and the other 10 primary on Node B, as far as I know.
In my case, I have created 2 availability groups: Availability Group A contains the first 10 databases and one listener (10.xxx.xxx.147), and Availability Group B contains the last 10 databases and one listener (10.xxx.xxx.148). The two listeners cannot use the same IP address, which is why I used different IP addresses.
The problems are:
- User logins and jobs aren't synchronized between Node A and Node B. How can I make them synchronize?
- If we log in using USERLOGIN, both nodes show the same list of databases, such as databasename (Synchronized), and we don't know which database is in Availability Group A or Availability Group B. We only find out when we expand the database and get an error like NOT ACCESSIBLE. How can I make the instance's database list show just the databases that are primary on that node and hide those that are secondary? And if user A is granted access to only one database, is it possible for SSMS to show only that one database when user A logs in?
yoga
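For telling which databases are primary on the node you are connected to, without expanding each one in SSMS, a query like this might help. It is a sketch using the standard availability group catalog views and needs permission to view server state; it does not change what Object Explorer displays:

```sql
-- For each AG database, show the role this instance currently holds (PRIMARY or SECONDARY).
SELECT ag.name           AS ag_name,
       adc.database_name,
       ars.role_desc     AS local_role
FROM sys.availability_groups AS ag
JOIN sys.availability_databases_cluster AS adc
     ON adc.group_id = ag.group_id
JOIN sys.dm_hadr_availability_replica_states AS ars
     ON ars.group_id = ag.group_id
    AND ars.is_local = 1
ORDER BY ag.name, adc.database_name;
```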
Private Network After Configuring Cluster
Hi all,
I have installed SQL Server 2016 in cluster mode, but we did not have a private network configured beforehand.
Is it necessary? If yes, should I manually instruct the cluster service to use it for the heartbeat?
Many thanks in advance!
Always On Clustering - Setup and Best Practice
Hi
I am very new to using Always On; currently we are using mirroring between our SQL servers.
One thing I'm having trouble getting my head around is the clustering of the servers, especially as our SQL servers are VMs.
My current setup has 3 Hyper-V hosts clustered for HA, and the SQL VMs are sitting on these.
My question is, does Always On require a cluster setup on the SQL machines themselves? And if so, will this cause any issues with my HA cluster on the host machines?
Pete
How to implement transactional replication on Amazon RDS.
Currently we are using SQL Server 2008 R2 on a standalone machine, but now we are migrating to Amazon RDS and are facing issues while implementing SQL Server transactional replication. Please suggest any alternative for this.
Trace Flag 9567
As we know, if we enable Trace Flag 9567 in SQL Server 2016 with automatic seeding, the data stream is compressed, which really speeds up the streaming process.
But at the same time it increases the CPU load on the server. So what is the advantage of enabling Trace Flag 9567?
If CPU utilization is higher, we will definitely take a performance hit.
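One way to weigh the trade-off is to enable the flag only for the duration of a seeding operation and watch the transfer rate. This is a sketch, assuming it is run on the primary replica of a SQL Server 2016 instance while automatic seeding is in progress:

```sql
-- Enable compression of the automatic seeding stream globally, then confirm it is on.
DBCC TRACEON (9567, -1);
DBCC TRACESTATUS (9567, -1);

-- While seeding runs, check whether compression is in effect and how fast data moves.
SELECT local_database_name,
       role_desc,
       internal_state_desc,
       is_compression_enabled,
       transfer_rate_bytes_per_second,
       transferred_size_bytes,
       database_size_bytes,
       estimate_time_complete_utc
FROM sys.dm_hadr_physical_seeding_stats;

-- Turn the flag off again once seeding has finished, so the CPU cost is only paid during seeding.
DBCC TRACEOFF (9567, -1);
```

Comparing transfer_rate_bytes_per_second and CPU usage with and without the flag on a test system would show whether the shorter seeding time is worth the extra CPU in a given environment.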
SQL 2008 R2 service won't start on cluster node
Hello. I have SQL Server 2008 R2 running on a 2-node Win 2012 cluster. The SQL role and all resources are running on the active "NODE B". The cluster uses shared NetApp LUNs for data, logs, etc.
I recently applied Windows updates to inactive NODE A. I then tried to Pause NODE B and drain roles over to A. SQL role failed to start on NODE A, so I resumed NODE B and failed back SQL role to NODE B.
System event logs on NODE A indicate a logon failure. I possibly forgot to change the SQL Server service account using Configuration Manager on NODE A during our last password change outage.
As a separate issue, the transaction log for one of my databases on the active node NODE B had grown immense - 600 GB!! - due to our storage engineer not including it in the tran log backup policy. When I failed back to the original active NODE B, I noticed that database was in Recovering mode. Not sure if it was ALREADY in Recovering mode before I tried to failover to NODE A, or AS A RESULT OF the attempted failover.
So, two questions:
1. Am I correct in assuming that service account passwords must be changed in Configuration Manager separately on both cluster nodes? I'm guessing yes.
2. Assuming the database with the 600 GB log was already in Recovering mode, what is the impact on cluster node failover when a database is in that state? Assuming it was NOT already in recovering mode, could failover have caused it to go into recovering mode due to the size of the tran log?
Thanks.
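Before the next failover attempt, the state of each database and of its log could be recorded, so that it is clear afterwards whether a database was already recovering. This is a sketch that uses only sys.databases and DBCC SQLPERF, both available on SQL Server 2008 R2:

```sql
-- Current state of every database and the reason its log cannot be truncated, if any.
SELECT name,
       state_desc,
       recovery_model_desc,
       log_reuse_wait_desc
FROM sys.databases
ORDER BY name;

-- Log file size and percentage in use for every database.
DBCC SQLPERF (LOGSPACE);
```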
SQL Server 2016 Seeding
Hi,
I have enabled automatic seeding in an existing AG. (This is only a test environment.)
I followed the procedure below after creating the database and taking a full backup:
1. On Primary Server
ALTER AVAILABILITY GROUP [AGG1] ADD DATABASE [teatdb];
GO
ALTER AVAILABILITY GROUP [AGG1] GRANT CREATE ANY DATABASE;
GO
2. On Secondary Server:
ALTER AVAILABILITY GROUP [AGG1] GRANT CREATE ANY DATABASE;
GO
On the primary server the database was successfully added to the AG, but on the secondary server I don't see that database, and I didn't find any errors or any other information in the error log.
Am I missing anything? Please help me.
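Two things that could be checked here are whether the secondary replica is really set to automatic seeding and whether a seeding attempt was recorded. The sketch below assumes the AG name AGG1 from the post; SECONDARY_SERVER is a placeholder for the secondary's replica name, and the ALTER statement is left commented out because it changes configuration:

```sql
-- On the primary: confirm the seeding mode of each replica in the AG.
SELECT ar.replica_server_name,
       ar.seeding_mode_desc
FROM sys.availability_replicas AS ar
JOIN sys.availability_groups AS ag
     ON ag.group_id = ar.group_id
WHERE ag.name = N'AGG1';

-- If the secondary shows MANUAL, it could be switched like this (placeholder replica name):
-- ALTER AVAILABILITY GROUP [AGG1]
--     MODIFY REPLICA ON N'SECONDARY_SERVER' WITH (SEEDING_MODE = AUTOMATIC);

-- On either replica: look for seeding attempts and any recorded failure.
SELECT start_time,
       completion_time,
       is_source,
       current_state,
       performed_seeding,
       failure_state_desc,
       error_code,
       number_of_attempts
FROM sys.dm_hadr_automatic_seeding
ORDER BY start_time DESC;
```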
Manually Prepare a Secondary Database for an Availability Group (SQL Server) using copy only backup
Hi,
I need to add a new replica to an existing Always On configuration. As my database size is in the TB range, I want to use the "join only" option to add the new replica. Can I use the copy-only option with the full backup, and then restore logs on top of it, to prepare the secondary database for the AG without disturbing the routine backups?
It is urgent; I would appreciate early help on this.
Humayun
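The sequence being asked about might look roughly as follows. All names and paths here (BigDB, MyAG, the \\backupshare location) are hypothetical placeholders. A copy-only full backup does not reset the differential base or break the log chain, so the routine log backups taken after it can be restored on top of it:

```sql
-- On the primary: full backup that leaves the routine backup sequence untouched.
BACKUP DATABASE [BigDB]
TO DISK = N'\\backupshare\BigDB_copyonly.bak'   -- hypothetical path
WITH COPY_ONLY;

-- On the new secondary: restore the full backup without recovering the database.
RESTORE DATABASE [BigDB]
FROM DISK = N'\\backupshare\BigDB_copyonly.bak'
WITH NORECOVERY;

-- Restore every routine log backup taken after the copy-only full, in order.
RESTORE LOG [BigDB]
FROM DISK = N'\\backupshare\BigDB_log_1.trn'    -- hypothetical path, repeat per log backup
WITH NORECOVERY;

-- After the replica has been added to the AG on the primary, join the database on the secondary.
ALTER DATABASE [BigDB] SET HADR AVAILABILITY GROUP = [MyAG];
```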
Cluster resource in clustered service or application failed
We have 2 nodes SQL 2008 Server 2008 R2 cluster installed on Windows Server 2008 R2 W/SP1 Operating System.
I have the following issue:
I frequently need to bring the cluster resource online manually, and I found the following errors in the cluster events:
Event: 1587
Cluster file server resource 'Resource Name' failed a health check. This was because some of its shared folders were inaccessible. Verify that the folders are accessible from clients. Additionally, confirm the state of the Server service on this cluster node using Server Manager and look for other events related to the Server service on this cluster node.
Event: 1205
The Cluster service failed to bring clustered service or application 'SQL Server (MSSQLSERVER)' completely online or offline. One or more resources may be in a failed state. This may impact the availability of the clustered service or application.
Event: 1069
Cluster resource 'Resource Name' in clustered service or application 'SQL Server (MSSQLSERVER)' failed.
Abduljalil Abolzahab