Windows Server 2012 R2 with SQL Server 2014 Failover Clustered Instance #Step-By-Step #AlwaysOn AvailabilityGroups What can go Wrong! Part 1

 

There are a lot of good blog post about how to setup your Availability group, in two blog post I will try to break the basic setup and will show you what you should not do in your production environment. Just because you can does not mean I should do this.

This blog post will also show you most common errors and how to fix them and where to find the errors, but in the end you will have a working two node cluster and one Availability group

First how to setup an Availability group  to make things more complex there are multiple instances, see how they look Naming convention is really important when you do complex configurations. an typo is quickly made!

 

SQL Server 2014 Failover Clustered Instance (FCI)

Deploying AlwaysOn Availability Groups requires a Windows Server Failover Clustering (WSFC) cluster. To be enabled for AlwaysOn Availability Groups, an instance of SQL Server must reside on a WSFC node, and the WSFC cluster and node must be online. Furthermore, each availability replica of a given availability group must reside on a different node of the same WSFC cluster. The only exception is that while being migrated to another WSFC cluster, an availability group can temporarily straddle two clusters.

AlwaysOn Availability Groups relies on the Windows Failover Clustering (WSFC) cluster to monitor and manage the current roles of the availability replicas that belong to a given availability group and to determine how a failover event affects the availability replicas. A WSFC resource group is created for every availability group that you create. The WSFC cluster monitors this resource group to evaluate the health of the primary replica.

The quorum for AlwaysOn Availability Groups is based on all nodes in the WSFC cluster regardless of whether a given cluster node hosts any availability replicas. In contrast to database mirroring, there is no witness role in AlwaysOn Availability Groups.

The overall health of a WSFC cluster is determined by the votes of quorum of nodes in the cluster. If the WSFC cluster goes offline because of an unplanned disaster, or due to a persistent hardware or communications failure, manual administrative intervention is required. A Windows Server or WSFC cluster administrator will need to force a quorum and then bring the surviving cluster nodes back online in a non-fault-tolerant configuration.

Primary on an FCI with a replica on a different FCI

Windows Server 2012 R2 Failover Cluster with SQL Server 2014 Failover Clustered Instance (FCI) #Step-By-Step #AlwaysOn Availability Groups image

I have a lot of SQL instances and this all runs on a two node Cluster and not all instances are installed on both nodes to trick the installation and to show you the errors you can expect.

image image imageimage

Enabling the AlwaysOn and you can see the Difference the new AG Wizard is not grayed out any more .

Windows Server 2012 R2 Failover Cluster with SQL Server 2014 Failover Clustered Instance (FCI) #Step-By-Step #AlwaysOn Availability Groupsimageimage

Starting the Wizard  and on system Databases it wil not work AG will only work on your own DB !

image You must make a full backup of your DB before you start ( this is always handy )

Now we can add a new replica Check the Server and as it is a Cluster you can not set it to automatic failover.

image image

 

 

We do setup a Data share for the replication

image

Now that we have completed the wizard we do the validation and go for the finish.

image

An Error ? checking the location ?? eh what should the DB be on the same location ?? not all my SQL server Cluster are the same and are not using all the drive letters. and as I choose to do this on the same cluster ( not supported ) I can not give the other instance the same drive letter. but hat if I had an other cluster and even then I did not have the same Drive letter. Is there a wizard bypass some where.  Wizards are nice If you have a default installation. If not Plan B.

image

TITLE: Microsoft SQL Server Management Studio
——————————

Checking for compatibility of the database file location on the secondary replica resulted in an error. (Microsoft.SqlServer.Management.HadrTasks)

——————————
ADDITIONAL INFORMATION:

The following folder locations do not exist on the server instance that hosts secondary replica MVPSQL201402\SQL2:
i:\MSSQL11.SQL001\MSSQL\DATA;
(Microsoft.SqlServer.Management.HadrTasks)

So placing this on a CSV SQL server 2014

Well if drive letters is an issue SQL Server 2014 can store the DB on a CSV so no more drive letters.

And It passed the Validation that is Great.

Great thinking but.. the CSV is connected to all SQL servers So the next error is logical.

image

Yes the DB is already there.. what now ?

Manually Creating an Availability group for a SQL Server 2014 FCI

This sounds great but where to start ? should I bing It ? Let Me Bing That For You!

Well I Create it with a SQL script ( I’m no SQL master  ) So things can be different.

However you can run this in a SQL CMD but here I do this step by Step.

I have My SQL Availibility group name, My DB name,IP, Servers

 

image

CREATE AVAILABILITY GROUP SQL001AG04
WITH (AUTOMATED_BACKUP_PREFERENCE = SECONDARY)
FOR DATABASE AG04 –, …
REPLICA ON — primary:
N’MVPSQL201401\sql001′ WITH (ENDPOINT_URL = N’TCP://MVPSQL201401.mvp.local:5023′,
FAILOVER_MODE = MANUAL, AVAILABILITY_MODE = ASYNCHRONOUS_COMMIT,
BACKUP_PRIORITY = 50, SECONDARY_ROLE(ALLOW_CONNECTIONS = NO)),
— secondary:
N’MVPSQL201402\SQL2′ WITH (ENDPOINT_URL = N’TCP://MVPSQL201402.mvp.local:5022′,
FAILOVER_MODE = MANUAL, AVAILABILITY_MODE = ASYNCHRONOUS_COMMIT,
BACKUP_PRIORITY = 50, SECONDARY_ROLE(ALLOW_CONNECTIONS = NO));

ALTER AVAILABILITY GROUP SQL001AG04
ADD LISTENER N’MVPLST01′
(WITH IP ((N’10.255.255.69′, N’255.255.255.0′)), PORT=1433);

BACKUP DATABASE AG04 TO DISK = ‘\\mvpfsw01\SQLAG04\AG04.bak’
WITH INIT, COPY_ONLY, COMPRESSION;

BACKUP LOG AG04 TO DISK = ‘\\mvpfsw01\SQLAG04\AG04.trn’
WITH INIT, COMPRESSION;

image

 

 

 

 

 

 

 

Now we go to the replica server and run the script below.

—————- Run this on The Replica Server!!!!!!!

 

ALTER AVAILABILITY GROUP SQL001AG04 JOIN;

RESTORE DATABASE AG04 FROM DISK = ‘\\mvpfsw01\SQLAG04\AG04.bak’
WITH REPLACE, NORECOVERY, NOUNLOAD,
MOVE ‘AG04’ TO ‘E:\MSSQL11.SQL001\AG04.mdf’,
MOVE ‘AG04_log’  TO ‘E:\MSSQL11.SQL001\AG04_log.ldf’;

RESTORE LOG AG04 FROM DISK = ‘\\mvpfsw01\SQLAG04\AG04.trn’
WITH NORECOVERY, NOUNLOAD;

ALTER DATABASE AG04 SET HADR AVAILABILITY GROUP = SQL001AG04

 

But as You can see in the screen shot it is not working the secondary server is down.

the following error is showing :  The connection to the primary replica is not active.  The command cannot be processed.

Message
A connection timeout has occurred while attempting to establish a connection to availability replica ‘MVPSQL201401\sql001′ with id [F82BBD94-4F04-4B0A-8B75-28A0899F240C]. Either a networking or firewall issue exists, or the endpoint address provided for the replica is not the database mirroring endpoint of the host server instance.

 

Ok I did turnoff all the Firewall, checked the network set permissions now what.

Think :  I have two cluster nodes both are using SQL in the script what do they have in common.

ENDPOINT_URL = N’TCP://MVPSQL201401.mvp.local:5022’

So changed it from 5022 to 5023 and it work like a charm

In the next post I will explain how to check this and how to change thisWinking smile

 

image

 

 

 

 

 

 

 

 

SO basically it is better to use the script that the wizard well it depends For now in the demo environment running on two different disks it is better and It would be better if the wizard ask you about drive letters or storage locations.

But manually you have more control about the setup and if something fails you can fixit before you go further.  But also you have to think about a lot of issues Winking smile

Next will be Part 2

More Errors and more fixes on   SQL Server 2014 Failover Clustered Instance (FCI) with Step-By-Step #AlwaysOn Availability Groups #winserv #FCI

Author: Robert Smit [MVP]

Robert Smit is Senior Technical Evangelist and is a current Microsoft MVP in Clustering as of 2009. Robert has over 20 years experience in IT with experience in the educational, health-care and finance industries. Robert’s past IT experience in the trenches of IT gives him the knowledge and insight that allows him to communicate effectively with IT professionals who are trying to address real concerns around business continuity, disaster recovery and regulatory compliance issues. Robert holds the following certifications: MCT - Microsoft Certified Trainer, MCTS - Windows Server Virtualization, MCSE, MCSA and MCPS. He is an active participant in the Microsoft newsgroup community and is currently focused on Hyper-V, Failover Clustering, SQL Server, Azure and all things related to Cloud Computing and Infrastructure Optimalization. Follow Robert on Twitter @ClusterMVP Or follow his blog https://robertsmit.wordpress.com Linkedin Profile Http://nl.linkedin.com/in/robertsmit Robert is also capable of transferring his knowledge to others which is a rare feature in the field of IT. He makes a point of not only solving issues but also of giving on the job training of his colleagues. A customer says " Robert has been a big influence on our technical staff and I have to come to know him as a brilliant specialist concerning Microsoft Products. He was Capable with his in-depth knowledge of Microsoft products to troubleshoot problems and develop our infrastructure to a higher level. I would certainly hire him again in the future. " Details of the Recommendation: "I have been coordinating with Robert implementing a very complex system. Although he was primarily a Microsoft infrastructure specialist; he was able to understand and debug .Net based complext Windows applications and websites. His input to improve performance of applications proved very helpful for the success of our project

16 thoughts on “Windows Server 2012 R2 with SQL Server 2014 Failover Clustered Instance #Step-By-Step #AlwaysOn AvailabilityGroups What can go Wrong! Part 1”

  1. what happens if you have a shared storage and the instance its-self got failed over rather than the database ?

  2. Dear,

    I am getting the following error “The following folder locations do not exist on the server instance that hosts secondary replica” . In my case on primary server, my SQL location is at E drive and on Secondary server its on H drive. but the error showing the wrong path.

  3. Yes this is common best way is to create the path in front. that is the source path must be the same as the destination even the drive letter.
    this is by design. the only way you can change this is creating it by TSQL, it is also on my blog

  4. when you failover what about the sql ids, will they orphan out how do you fix this problem

  5. this is indeed an issue. But with a Clustered FCI you connect always to the same Server. But one important item your app must be ready for alwaysOn. and often there is no automaticly failover but only a manual failover just to control the SQL load.

  6. Hi Robert
    Thanks for a great blog. Have you been experience problems browsing files in a SQL 2014 CSV eg when attaching or backing dbs with the SSMS gui? It trows me an access error even all Security are set correct. I would bet on bug..

    Brgds Stig

  7. Thanks for reply, it is the same on the owner of CSV instance/volumes. I can see that there are others that have the same problems when browsing for doing backup. I will Google more and hope I find a solution…

  8. Hi Robert,
    Thanks for your blog, after changing one of ENDPOINT_URL from 5022 to 5023, the error message “The connection to the primary replica is not active. The command cannot be processed.” is still showed, I do not the reason:(

  9. Hi Chris,
    are you sure the port is free and not used by something else 5024 is also an option but be aware if you created the AG then when you run it for the second time the items are already in place and need to deleted manually. in ht eblog post are all the steps I did. Let me know IF you can fix this or try starting from the beginning delete the AG and al the listeners in the AD

  10. Thanks for your reply:) , I did so, but it seemed that it was not the endpoint_url port issue, because I tried many different ports.
    And I tried to look into SQL Server logs, the useful log was just the same with you 😦
    Actually, there are three nodes in the Cluster and two instances in the AG, one instance is a SQL14 FCI installed on CSV, the other is a SQL14 standalone instance installed on Local disk. when I setup the dry-run environment, it succeeded, i did not encounter such problem, but at that time I install the standalone instance on system drive ( drive C).
    Now, I am setting up the formal environment, and the difference is just that the standalone SQL14 instance need to be installed in Non-system drive (installed on drive D). Then I encountered such problem –“The connection to the primary replica is not active. The command cannot be processed.”
    I did turnoff all the Firewall, tried to different endpoint_url ports, but it still did not work.
    The secondary replica machine could ping the primary replica successfully. Now I want to now how to checked the network set permissions because I did no perform this step.

  11. Keep in mind that the Disk Config needs to be the same on both AlwaysOn machines. If you are using different drive letters then you need to create the AG with TSQL. the script lines are in the blog post.

Leave a Reply

Please log in using one of these methods to post your comment:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.

%d bloggers like this: