Tuesday, 12 January 2016

DAG: Doesn’t match the configured witness server

If there are problems with a Database Availability Group (DAG) witness server in Exchange 2013 or 2016, you may see the warning below, which suggests that the DAG’s actual configuration doesn’t match what’s stored in Active Directory:

WARNING: The witness server and directory currently in use by database availability group 'DAG01' doesn't match the configured primary or alternate witness server. This may be due to Active Directory replication latency. If this condition persists, please use the Set-DatabaseAvailabilityGroup cmdlet to correct the configuration.


So, let’s go ahead and troubleshoot this. 

Checking the Directory Services logs, DCDIAG output and so on confirms that we have no AD replication issues. I also used the -DomainController parameter to check the DAG configuration against different domain controllers. This found no issues: all domain controllers hold the same configuration.
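
As a rough sketch of that per-domain-controller check, something like the following could be run from the Exchange Management Shell. The domain controller names LITDC01 and LITDC02 are hypothetical placeholders, not taken from this environment, so substitute your own:

```powershell
# Hypothetical domain controller names (assumptions, replace with real ones)
$domainControllers = 'LITDC01.litwareinc.com', 'LITDC02.litwareinc.com'

foreach ($dc in $domainControllers) {
    # Read the DAG object from one specific domain controller
    Get-DatabaseAvailabilityGroup -Identity DAG01 -DomainController $dc |
        Format-List Name, WitnessServer, WitnessDirectory,
            @{ Name = 'QueriedDC'; Expression = { $dc } }
}
```

If the WitnessServer and WitnessDirectory values are identical for every domain controller queried, replication latency is unlikely to be the cause of the warning.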

Configure witness server

The next thing to try is to reconfigure the witness server for the DAG. However, this now produces a different error: the device is not ready.

Set-DatabaseAvailabilityGroup -Identity DAG01 -WitnessServer LITFS01 -WitnessDirectory C:\DAG01-Witness


The full error is below:

There was a problem changing the quorum model for database availability group DAG01. Error: An error occurred while
attempting a cluster operation. Error: Cluster API failed:
"ClusterResourceControl(controlcode=CLUSCTL_RESOURCE_SET_PRIVATE_PROPERTIES) failed with 0x15. Error: The device is
not ready"
    + CategoryInfo          : InvalidArgument: (:) [Set-DatabaseAvailabilityGroup], DagTaskProblemChangingQuorumExcept
    + FullyQualifiedErrorId : [Server=LITEX01,RequestId=51d9f30f-3816-4ead-aa8e-1553b307f010,TimeStamp=05/01/2016 20:0
   9:31] [FailureCategory=Cmdlet-DagTaskProblemChangingQuorumException] E4283BF4,Microsoft.Exchange.Management.System
    + PSComputerName        : litex01.litwareinc.com

So that’s not a very descriptive error. Our DAG is online, all servers are responding, the cluster service is started on every node in the DAG, and all required ports are open on our firewalls.
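
One possible way to confirm the node and cluster service state from a DAG member is sketched below. It assumes Windows PowerShell with the FailoverClusters module available (the -ComputerName parameter of Get-Service does not exist in PowerShell 6 and later):

```powershell
Import-Module FailoverClusters

# List every cluster node and whether the cluster sees it as Up
Get-ClusterNode | Format-Table Name, State

# Check the Cluster service (ClusSvc) on each node
Get-ClusterNode |
    ForEach-Object { Get-Service -Name ClusSvc -ComputerName $_.Name } |
    Format-Table MachineName, Status
```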

Check DAG quorum

The next thing to check is the cluster quorum model. To do this, run the command below:

Get-ClusterQuorum | Fl *


Now this is interesting, because the quorum model shown is wrong. We have a two-node DAG with a witness server, so we should be using the Node and File Share Majority quorum model, as configured automatically by the new Database Availability Group wizard.

Validate DAG Cluster

For more information on how to validate a DAG cluster, see here. When we validate the cluster, the report shows that there are no resources in the Cluster Group:

The group does not contain any resources


We also see that there is no witness configured:

The cluster is not configured with a quorum witness. As a best practice, configure a quorum witness to help achieve the highest availability of the cluster.
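
For reference, a validation run of this kind might be started from PowerShell on a DAG member roughly as follows. The choice of test categories is an assumption made for this sketch (storage tests are left out because a DAG does not use shared cluster storage), not necessarily what was run here:

```powershell
Import-Module FailoverClusters

# Validate the local cluster, limited to selected test categories.
# The category selection below is an assumption for illustration.
Test-Cluster -Include 'Inventory', 'Network', 'System Configuration', 'Cluster Configuration'
```

Test-Cluster writes a validation report file, and the two messages quoted above would be found in that report.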


Check File Share Witness

We’ll use the Set-ClusterQuorum cmdlet to reset the quorum model to Node and File Share Majority. We don’t expect this to create the witness share for us, though, so let’s first check the file share witness settings and correct them if needed:

  • Check that the folder C:\DAG01-Witness is present on LITFS01
  • Check that the folder is shared as <DAGName>.<domain name> (DAG01.litwareinc.com in our case)
  • Check that the Exchange Trusted Subsystem has full share permissions
  • Check that the Administrators group has full control NTFS permissions on the folder
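
The checks in this list could be scripted along these lines when run locally on LITFS01. This is only a sketch, and it assumes the SMB share cmdlets are available (Windows Server 2012 or later):

```powershell
$folder = 'C:\DAG01-Witness'
$share  = 'DAG01.litwareinc.com'

# 1. Is the witness folder present?
Test-Path -Path $folder

# 2. Is the folder shared under the expected name?
Get-SmbShare -Name $share | Format-List Name, Path

# 3. Share permissions: look for Exchange Trusted Subsystem with Full access
Get-SmbShareAccess -Name $share |
    Format-Table AccountName, AccessControlType, AccessRight

# 4. NTFS permissions: look for Administrators with FullControl
(Get-Acl -Path $folder).Access |
    Format-Table IdentityReference, FileSystemRights, AccessControlType
```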

Use Set-ClusterQuorum to change the quorum model

Once done, we can use Set-ClusterQuorum to change the cluster quorum model. We must specify the witness UNC path when we do this:

Set-ClusterQuorum -NodeAndFileShareMajority \\litfs01\DAG01.litwareinc.com


Use Set-DatabaseAvailabilityGroup to set the witness server

Run the command below to set our DAG, DAG01, to use LITFS01 as the witness server and the folder C:\DAG01-Witness on that server as the witness directory:

Set-DatabaseAvailabilityGroup -Identity DAG01 -WitnessServer LITFS01 -WitnessDirectory C:\DAG01-Witness


Confirm DAG Witness and Quorum Configuration

We can now check whether our issues are resolved by re-validating the cluster. The report shows no further issues with the Cluster Group, which is now online and contains valid resources.


Our quorum configuration is now valid: we have three online voters in our cluster, including the file share witness.
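
The same state can also be inspected from PowerShell on a DAG member. A minimal sketch, assuming the FailoverClusters module is available:

```powershell
Import-Module FailoverClusters

# Resources in the core Cluster Group, including any File Share Witness resource
Get-ClusterGroup -Name 'Cluster Group' |
    Get-ClusterResource |
    Format-Table Name, State, ResourceType

# Node state: both DAG members should be listed as Up
Get-ClusterNode | Format-Table Name, State
```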


Confirm DAG Witness server settings

To do this, use the Get-DatabaseAvailabilityGroup cmdlet as below:

Get-DatabaseAvailabilityGroup -Status | fl *wit*


We now see that there are no further warnings.

We can also confirm our cluster quorum settings:

Get-ClusterQuorum | fl *


Our quorum model is now Node and File Share Majority as it should be.

We’ve confirmed that the DAG configuration and cluster configuration are as expected for a two-node DAG, so we can mark this issue as resolved.
