One of the many exciting new features of the VCSA 6.5 appliance is built-in high availability (HA) out of the box. The HA configuration process creates two additional appliances: a “passive” node and a “witness” node. So let’s take a look at how to configure VMware VCSA 6.5 HA and the steps involved.
How to Configure VMware VCSA 6.5 HA
To begin the installation of HA, we highlight the vCenter Server Appliance and navigate to Configuration and then vCenter HA.
***Note*** To set up vCenter HA for VCSA 6.5, the VCSA 6.5 appliance must be managing the host that it runs on. The appliance must see its own VCSA 6.5 VM in its inventory for the process to work. As noted in the comments below, “When the vCenter Server is not self-managed the Advanced workflow can be used. If the vCenter Server that you’re enabled HA on is managed by another vCenter Server and they are part of the same SSO Domain, the Basic workflow can be used as well.”
Here I am simply going to choose the Basic configuration. The Advanced configuration offers more control but requires more manual steps than the Basic configuration. However, at the end of either process you should have the same HA configuration.
With the Basic configuration, the wizard automatically adds the additional virtual NIC that your VCSA appliance needs for HA. All you need to do is set the IP address you want for the HA network interface.
Next, you set up the IP addresses for the Passive node and the Witness node. These don’t have to be in the same subnet or layer 2 network as long as the IPs are routable between them. However, for my testing purposes, I am simply going to use additional IP addresses in the same subnet.
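Before typing the addresses into the wizard, it can help to sanity-check the plan. The following is a minimal sketch using only Python's standard ipaddress module. The function name, the node labels, and the example addresses are hypothetical and not taken from my lab. It only reviews the addressing on paper and does not talk to vCenter.

```python
import ipaddress


def review_ha_addressing(mgmt_cidr, ha_cidrs):
    """Return a list of notes about a planned vCenter HA addressing scheme.

    mgmt_cidr: management interface of the active node, e.g. "192.168.1.10/24"
    ha_cidrs:  dict of node name -> HA interface in CIDR form (hypothetical input)
    """
    notes = []
    mgmt = ipaddress.ip_interface(mgmt_cidr)
    ha = {name: ipaddress.ip_interface(cidr) for name, cidr in ha_cidrs.items()}

    # Every HA interface needs its own address.
    addresses = [iface.ip for iface in ha.values()]
    if len(set(addresses)) != len(addresses):
        notes.append("duplicate HA IP address")

    # An HA address must not reuse the management address.
    if mgmt.ip in addresses:
        notes.append("HA IP collides with the management IP")

    # Sharing the management network works in a lab but is flagged as a warning.
    for name, iface in ha.items():
        if iface.network.overlaps(mgmt.network):
            notes.append(f"{name}: HA network overlaps the management network")

    # Different subnets are fine only if they can route to each other.
    if len({iface.network for iface in ha.values()}) > 1:
        notes.append("HA interfaces span multiple subnets; routing between them is required")

    return notes


if __name__ == "__main__":
    plan = {
        "active": "192.168.1.11/24",
        "passive": "192.168.1.12/24",
        "witness": "192.168.1.13/24",
    }
    for note in review_ha_addressing("192.168.1.10/24", plan):
        print(note)
```

An empty list means the sketch found nothing to flag. It cannot confirm that routing actually works between subnets, so that still needs a real connectivity test.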
In the deployment configuration, you see the high-level configuration details for each node (active, passive, and witness). You can Edit the configuration for the passive and witness nodes directly from this screen. Clicking Edit opens the configuration wizard for that node. As you can see below, I have compatibility warnings because all three nodes are stored on the same datastore. Also, I am using a nested configuration for the test setup, so I have the HA network on the same network as the management network, which raises another warning. The wizard still lets us move past these warnings.
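The datastore warning comes down to a simple grouping check. Here is a small sketch of that idea. The function name, the node labels, and the datastore names are hypothetical, and this is not how the wizard itself performs its compatibility check.

```python
from collections import defaultdict


def shared_datastores(placement):
    """Return datastores that hold more than one vCenter HA node.

    placement: dict of node name -> datastore name (hypothetical input).
    """
    by_datastore = defaultdict(list)
    for node, datastore in placement.items():
        by_datastore[datastore].append(node)
    # Keep only the datastores where two or more nodes would land together.
    return {ds: sorted(nodes) for ds, nodes in by_datastore.items() if len(nodes) > 1}


if __name__ == "__main__":
    lab = {"active": "datastore1", "passive": "datastore1", "witness": "datastore1"}
    for ds, nodes in shared_datastores(lab).items():
        print(f"{ds} holds {', '.join(nodes)}")
```

In a production setup you would want this to return an empty result, since losing a single datastore would otherwise take out more than one node.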
After reviewing the configuration for the nodes, the configuration is ready to complete.
After you hit the Finish button, the tasks kick off in vCenter. You can see the nodes being provisioned.
After a bit, the nodes are provisioned and the Deploy a vCenter HA cluster task completes.
Now you can see on all three nodes we have an Up status and the message tells us that all vCenter HA nodes are accessible and replication is enabled. Automatic failover protection is enabled. This lets us know that the cluster is working as expected.
Simulating a Failure
To simulate a failure, I simply powered off the active node in the vCenter HA configuration. It was neat to see that almost as soon as I could get my command prompt open and start pinging my active VCSA appliance address, it was already responding! The passive node had already taken over on the management interface.
***Note*** It did take a few minutes for the services to come back online, as these have to be started after the failover takes place. Wait a few minutes and you should be able to get back into the Web UI. After I logged back in, I saw the following in the HA configuration. As expected, the now passive node (the original active node is marked as passive once failover happens) is showing Down.
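Instead of refreshing the browser while the services start, you can poll the HTTPS port. This is a minimal sketch using only the standard library. The function name, the default timeout values, and the hostname in the usage example are my own assumptions. An open port only means something is listening, not that the Web UI has finished loading.

```python
import socket
import time


def wait_for_port(host, port=443, timeout=600.0, interval=5.0):
    """Poll until a TCP connection to host:port succeeds or timeout expires.

    Returns True as soon as a connection is accepted, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means the port is listening again.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Refused, unreachable, or timed out: wait and try again.
            time.sleep(interval)
    return False


if __name__ == "__main__":
    # "vcsa.lab.local" is a placeholder for the active VCSA address.
    if wait_for_port("vcsa.lab.local", timeout=30.0):
        print("Port 443 is answering")
    else:
        print("Still waiting for services")
```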
After letting things sit for a bit, I powered the original active node (now the passive node) back on and finally got the Up status on all three. Notice, though, that we still have the message saying a replication failure might be occurring at the moment. This is because replication hasn’t been able to sync back up yet.
After I waited just a few more minutes and refreshed, the statuses all went green with the replication message now stating it is enabled.
I wanted to discuss a few things I ran into while labbing the HA functionality that you may see in the real world and that some have already seen. The first one I saw was this:
The operation is not allowed in the current state.
Failed to get management network information. Verify if management interface (NIC0) is configured correctly and is reachable.
As stated on David Stamen's blog, this had to do with reverse DNS not being set up correctly. I had created the forward lookup records on my Ubuntu BIND server, but the reverse entry had an error in it. Once this was resolved, the wizard moved past this error.
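A quick way to catch this before starting the wizard is to confirm that the forward and reverse lookups agree. The sketch below uses Python's standard socket module. The function name and the example hostname are hypothetical, and the resolver parameters exist only so the check can be exercised without a live DNS server.

```python
import socket


def check_forward_reverse(fqdn, resolve=socket.gethostbyname, reverse=socket.gethostbyaddr):
    """Check that fqdn resolves to an IP whose PTR record points back to fqdn.

    Returns (ip, ptr_name, ok). ptr_name is None when the reverse lookup fails.
    """
    ip = resolve(fqdn)
    try:
        ptr_name, _aliases, _addresses = reverse(ip)
    except OSError:
        # No PTR record, or the reverse zone is broken.
        return ip, None, False
    # Compare case-insensitively and ignore a trailing dot.
    ok = ptr_name.rstrip(".").lower() == fqdn.rstrip(".").lower()
    return ip, ptr_name, ok


if __name__ == "__main__":
    # "vcsa.lab.local" is a placeholder for your appliance's FQDN.
    try:
        ip, ptr, ok = check_forward_reverse("vcsa.lab.local")
        print(f"forward: {ip}, reverse: {ptr}, match: {ok}")
    except OSError as err:
        print(f"forward lookup failed: {err}")
```

If the match is False, fix the PTR record for the appliance before retrying the HA setup.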
Also, I ran into an issue with the HA network between the VMs. I am running the new VCSA 6.5 appliance in a nested environment. Since I am tagging VLANs on the vSwitch of the parent host where the vNICs are connected in the real ESXi environment, the nested environment didn’t like the VLAN tag I was passing between the vSwitches of the nested ESXi install.
I noticed I was getting the following error when setting up HA: it couldn’t connect to the peer node. As I began troubleshooting, I found that the unreachable node was located on a different host, which led me to realize the VLAN issue.
You can manually edit the HA status by pressing the Edit button in the HA configuration screen.
As you can see, you can set Maintenance Mode, Disable vCenter HA, or Remove vCenter HA.
Also, you can manually initiate a failover by pressing the Initiate Failover button. This will force a failover to the peer node.
Configuring VMware VCSA 6.5 HA is not too difficult. There were a few pesky issues in my test environment, but these were mainly due to the nested environment and the networking therein. For the most part, the HA configuration went smoothly once the network issues were sorted out. The failover does what you expect as well. This is definitely an exciting new feature with VCSA 6.5, as we now have a true HA configuration for the critical piece of vSphere: vCenter Server.