VMware vCenter Server High Availability (VCHA)

Recently, I had a good discussion with a colleague about vCenter Server High Availability (VCHA) and the areas where it could actually be of use.

Before we start, I would like to summarize a few questions that I am often asked in the course of my work, as well as when I teach the vSphere Install, Configure and Manage course.

What is VCHA used for?
It is really meant for local site availability: cases where the loss of vCenter Server creates an outage for other management components and the vSphere HA RTO is not sufficient, or where vSphere HA is not able to bring vCenter Server back up.

Can VCHA be used in a stretched cluster setup, and how should we plan the node placement?
Yes. A stretched setup always has two sites. If you would like to implement VCHA in such a setup, you will need a third site or, at minimum, a separate cluster at the passive node's site. Typically you will have the active node at site 1, and the passive node and witness node at site 2, with the witness node on a separate cluster at minimum, or at a third site, which is recommended. If the active node loses its connection to the witness node, the passive node is promoted to active.
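
To make the placement rule easier to reason about, here is a minimal Python sketch of it. It only encodes my reading of the guidance above and does not talk to vCenter in any way; the names Placement and check_placement, and the returned strings, are hypothetical and made up for illustration.

```python
from typing import NamedTuple


class Placement(NamedTuple):
    """Where a VCHA node lives. Both fields are free-form labels (hypothetical)."""
    site: str
    cluster: str


def check_placement(active: Placement, passive: Placement, witness: Placement) -> str:
    """Classify a stretched-cluster VCHA layout using the rule described in this post."""
    # Active and passive are expected at different sites in a stretched setup.
    if active.site == passive.site:
        return "invalid: active and passive nodes share a site"
    # Witness at a third site is the recommended layout.
    if witness.site not in (active.site, passive.site):
        return "recommended: witness node at a third site"
    # Minimum layout: witness at the passive site, but on a separate cluster.
    if witness.site == passive.site and witness.cluster != passive.cluster:
        return "minimum: witness node on a separate cluster at the passive site"
    return "invalid: witness node needs a third site or a separate cluster at the passive site"


if __name__ == "__main__":
    print(check_placement(
        Placement("site1", "mgmt-a"),
        Placement("site2", "mgmt-b"),
        Placement("site2", "edge-b"),
    ))
```

The example call describes the typical two-site layout, with the witness on its own cluster at site 2.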

What are the requirements for VCHA?
VCHA uses continuous replication, which means its network requirements are fairly demanding: a minimum of 1 Gbps of bandwidth, with an RTT of no more than 10 ms. You will also need a separate VCHA network between the active, passive and witness nodes. It must not be part of the management network and must not have a default gateway. Reference: 1, 2
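
As a quick planning aid, the requirements above can be written down as a small checklist in Python. This is only a sketch under my own assumptions: the VchaNetwork class, its field names and the vcha_network_problems function are hypothetical, you supply the measured values yourself, and I treat 10 ms as an upper bound on RTT.

```python
from dataclasses import dataclass
from typing import List

# Thresholds taken from the requirements described in this post.
MIN_BANDWIDTH_GBPS = 1.0
MAX_RTT_MS = 10.0  # assumption: 10 ms is treated as the upper bound


@dataclass
class VchaNetwork:
    """Measured or planned properties of the VCHA network (hypothetical model)."""
    bandwidth_gbps: float
    rtt_ms: float
    shares_management_network: bool
    has_default_gateway: bool


def vcha_network_problems(net: VchaNetwork) -> List[str]:
    """Return a list of requirement violations; an empty list means all checks pass."""
    problems = []
    if net.bandwidth_gbps < MIN_BANDWIDTH_GBPS:
        problems.append("bandwidth is below 1 Gbps")
    if net.rtt_ms > MAX_RTT_MS:
        problems.append("RTT is above 10 ms")
    if net.shares_management_network:
        problems.append("VCHA network must be separate from the management network")
    if net.has_default_gateway:
        problems.append("VCHA network must not have a default gateway")
    return problems


if __name__ == "__main__":
    planned = VchaNetwork(bandwidth_gbps=10, rtt_ms=2.5,
                          shares_management_network=False,
                          has_default_gateway=False)
    print(vcha_network_problems(planned) or "all checks passed")
```

Returning a list of problems rather than a single True or False makes it easier to see everything that needs fixing in one pass.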

So what are the failure scenarios? Reference: 1

  1. The active node is isolated from the witness node.
    Failover happens. The passive node becomes the active node. VCHA stays in a degraded state until a passive node has resumed.
  2. The passive node is isolated from the witness node.
    VCHA is in a degraded state. No failover.
  3. The witness node is isolated from both the active and passive nodes.
    VCHA is in a degraded state. No failover.
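
The three scenarios can also be written as a tiny decision function. This is just a sketch that mirrors the list above, not VMware's actual failover logic; Outcome and vcha_outcome are hypothetical names, and the two flags describe whether each node can still reach the witness.

```python
from enum import Enum


class Outcome(Enum):
    HEALTHY = "healthy, no failover"
    FAILOVER = "failover, then degraded until a passive node has resumed"
    DEGRADED = "degraded, no failover"


def vcha_outcome(active_sees_witness: bool, passive_sees_witness: bool) -> Outcome:
    """Map witness reachability to the outcome listed in this post."""
    if active_sees_witness and passive_sees_witness:
        return Outcome.HEALTHY
    # Scenario 1: only the active node has lost the witness.
    if not active_sees_witness and passive_sees_witness:
        return Outcome.FAILOVER
    # Scenario 2 (only the passive node lost the witness) and
    # scenario 3 (the witness is isolated from both nodes).
    return Outcome.DEGRADED


if __name__ == "__main__":
    for flags in [(True, True), (False, True), (True, False), (False, False)]:
        print(flags, "->", vcha_outcome(*flags).value)
```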

What configurations are possible in a VCHA setup?
You can refer to this doc.

How can we monitor VCHA?
You can set up the alarms that are available, such as the one for when the database changes from sync to async. Refer to doc.


VCHA is used less and less now that we have vSphere HA. Where VCHA remains highly useful is in protecting against a component failure from which vCenter Server cannot be recovered. One example is guarding against storage failure by placing the active node on one storage and the passive node on another. If the storage for the active node fails, the passive node ensures vCenter Server stays available.

If you are using vSphere HA together with VCHA, you are doubly protecting vCenter Server, and this is definitely recommended. In fact, when using vSphere HA, you should always give vCenter Server a high restart priority, so that if resources are constrained, you have a higher probability of bringing vCenter Server up. Also consider upgrading to vSphere 7.0 Update 1, or at least vCenter Server 7.0 Update 1 (with ESXi hosts on a version supported by vCenter Server 7.0 Update 1), where the DRS service and vSphere HA are kept available via vCLS when vCenter Server is not present. More from KB.

However, if your RPO is not that critical a requirement in a failure, then a typical vCenter Server native backup can come into the picture: deploy a new vCenter Server and restore from the backup set. Note that this does not restore the vDS configuration. If you are using vDS, you will also need to back up its configuration and restore it after restoring from the vCenter Server backup set. You can also use a PowerShell script to do a scheduled backup, as shown here.
