Q. Where are the primary nodes placed and how many of them? And what do they do?
A. The first 5 hosts will be designated as primary nodes. The primary nodes maintain/replicate cluster state and initiate failover actions.
Q. When will the primary nodes possibly change?
A. Primary nodes can change when a primary node is disconnected, removed from the cluster, or put in maintenance mode, and whenever the cluster is reconfigured for HA.
Q. How does the communication occur between various nodes?
A. Primary nodes send heartbeats to both primary and secondary nodes, whereas secondary nodes send heartbeats only to the primary nodes. Heartbeats are sent every 1 second.
Q. Is it necessary to change the default heartbeat interval of 1 second at das.failuredetectioninterval?
A. I have not found any reason to change the default heartbeat interval.
Q. Where can we find which hosts are primary nodes?
A. cat /var/log/vmware/aam/aam_config_util_listprimaries.log or /opt/vmware/aam/bin/cli – AAM> ln → also check /var/log/vmware/aam/aam_config_util_addnode.log to see all the steps for adding a host to an HA cluster
Q. Can the primary nodes be set manually using command line? If so is that supported?
A. Yes: /opt/vmware/aam/bin/cli – AAM> promotenode or demotenode. However, this is not supported.
Q. How many active primary nodes (fail-over coordinators) exist, and what is their main function?
A. There is only one active primary node. It is responsible for restarting the VMs on primary and secondary nodes: it decides where to restart VMs, keeps track of failed restart attempts, and determines when it is appropriate to keep trying to restart them. When two hosts fail, it restarts the VMs of the first failed host and then those of the second.
Q. Which primary node becomes the “active primary or fail-over coordinator”? Are there criteria used in the selection?
A. By default, the first primary node becomes the fail-over coordinator; after that, the others are selected on a random basis.
Q. What happens when all the primary nodes go down?
A. At least one primary node must be available at all times for HA to work; if none is available, no HA-initiated restart of VMs will take place. This is why you can configure HA to tolerate at most 4 host failures.
Q. What is Host Monitoring Status?
A. After you create a cluster, enable Host Monitoring Status so that VMware HA can monitor heartbeats sent by the VMware HA agent on each host in the cluster. It enables VMs to be restarted on another host if a host failure occurs, and it is also required for the FT recovery process to work properly. Disabling it disables VMware HA.
Failure detection and Host network isolation
Q. When does a host declare itself as isolated?
A. If a host stops receiving heartbeats from all other hosts in the cluster for more than 12 seconds, it attempts to ping its isolation address and if that also fails, it declares itself as isolated from the network.
Q. When do the other hosts in the cluster treat the isolated host as failed?
A. When the isolated host’s network connection is not restored for 15 seconds or longer, the other hosts in the cluster treat the isolated host as failed and attempt to fail over its VMs.
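The two timers above can be pictured with a small Python sketch. It is only an illustration of the default thresholds quoted in the answers (12 and 15 seconds). The function and constant names are hypothetical, and this is not how the HA agent itself is implemented.

```python
# Illustrative only: default thresholds taken from the answers above.
ISOLATION_CHECK_AFTER_S = 12   # silent host pings its isolation address after this
FAILURE_DECLARED_AFTER_S = 15  # other hosts treat the silent host as failed


def host_view(seconds_without_heartbeats, isolation_address_reachable):
    """What the host that stopped receiving heartbeats concludes about itself."""
    if seconds_without_heartbeats <= ISOLATION_CHECK_AFTER_S:
        return "waiting"
    if isolation_address_reachable:
        return "not isolated"
    return "isolated"


def cluster_view(seconds_without_heartbeats):
    """What the other hosts conclude about the silent host."""
    if seconds_without_heartbeats >= FAILURE_DECLARED_AFTER_S:
        return "failed"
    return "waiting"
```

Note the 3-second gap between the two thresholds: the host decides it is isolated before the rest of the cluster declares it failed.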
Q. What is isolation response?
A. It is the action that HA takes when the heartbeat network is isolated.
Q. What are different isolation response possibilities?
A. There are three: “Power off”, “Shut down” and “Leave powered on”. As of vSphere, the default is “Shut down”.
Q. When to use shutdown / power off / leave powered on options?
A. Shutdown option: VMs that have not yet shut down take longer to fail over while the shutdown completes. VMs that have not shut down within 300 seconds (5 minutes) are powered off; this timeout can be changed with das.isolationshutdowntimeout (in seconds).
HA Admission Control
Q. What is Admission Control?
A. It is to ensure that sufficient resources are available in a cluster to provide failover protection and to ensure that VM resource reservations are respected.
Q. How to enable and disable Admission Control?
A. Enable: “Do not power on VMs that violate availability constraints” → Disable: “Power on VMs that violate availability constraints”; when disabled, no warnings are presented and the cluster doesn’t turn red.
Q. What are different types of Admission Control?
A. Host – the host has sufficient resources to satisfy the reservations of all VMs running on it; Resource Pool – the pool has sufficient resources to satisfy the reservations, shares and limits of all VMs associated with it; VMware HA – sufficient resources in the cluster are reserved for VM recovery in the event of host failure. Only VMware HA admission control can be disabled; the other two cannot. The recommendation is not to disable it, except perhaps temporarily during maintenance or testing.
Q. How many host failures can the “Host Failures Cluster Tolerates” admission control policy tolerate?
A. Default is 1 and the maximum is 4.
1. Number of Hosts that can fail
Q. How does VMware HA perform admission control when a number of host failures is reserved for the admission control policy?
A. Calculate the slot size (a slot is a logical representation of CPU/memory) → determine the number of slots each host in the cluster can hold → determine the Current Failover Capacity of the cluster (that is, the number of hosts that can fail and still leave enough slots to satisfy all of the powered-on VMs in the cluster) → determine whether the Current Failover Capacity is less than the Configured Failover Capacity (provided by the user); if it is less, admission control disallows the operation. This policy avoids resource fragmentation by defining a slot as the maximum virtual machine reservation. This policy tolerates up to 4 host failures. In a heterogeneous cluster, this policy can be too conservative, as it considers only the largest VM reservations when defining the slot size and assumes the largest hosts fail when computing the Current Failover Capacity. When FT is used, the secondary VM is assigned a slot.
Q. How is slot size calculated?
A. CPU → obtain the CPU reservation of each powered-on VM in the cluster and select the largest value; if no VM has a reservation specified, the value defaults to 256 MHz (this can be changed with das.vmcpuminmhz). Memory → obtain the memory reservation plus memory overhead of each powered-on VM and select the largest value; there is no default value for memory.
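As a rough illustration of the slot size rule described above, here is a minimal Python sketch. The dictionary keys and the function name are hypothetical, and the 256 MHz default is the value quoted in the answer.

```python
DEFAULT_CPU_SLOT_MHZ = 256  # used when no powered-on VM has a CPU reservation


def slot_size(powered_on_vms):
    """Return (cpu_mhz, mem_mb) for a list of powered-on VMs.

    Each VM is a dict with hypothetical keys:
    cpu_reservation_mhz, mem_reservation_mb, mem_overhead_mb.
    """
    # CPU component: the largest CPU reservation, or the default if none is set.
    cpu = max((vm["cpu_reservation_mhz"] for vm in powered_on_vms), default=0)
    if cpu == 0:
        cpu = DEFAULT_CPU_SLOT_MHZ
    # Memory component: the largest (reservation + overhead); no default applies.
    mem = max(
        (vm["mem_reservation_mb"] + vm["mem_overhead_mb"] for vm in powered_on_vms),
        default=0,
    )
    return cpu, mem
```

A single VM with a large reservation sets the slot size for the whole cluster, which is why one oversized reservation can shrink the slot count on every host.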
Q. How is Current Failover Capacity calculated?
A. Take the CPU and memory resources contained in each host’s root resource pool (not the host’s physical resources), counting only hosts that are connected (not those in maintenance mode, in standby, or with VMware HA errors) → maximum number of slots each host can support = the host’s CPU resource amount / CPU slot size, and likewise its memory resource amount / memory slot size, each rounded down. The two numbers are compared and the smaller one is the number of slots the host can support. Current Failover Capacity is then the number of hosts that can fail and still leave enough slots to satisfy the requirements of all powered-on virtual machines.
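The calculation above can be sketched in Python as follows. This is a simplified illustration with several assumptions: each powered-on VM needs one slot, the largest hosts are assumed to fail first (as the policy description states), and at least one host must survive. All names are hypothetical.

```python
def slots_per_host(host_cpu_mhz, host_mem_mb, slot_cpu_mhz, slot_mem_mb):
    """Slots a host can hold: the smaller of the CPU and memory slot counts."""
    return min(host_cpu_mhz // slot_cpu_mhz, host_mem_mb // slot_mem_mb)


def current_failover_capacity(host_slots, slots_needed):
    """Number of hosts that can fail while the rest still hold slots_needed slots."""
    remaining = sorted(host_slots)  # ascending, so pop() removes the largest host
    failures = 0
    while len(remaining) > 1:
        remaining.pop()
        if sum(remaining) < slots_needed:
            break
        failures += 1
    return failures
```

For example, four hosts with 16 slots each can lose one host while running 48 VMs, but none while running 64.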
Q. How does this affect the design?
A. If you design for N+1 but large reservations distort the slot size calculation so that there are not enough slots for one host to fail, the failover will not succeed when a host actually fails. Example: a 4-host N+1 cluster in which each host holds 16 slots has 64 slots in total; if 64 VMs are running and one host fails, the remaining 3 hosts offer only 48 slots, so not all VMs can be failed over. To avoid distorting the slot size calculation, you can set an upper bound for the CPU and memory components of the slot size by using the das.slotcpuinmhz and das.slotmeminmb attributes.
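To see why an upper bound on the slot size helps, consider this minimal Python sketch. The host size and reservation figures are made-up numbers for illustration, and capped_slot is a hypothetical helper that models the effect of das.slotcpuinmhz on the CPU component only.

```python
def capped_slot(largest_reservation, upper_bound=None):
    """Slot size component after applying an optional upper bound."""
    if upper_bound is None:
        return largest_reservation
    return min(largest_reservation, upper_bound)


host_cpu_mhz = 32000  # hypothetical host CPU capacity

# One VM with a 4000 MHz reservation, no upper bound configured.
slots_uncapped = host_cpu_mhz // capped_slot(4000)       # 8 slots
# Same VM, with the CPU slot size bounded at 1000 MHz.
slots_capped = host_cpu_mhz // capped_slot(4000, 1000)   # 32 slots
```

The bound only takes effect when the largest reservation exceeds it; smaller reservations leave the slot size unchanged.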
2. Percentage of resources that can fail
Q. How does VMware HA perform admission control when a percentage of cluster resources is reserved for the admission control policy?
A. Calculate the total resource requirements for all powered-on VMs in the cluster → calculate the total host resources available for VMs → calculate the Current CPU Failover Capacity and Current Memory Failover Capacity for the cluster → determine whether either the Current CPU Failover Capacity or the Current Memory Failover Capacity is less than the Configured Failover Capacity (provided by the user); if so, admission control disallows the operation. Here again it uses reservations (defaults of 0 MB and 256 MHz if no user-specified values exist). This policy doesn’t address the problem of resource fragmentation. This policy tolerates up to 50% of resource failover. This policy is not affected by a heterogeneous cluster. When FT is used, the secondary VM’s resource usage is accounted for.
Q. How is Current Failover Capacity determined?
A. (1) Sum the CPU reservations of the powered-on VMs (default 256 MHz) and sum their memory reservations (default 0 MB) → (2) add up the hosts’ CPU and memory resources (root resource pool, not physical resources) → Current CPU Failover Capacity = [(2) – (1)] / (2), and the same for memory.
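The percentage calculation can be written out as a short Python sketch. The function names and the example figures are hypothetical; the formula is (total host resources minus total reservations) divided by total host resources, applied separately to CPU and memory.

```python
def failover_capacity_pct(total_host_resources, total_reserved):
    """Current failover capacity as a percentage of total host resources."""
    return (total_host_resources - total_reserved) / total_host_resources * 100


def operation_allowed(cpu_host, cpu_reserved, mem_host, mem_reserved, configured_pct):
    """Admission control disallows the operation if either capacity is too low."""
    cpu_pct = failover_capacity_pct(cpu_host, cpu_reserved)
    mem_pct = failover_capacity_pct(mem_host, mem_reserved)
    return cpu_pct >= configured_pct and mem_pct >= configured_pct
```

For example, with 24000 MHz of host CPU and 6000 MHz reserved, the Current CPU Failover Capacity is 75%. Whichever resource is tighter decides the outcome.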
3. Specify a failover host for admission control policy
Q. How does this work?
A. HA fails over the VMs to a specific host; if that host is not available or doesn’t have enough resources, HA restarts the VMs on other hosts in the cluster. Status → green – ready and no VMs on it; yellow – ready and VMs running on it; red – maintenance mode or HA errors. In this policy, resources are not fragmented because a single host is reserved for failover. This policy only allows a single failover host. This policy is not affected by a heterogeneous cluster.
Q. How to choose an admission control policy?
A. It depends on factors such as (1) avoiding resource fragmentation, (2) flexibility of failover resource reservation and (3) heterogeneity of the cluster → see the notes on each of these factors under the three admission control policies above.
Q. What are the requirements of a VMware HA cluster?
A. All hosts must be licensed → at least 2 hosts in a cluster → unique host names → static IP addresses, or reservations when using DHCP → all hosts access the same management networks (at least one management network in common; best practice is to have two in common) → ESX (Service Console) and ESXi (Management Network – VMKernel network checkbox) → all hosts should have access to the same VM networks and datastores → VMs should be on shared storage, not local → VMware Tools installed for VM Monitoring to work → all hosts configured with DNS, and if hosts are configured with IP addresses, enable reverse DNS lookup (the IP address should be resolvable to the short host name) → HA doesn’t support IPv6 → each host name should be 26 characters or less (including domain name and dots)
Q. Does VM Startup and Shutdown feature affect the HA or FT?
A. Yes. It is disabled by default, and it is recommended not to enable it manually, as this could interfere with the actions of cluster features such as HA and FT.
Virtual Machine Options
Q. What are the Virtual Machine Options?
A. (1) VM restart priority – (Disabled, Low, Medium (the default) and High) the relative order in which VMs are restarted after a host failure. VMs are restarted sequentially, high first, then medium and then low, until all VMs are restarted or no more cluster resources are available. Example: in a multi-tier application, set the database to high, the application to medium and the web server to low. Consider disabling restart priority for VMs that are redundant on other hosts (such as multiple Domain Controllers, DNS and so on).
(2) Host isolation response – determines what happens when a host in a VMware HA cluster loses its management network connections but continues to run. This can be customized for individual VMs.
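The restart priority ordering described in option (1) can be illustrated with a short Python sketch. The VM names and the restart_order helper are hypothetical; the point is only that disabled VMs are skipped and the rest are restarted high, then medium, then low.

```python
PRIORITY_RANK = {"high": 0, "medium": 1, "low": 2}


def restart_order(vms):
    """vms is a list of (name, priority) pairs; returns names in restart order."""
    eligible = [(name, prio) for name, prio in vms if prio != "disabled"]
    # sorted() is stable, so VMs with equal priority keep their listed order.
    ordered = sorted(eligible, key=lambda vm: PRIORITY_RANK[vm[1]])
    return [name for name, _ in ordered]
```

For the multi-tier example above, restart_order([("web", "low"), ("dc2", "disabled"), ("db", "high"), ("app", "medium")]) yields the database first, then the application, then the web server, and leaves the redundant domain controller alone.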
VM and Application Monitoring
Q. How is VM Monitoring performed?
A. If regular heartbeats from the VMware Tools process are not received within the failure interval, the VM Monitoring service checks the VM’s I/O stats (disk and network) over the previous 120 seconds to avoid unnecessary resets; if there was no I/O activity, the VM is reset. The default of 120 seconds can be changed with das.iostatsinterval.
Q. How is Application Monitoring performed?
A. Either use an application that supports VMware application monitoring or obtain the appropriate SDK and use it to setup customized heartbeats for the application you want to monitor. After which, if the heartbeats are not received from the Application, the VM will be restarted.
Q. What kinds of sensitivity are available?
A. High sensitivity – a more rapid conclusion that a failure has occurred (failure interval: 30 seconds, reset period: 1 hour) → Medium sensitivity (failure interval: 60 seconds, reset period: 24 hours) → Low sensitivity – a longer interruption in service between actual failures and VMs being reset (failure interval: 120 seconds, reset period: 7 days). Within the reset period, a VM is reset at most 3 times.
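The sensitivity presets and the three-resets limit can be summarised in a small Python sketch. The structure and names are hypothetical, times are in seconds, and the I/O stats check is left out for brevity.

```python
HOUR = 3600
SENSITIVITY = {
    "high":   {"failure_interval_s": 30,  "reset_period_s": 1 * HOUR},
    "medium": {"failure_interval_s": 60,  "reset_period_s": 24 * HOUR},
    "low":    {"failure_interval_s": 120, "reset_period_s": 7 * 24 * HOUR},
}
MAX_RESETS_PER_PERIOD = 3


def may_reset(level, seconds_since_heartbeat, past_reset_times, now):
    """True if the failure interval has elapsed and the reset budget is not used up."""
    preset = SENSITIVITY[level]
    if seconds_since_heartbeat < preset["failure_interval_s"]:
        return False
    # Count only resets that fall inside the current reset period.
    recent = [t for t in past_reset_times if now - t < preset["reset_period_s"]]
    return len(recent) < MAX_RESETS_PER_PERIOD
```

With high sensitivity, a VM that has already been reset three times within the last hour is not reset again until the oldest reset falls outside the reset period.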
Q. What are the various advanced attributes?
A. das.isolationaddressX = isolation addresses, where X = 1-10; typically one per management network is enough – HA should be re-enabled
das.usedefaultisolationaddress = specify whether to use default (mgmt network gateway) or not – HA should be re-enabled
das.failuredetectiontime = 15000 milliseconds (15 seconds) default – HA should be re-enabled
das.failuredetectioninterval = 1 second default – HA should be re-enabled
das.isolationshutdowntimeout = 300 seconds default and only applies for Shut down VM response – HA should be re-enabled.
das.slotmeminmb = max bound on the memory slot size.
das.slotcpuinmhz = max bound on the cpu slot size.
das.vmmemoryminmb = default memory resource value assigned to a VM if its memory reservation is not specified or zero (only for the Host Failures policy)
das.vmcpuminmhz = default CPU resource value assigned to a VM if its CPU reservation is not specified or zero; default is 256 MHz
das.iostatsinterval = changes the default I/O stats interval for VM Monitoring sensitivity – default is 120 seconds. A value of 0 disables the check; any value greater than 0 enables it with that interval.
HA Best Practices
Q. What are the best practices for HA performance?
A. Networking configuration and redundancy → set alarms to monitor cluster changes and notify administrators → monitor cluster validity – verify that the admission control policy has not been violated; the cluster becomes invalid (red) if the current failover capacity is smaller than the configured failover capacity (it can also become overcommitted (yellow)) – DRS behavior is not affected if a cluster is red because of a VMware HA issue → check the operational status of the cluster – verify the Summary tab / Cluster Operational Status screen.
Networking Best Practices for HA
Q. What are the best practices for HA during network configuration and maintenance?
A. When making changes to the network architecture, it is recommended to suspend the Host Monitoring feature to avoid any heartbeat interruption → when adding portgroups or removing vSwitches, it is recommended to suspend Host Monitoring and also place the host in maintenance mode.
Q. Which networks are used for HA communications?
A. On ESX, HA communications travel over all networks designated as service console networks; VMkernel networks are not used by these hosts for HA communications → On ESXi, by default HA communications travel over VMkernel networks, except for those marked for use with VMotion; on ESXi 4.0 and later, explicitly enable the Management network checkbox for VMware HA to use this network.
Q. How are the cluster-wide networks considered?
A. The first node added to the cluster dictates the networks that all subsequent hosts allowed into the cluster must also have. Adding a host with fewer or more networks to the cluster will fail.
Q. How is the network isolation address considered?
A. Even if you have several management networks, by default only one default gateway is specified, so use das.isolationaddressX to add isolation addresses for the additional networks. It is also recommended to change the das.failuredetectiontime value to 20000 milliseconds (20 seconds – but we are changing to 30 seconds as mentioned above). A node that is isolated from the network needs time to release its VMs’ VMFS locks if the host isolation response is to fail over the VMs (not to leave them powered on); this must happen before the other nodes declare the node as failed, so that they can power on the VMs without getting an error that the VMs are still locked by the isolated node.
Q. Any changes required on the physical switches?
A. Enable PortFast on the physical switches, as this setting prevents a host from incorrectly determining that a network is isolated during the execution of a lengthy spanning tree algorithm.
Q. What other networking considerations are required?
A. Incoming ports: TCP/UDP 8042-8045; outgoing ports: TCP/UDP 2050-2250 → portgroup names and network labels should be consistent → provide NIC redundancy through NIC teaming or multiple management networks, and make sure there aren’t multiple hops between servers in a cluster, to avoid network packet delays for heartbeats. Two NICs connected to two physical switches give two independent paths for sending and receiving heartbeats, which makes the cluster more resilient.
Decisions to make when configuring HA: HA Summary
Q. What to look for when designing HA in your Organization?
A. Verify pre-requisites → design proper network configuration → primary node placement → Host Monitoring Status → isolation response decision (Shut down, Power off and Leave powered on) → HA and DRS/DPM together → HA Admission Control (Host Failures Cluster Tolerance policy, Percentage of Cluster Resources policy and Specify a Failover Host policy) → Virtual Machine Options → VM Monitoring
Source : Thanks to Virtualyzation.