Up until a week ago, I hadn’t had the pleasure of working much with vSphere. Now, with a test environment at the customer site, I was able to play around with it freely. One of the features I was curious to try was Fault Tolerance (FT). I created a random VM and enabled FT on it, which all went fine.

After FT was enabled on the VM, we wanted to see how we could upgrade the ESX hosts in that cluster, as the documentation states that FT only works on hosts with the same build number.

In our test cluster we have 4 ESX hosts, of which we updated 2 with the latest patches. We assumed that manually migrating the FT-protected VM and its shadow (secondary) VM to the 2 updated hosts would be enough to clear the path for upgrading the other 2 ESX hosts. This, however, didn’t work. During one of the steps in the migration wizard you get a warning that the build numbers aren’t the same, and the “next” button is grayed out.
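The constraint that blocked us can be sketched in a few lines of Python. This is only an illustration of the build-number rule as described above, not how vCenter implements the check. The host names and build numbers are made up, and `Host` and `can_run_ft_pair` are hypothetical names of my own.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Host:
    name: str
    build: int  # ESX build number; the values used below are invented


def can_run_ft_pair(primary_host: Host, secondary_host: Host) -> bool:
    """True if the two hosts could carry the two halves of an FT pair.

    Assumed rule: two different hosts with the same build number.
    """
    return (
        primary_host.name != secondary_host.name
        and primary_host.build == secondary_host.build
    )


cluster = [
    Host("esx01", 1000),  # not yet patched, runs the primary VM
    Host("esx02", 1000),  # not yet patched, runs the secondary VM
    Host("esx03", 2000),  # patched
    Host("esx04", 2000),  # patched
]
secondary_host = cluster[1]

# Try to place the primary on each host while the secondary stays on esx02.
allowed = {h.name: can_run_ft_pair(h, secondary_host) for h in cluster}
print(allowed)
```

In this model, moving either half of the pair to a patched host on its own always leaves the pair on mismatched builds, which is consistent with the wizard refusing to continue.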

So if we could not migrate this way, would that mean we have to remove FT from a VM before we can migrate it? It looks like you have to; at least, I did not find another way. There are, however, 2 options you can use when removing FT.

You can use either “Turn off fault tolerance” or “Disable fault tolerance”. As I didn’t know the difference between the 2 options, I went looking for it on the internet and stumbled onto this article.

When you choose to turn off FT on a VM, the secondary VM is removed, and all historical data with it. The DRS setting of the VM is also reset to the cluster default. With the disable option, the secondary VM enters a powered-off state and is kept, so that FT can be enabled again later.
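To keep the two options apart, here is a toy model of what each one does to the VM, based purely on the description above. It does not talk to vCenter. `FtVm`, its fields, and the string values are hypothetical placeholders I picked for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FtVm:
    name: str
    # State of the secondary VM; None means it no longer exists.
    secondary: Optional[str] = "powered-on"
    # Stand-in for the historical FT data kept for the VM.
    history: list[str] = field(default_factory=list)
    # Placeholder values: "ft-specific" vs. "cluster-default".
    drs_setting: str = "ft-specific"


def turn_off_ft(vm: FtVm) -> None:
    """'Turn off': secondary and history are gone, DRS falls back to the cluster default."""
    vm.secondary = None
    vm.history.clear()
    vm.drs_setting = "cluster-default"


def disable_ft(vm: FtVm) -> None:
    """'Disable': the secondary is only powered off and kept for later."""
    vm.secondary = "powered-off"


def can_reenable_without_rebuild(vm: FtVm) -> bool:
    """FT can be switched back on directly only if the secondary still exists."""
    return vm.secondary is not None


turned_off = FtVm("vm-a", history=["failover test"])
disabled = FtVm("vm-b", history=["failover test"])
turn_off_ft(turned_off)
disable_ft(disabled)

print(turned_off)
print(disabled)
```

The model makes the practical difference visible: after a disable, everything needed to resume FT is still there, while after a turn-off you start from scratch.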

As mentioned in the article, disable is the option to choose when you plan to use FT again, which is the case here once the other hosts have been upgraded.

When you put FT in a disabled state, you get an alert warning you that the VM’s fault tolerance state has changed. You can safely ignore this warning. After the host upgrades were done, I enabled FT on the VM again. Once more there is an alert informing you that the FT state has changed, which should clear shortly after FT is fully enabled again.

Since this was all in a test cluster, we could freely play around with the VMs and the hosts. But as we are working on getting vSphere into the production environment, updating ESX hosts that run FT VMs (if we even decide to use FT) will need some extra attention in the overall design.