vSphere 6 Availability

Just when we thought that availability could not get any better, VMware has made a huge step with vSphere 6 Availability by improving Fault Tolerance, vSphere Replication, Data Protection and MSCS clustering support.

Fault Tolerance

For me Fault Tolerance was one of the best and most impressive improvements in prior vSphere versions. With Fault Tolerance you can very easily protect mission critical, high performance applications regardless of the operating system or running applications.

There was only one big downside: support was very limited. You could only run 1 vCPU virtual machines, only a limited set of disk types was supported, there was no hot-plug, etc.

With the release of vSphere 6, VMware has massively improved Fault Tolerance! You can now protect any workload that has up to 4 vCPUs and 64GB of memory and is not latency sensitive (e.g. VoIP, high-frequency trading). This greatly expands the use cases for FT to approximately 90% of workloads.

In previous versions VMware used a technique called Lockstepping. The new technology used by Fault Tolerance is called Fast Checkpointing. It is basically a heavily modified version of an xvMotion that never ends and executes many more checkpoints (multiple per second).

The traffic between the hosts where the primary and secondary virtual machines are running, called Fault Tolerance logging, is very bandwidth intensive and should get a dedicated 10Gbps NIC on each host. This isn’t required, but it is highly recommended, because even a single Fault Tolerance protected virtual machine generates a lot of logging traffic. If Fault Tolerance doesn’t get the bandwidth it needs, the protected virtual machine will run slower.

Limits: either 8 vCPUs or 4 FT protected VMs per host, whichever limit is reached first. For example, a host can run:

  • 2 VMs with 4 vCPUs each (total 8 vCPUs).
  • 4 VMs with 2 vCPUs each (total 8 vCPUs).
  • 4 VMs with 1 vCPU each (total 4 vCPUs).
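
To make the "whichever limit is reached first" rule concrete, here is a minimal Python sketch that checks whether a set of FT protected VMs fits on one host. The function and constant names are made up for this post and this is not a VMware API. It only encodes the limits mentioned above (8 FT vCPUs per host, 4 FT VMs per host, 4 vCPUs per FT VM).

```python
from typing import Sequence

# Limits as stated in this article (names are hypothetical, not VMware identifiers).
MAX_FT_VCPUS_PER_HOST = 8
MAX_FT_VMS_PER_HOST = 4
MAX_VCPUS_PER_FT_VM = 4


def fits_on_host(vcpus_per_vm: Sequence[int]) -> bool:
    """Return True if the given FT protected VMs stay within both per-host limits."""
    # Each FT protected VM may have between 1 and 4 vCPUs.
    if any(v < 1 or v > MAX_VCPUS_PER_FT_VM for v in vcpus_per_vm):
        return False
    # Both limits apply at the same time; the first one that is hit wins.
    within_vm_limit = len(vcpus_per_vm) <= MAX_FT_VMS_PER_HOST
    within_vcpu_limit = sum(vcpus_per_vm) <= MAX_FT_VCPUS_PER_HOST
    return within_vm_limit and within_vcpu_limit


if __name__ == "__main__":
    for layout in ([4, 4], [2, 2, 2, 2], [1, 1, 1, 1], [1, 1, 1, 1, 1], [4, 4, 1]):
        print(layout, fits_on_host(layout))
```

The first three layouts match the examples in the list. The last two fail: five 1 vCPU VMs break the VM count limit, and 4 + 4 + 1 vCPUs break the vCPU limit.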

Using FT adds VM/application overhead, and how much depends on a number of factors such as the application, the number of vCPUs, the number of FT protected virtual machines on a host, the host processor type, etc. VMware will release a performance paper around launch that goes into more specifics. For now, the recommendation to customers is to test FT and see if it works for their workloads and use cases.

High Availability (HA) enhancements

vSphere HA delivers the availability required by most applications running in virtual machines, independent of the OS and application running in it. It provides uniform, cost-effective failover protection against hardware and OS outages within a virtualized IT environment. It does this by monitoring vSphere hosts and virtual machines to detect hardware and guest OS failures. It restarts virtual machines on other vSphere hosts in the cluster without manual intervention when a server outage is detected, and it reduces application downtime by automatically restarting virtual machines upon detection of an OS failure.

With the growth in size and complexity of vSphere environments, the ability to prevent and recover from storage issues is more important than ever. vSphere HA now includes Virtual Machine Component Protection (VMCP), which provides enhanced protection from All Paths Down (APD) and Permanent Device Loss (PDL) conditions for block (FC, iSCSI, FCoE) and file storage (NFS).

Prior to vSphere 6.0, vSphere HA could not detect APD conditions and had limited ability to detect and remediate PDL conditions. When those conditions occurred, applications were impacted or unavailable longer and administrators had to help resolve the issue. vSphere VMCP detects APD and PDL conditions on connected storage, generates vCenter alarms, and automatically restarts impacted virtual machines on fully functional hosts. By doing this, it greatly improves the availability of virtual machines and applications without requiring more effort from administrators.

vSphere HA can now protect as many as 64 ESXi hosts and 6,000 virtual machines (up from 32 and 2,048), which greatly increases the scale of vSphere HA supported environments. It is also fully compatible with VMware Virtual Volumes, VMware vSphere Network I/O Control, IPv6, VMware NSX, and cross vCenter Server vSphere vMotion. vSphere HA can now be used in more and larger environments and with less concern for feature compatibility.

vSphere Replication (VR)

With vSphere 6 VMware also improved vSphere Replication.

Bandwidth reduction

First of all, the amount of bandwidth required has been reduced. This is done by using compression, which can now be enabled when configuring replication for a virtual machine (it is disabled by default). Changes are compressed at the source and stay compressed until they are written to storage.

[Figure: VR Bandwidth]

This does cost some CPU cycles on the source host (compress) and the target storage host (decompress), but by using FastLZ compression libraries there’s a good balance between performance, compression, and limited CPU overhead. The typical compression ratio is 1.7 to 1.
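
As a quick back-of-the-envelope illustration of what a 1.7 to 1 ratio means for replication traffic, here is a small Python sketch. The function names are my own, and the ratio is only the "typical" value quoted above, so real results will differ per workload.

```python
# Typical compression ratio quoted in this article (1.7 to 1).
TYPICAL_RATIO = 1.7


def compressed_size_gb(changed_gb: float, ratio: float = TYPICAL_RATIO) -> float:
    """Estimate how much data crosses the wire for a given amount of changed data."""
    return changed_gb / ratio


def bandwidth_saved_pct(ratio: float = TYPICAL_RATIO) -> float:
    """Estimate the percentage of replication traffic saved by compression."""
    return (1 - 1 / ratio) * 100


if __name__ == "__main__":
    print(f"17 GB of changes -> about {compressed_size_gb(17.0):.1f} GB on the wire")
    print(f"Traffic saved: about {bandwidth_saved_pct():.0f}%")
```

In other words, at the typical ratio roughly two fifths of the replication traffic disappears, at the price of the CPU cycles mentioned above.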

Improved security and performance

Second, VMware improved security and performance by isolating vSphere Replication traffic from other vSphere host traffic. At the source, a NIC can be specified for replication traffic, and Network I/O Control can be used to control replication bandwidth utilization. At the target, vSphere Replication Appliances can have multiple vmnics with separate IP addresses to separate incoming replication traffic, management traffic, and NFC traffic to the target host(s).

[Figure: VR Isolation]

At the target, a NIC can be specified for incoming NFC traffic that will be written to storage. The user must, of course, set up the appropriate network configuration (vSwitches, VLANs, etc.) to separate traffic into isolated, controllable flows.

vSphere Data Protection (VDP)

vSphere Data Protection Advanced functionality is now part of vSphere Data Protection (VDP). VDP Advanced edition is no longer available, so customers get all of the features and benefits with VDP, which is included with vSphere 6 Essentials Plus Kit and higher editions.

VDP enables both local data protection and offsite disaster recovery. Backups are performed locally and can then be replicated offsite for disaster recovery. The solution is well integrated into vSphere and vCenter Server and utilizes the vSphere APIs for Data Protection (VADP).

VADP includes changed block tracking (CBT). The first time a VM is backed up, it is a full (level 0) backup. Each subsequent (level 1) backup checks VADP for changed blocks and backs up only these changed blocks. Backing up only the changed blocks dramatically reduces backup times, which means shorter backup windows, and reduces resource utilization.

vSphere Data Protection (VDP) is designed to protect VMware virtual machine environments of up to approximately 800 virtual machines. Nearly all workloads running in a VM can be protected with VDP. There are a few exceptions, such as virtual machines with physical raw device mappings (RDM), VMs with very high levels of I/O and VMs that may require app-level quiescing not supported by VDP (example: Oracle database on Linux).
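
The idea behind changed block tracking is easy to show with a toy model. The Python sketch below is not the VADP API and does not talk to vSphere at all. `TrackedDisk` is a hypothetical class that only illustrates the principle: remember which blocks were written since the last backup and copy just those.

```python
class TrackedDisk:
    """Toy virtual disk that remembers which blocks changed since the last backup."""

    def __init__(self, num_blocks: int) -> None:
        self.blocks = [b""] * num_blocks
        # Before the first backup every block counts as changed (level 0 backup).
        self.changed = set(range(num_blocks))

    def write(self, index: int, data: bytes) -> None:
        """Write a block and mark it as changed."""
        self.blocks[index] = data
        self.changed.add(index)

    def backup(self) -> dict:
        """Return only the changed blocks, then reset the tracking."""
        delta = {i: self.blocks[i] for i in sorted(self.changed)}
        self.changed.clear()
        return delta


if __name__ == "__main__":
    disk = TrackedDisk(num_blocks=8)
    full = disk.backup()            # level 0: all 8 blocks
    disk.write(3, b"new data")
    disk.write(5, b"more data")
    incremental = disk.backup()     # level 1: only blocks 3 and 5
    print(len(full), sorted(incremental))
```

The first backup copies every block, the second one only the two blocks that were written in between, which is where the shorter backup window comes from.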

Scalability

VDP currently supports up to 20 VDP appliances per vCenter. VDP external proxies can also be deployed to accommodate varying backup solution topologies and business requirements. External proxies can reduce the amount of backup data sent across a network and enable support for up to 24 concurrent backups. A VDP appliance with no external proxies is limited to 8 concurrent backups.
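
To get a feeling for what the jump from 8 to 24 concurrent backups means, here is a rough Python sketch that counts how many "waves" of backups are needed for a given number of VMs. It assumes, unrealistically, that every backup takes about the same time, and the function name is my own.

```python
import math

# Concurrency limits as stated in this article.
CONCURRENT_WITHOUT_PROXIES = 8
CONCURRENT_WITH_PROXIES = 24


def backup_waves(vm_count: int, concurrent: int) -> int:
    """Number of sequential batches needed when only `concurrent` backups run at once."""
    return math.ceil(vm_count / concurrent)


if __name__ == "__main__":
    vms = 200  # hypothetical environment size
    print("Without external proxies:", backup_waves(vms, CONCURRENT_WITHOUT_PROXIES))
    print("With external proxies:   ", backup_waves(vms, CONCURRENT_WITH_PROXIES))
```

For 200 VMs this gives 25 waves without external proxies versus 9 with them, so the backup window shrinks considerably if the rest of the infrastructure can keep up.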

Capacity

vSphere 6 VDP supports up to 8TB of deduplicated backup data capacity per VDP appliance. Each 8TB appliance can protect approximately 150-200 average sized VMs (50-60GB of data each) with a 3% change rate and a retention policy of 30 days. Results will of course vary in every environment based on VM sizes, the types of data contained in the VMs, data change rates, and retention policies.
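
The sizing numbers above can be sanity checked with some simple arithmetic. The Python sketch below estimates the logical (pre-deduplication) amount of backup data. It assumes that the 3% change rate is a daily rate and that one backup is kept per day, which the sizing statement does not spell out, and it uses 1 TB = 1000 GB. The function name is hypothetical.

```python
def logical_backup_gb(vm_count: int, gb_per_vm: float,
                      daily_change_rate: float, retention_days: int) -> float:
    """Estimate logical backup data: one full backup plus daily changed data."""
    full = vm_count * gb_per_vm
    incrementals = full * daily_change_rate * retention_days
    return full + incrementals


if __name__ == "__main__":
    appliance_gb = 8 * 1000  # 8TB appliance, assuming 1 TB = 1000 GB
    for vms, size in ((150, 50), (200, 60)):
        logical = logical_backup_gb(vms, size, daily_change_rate=0.03, retention_days=30)
        print(f"{vms} VMs x {size} GB: ~{logical / 1000:.1f} TB logical, "
              f"~{logical / appliance_gb:.1f}x the appliance capacity")
```

Under these assumptions the logical data lands at roughly 14 to 23 TB, so the quoted VM counts only fit in 8TB because of deduplication.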

Integration

VDP agents for SQL Server, Exchange, and SharePoint enable individual, application-consistent database backup and recovery on virtual and physical machines. The agent for Exchange also provides granular level recovery (GLR), i.e. the restore of individual mailboxes. Using agents for these applications enables true app-consistent backup and recovery. For example, the SQL Server agent utilizes SQL Server’s virtual device interface (VDI). These agents also manage transaction logs (truncation, circular logs, etc.) and provide the option to enable multiple-stream backups. SQL Server cluster and Exchange Database Availability Group (DAG) configurations are supported.

MSCS Clustering

With vSphere 6 VMware further enhanced the availability features introduced with vSphere 5.5.

vSphere 6 now supports:

  • Windows Server 2012 R2 and SQL Server 2012, both in failover cluster mode and with AlwaysOn Availability Groups.
  • IPv6.
  • The faster PVSCSI adapter when using MSCS.
  • vMotion for MSCS virtual machines running Windows Server 2008 or newer that are clustered across physical hosts using physical RDMs.

This allows customers to run MSCS virtual machines in the vSphere cluster while vMotion and DRS place them on hosts depending on their needs.