VMware vCloud Director design guidelines
After VMware vSphere and View, VMware vCloud Director is the next big thing to set up, and customers are starting to ask for it. The problem is that knowledge and available resources are still limited, so for real-life implementations of vCloud Director we have to rely on VMware employees to show us the ropes.
First of all, what is VMware vCloud Director? In short, VMware vCloud Director gives enterprise organizations the ability to build secure private clouds as the base for an infrastructure-as-a-service (IaaS) solution. Coupled with VMware vSphere, vCloud Director delivers cloud computing for existing datacenters by pooling virtual infrastructure resources and delivering them to users as catalog-based services.
The vCloud Director architecture is shown below.
The solution is composed of the following components:
- vCloud Director server instances or cells;
- vCloud Director database;
- one or more VMware vSphere environments, composed of:
  - one or more VMware ESXi hosts;
  - one or more VMware vCenter servers;
  - a vCenter server database;
- client interfaces:
  - the vCloud Director REST API;
  - the vCloud Director Web Interface.
This vCloud Director configuration enables the creation of vCloud Virtual Datacenters (vDCs): collections of resources such as networks, storage, CPU, and memory.
There are two kinds of vDCs:
- Provider vDC
A provider virtual datacenter (vDC) combines the compute and memory resources of a single vCenter Server resource pool with the storage resources of one or more datastores available to that resource pool. Multiple provider vDCs can be created for users in different geographic locations or business units, or for users with different performance requirements.
- Organization vDC
An organization virtual datacenter (vDC) provides resources to an organization and is partitioned from a provider vDC. Organization vDCs provide an environment where virtual systems can be stored, deployed, and operated. They also provide storage for virtual media, such as floppy disks and CD-ROMs.
In short, a provider vDC combines all its available vSphere virtual resources, partitions them into smaller entities called organization vDCs, and publishes those to customers.
Now how do you design and size this? Here are some design guidelines.
First of all, follow the sizing guidelines for the underlying VMware vSphere platform, as documented in VMware vCenter 4.0 Configuration Limits, VMware vCenter 4.1 Configuration Limits, VMware vCenter 4.1 Performance and Best Practices, and Configuration Maximums for VMware vSphere 5.0.
Next up, the vCloud Director configuration.
vCloud Director scales out by simply adding more cells to the system. These instances, or cells, all connect to the same central database, which makes the vCloud Director database the most likely place for a performance bottleneck. The issue here is the number of database connections: by default, each cell is configured for 75 database connections, and this can become the bottleneck if there are not enough connections to serve the requests. When vCloud Director operations become slow, increasing the number of database connections per cell may improve performance.
So you should configure the database to accept a minimum of 75 connections per cell plus, when using Oracle, an additional 50 connections for internal use.
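As a back-of-the-envelope check, the connection requirement above can be sketched in a few lines of Python (the function name and parameters are my own, based purely on the numbers in this section):

```python
def min_db_connections(cells: int, per_cell: int = 75, oracle: bool = True) -> int:
    """Minimum database connections to configure: 75 per vCloud Director
    cell, plus 50 extra for internal use when the database is Oracle."""
    return cells * per_cell + (50 if oracle else 0)

# Three cells on Oracle need at least 3 * 75 + 50 = 275 connections.
print(min_db_connections(3))
```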
When using a Microsoft SQL Server you should also take some extra variables into account. vCloud Director uses the SQL Server tempdb file when storing large result sets, sorting data, and managing data that is being read and modified concurrently. This file can grow significantly when vCloud Director is under heavy concurrent load. It is a best practice to put the tempdb database on disks that differ from those used by user databases, on a dedicated volume with fast read and write performance.
Besides this, there are a few extra tempdb options to ensure good MS SQL performance:
- Set the file growth increment to a reasonable size, so the tempdb database files do not grow by too small a value. If the growth increment is too small compared to the amount of data being written to tempdb, tempdb may have to expand constantly, which affects performance. You may have to adjust this value based on the speed of the I/O subsystem on which the tempdb files are located. To avoid potential latch time-outs, limit a single autogrow operation to approximately two minutes. For example, if the I/O subsystem can initialize a file at 50 MB per second, the FILEGROWTH increment should be set to a maximum of 6 GB, regardless of the tempdb file size;
- Preallocate space for all tempdb files by setting the file size to a value large enough to accommodate the typical workload in the environment. This prevents tempdb from expanding too frequently, which can affect performance. The tempdb database should still be set to autogrow, but autogrow should only serve as a safety net for unplanned exceptions;
- Create as many files as needed to maximize disk bandwidth. Using multiple files reduces tempdb storage contention and yields significantly better scalability. However, do not create too many files because this can reduce performance and increase management overhead. As a general guideline, create one data file for each CPU on the server and then adjust the number of files up or down as necessary;
- Make each data file the same size; this allows for optimal proportional-fill performance.
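The two-minute autogrow guideline translates into a simple calculation. Assuming you have measured the file-initialization speed of the I/O subsystem, the maximum FILEGROWTH increment can be derived as follows (a sketch; the function name is my own):

```python
def max_filegrowth_mb(init_speed_mb_per_s: float, max_grow_seconds: int = 120) -> float:
    """Cap the tempdb FILEGROWTH increment so that a single autogrow
    operation completes within roughly two minutes, avoiding the latch
    time-outs mentioned above while the new space is initialized."""
    return init_speed_mb_per_s * max_grow_seconds

# At 50 MB/s this yields 6,000 MB (roughly 6 GB), matching the example above.
print(max_filegrowth_mb(50))
```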
VMware vCloud Director can be installed in a virtual machine and the number of cells/instances depends on the number of vCenter server instances. In general, VMware recommends the use of the following formula to determine the number of cell instances required:
number of cell instances = n + 1, where n is the number of vCenter server instances
This formula is based on the considerations for the VC Listener, cell failover, and cell maintenance. The "Configuration Limits" document recommends a one-to-one mapping between a VC Listener and a vCloud Director cell, which ensures that the resource consumption of the VC Listeners is load balanced across cells. VMware also recommends having a spare cell to allow for cell failover, which gives the cells a level of high availability in case of a failure or routine maintenance.
If the vCenters are lightly loaded (that is, they are managing less than 2,000 VMs), it is acceptable to have multiple vCenters managed by a single vCloud Director cell. In this case, the sizing formula can be converted to the following:
number of cell instances = n/3000 + 1, where n is the number of expected powered-on VMs
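Both sizing formulas can be captured in a small Python sketch (function names are my own; the formulas are VMware's, as quoted above):

```python
def cells_for_vcenters(vcenters: int) -> int:
    """n + 1: one cell per vCenter server (for its VC Listener), plus a
    spare cell for failover and routine maintenance."""
    return vcenters + 1

def cells_for_light_vcenters(powered_on_vms: int) -> int:
    """n/3000 + 1: alternative sizing when the vCenters are lightly
    loaded (fewer than 2,000 VMs each) and one cell can manage several."""
    return powered_on_vms // 3000 + 1

# Four vCenter servers -> 5 cells; 7,500 powered-on VMs -> 3 cells.
print(cells_for_vcenters(4), cells_for_light_vcenters(7500))
```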
Each vCloud Director cell must be provisioned with at least 1 vCPU and 1 GB of memory; 2 vCPUs and 2 GB are recommended, and best practice according to VMware SEs is 2 vCPUs and 4 GB of memory. Besides this, each vCloud Director server requires approximately 950 MB of free disk space for the installation and log files.
For provisioning of virtual machines, vCloud Director 1.5 uses linked clones as we know them from VMware View. This enables vCloud Director to provide fast VM provisioning within and across datastore and vCenter boundaries. Compared with full clones, linked clones improve agility in the cloud by reducing provisioning time, providing near-instant provisioning of virtual machines in a cloud environment.
This sounds great, but there are a few caveats. First of all, fast provisioning is only supported in vSphere 5.0 environments; mixed clusters of ESX/ESXi 4.x and ESXi 5.0 under vCenter 5.0 are not supported.
Second, linked clones are not the best solution for all workloads, and in some cases it's better to use classic full clones, even though a full clone increases disk utilization and is considerably slower to provision for 'default' virtual machines, as shown in the graph below.
In an attempt to save space, linked clones use virtual disks that are sparsely allocated, called delta disks or redo logs. After the cloned virtual machines are created, powered on, and running, the delta disk grows in size. Sparse/delta disks in vSphere are implemented using a 512 byte block size, and require additional metadata to maintain these blocks. The advantage of using a small block size is that it eliminates copy on write overheads and internal fragmentation. However, this design tends to add some overhead in processing I/O generated by the linked clone.
Performance tuning tips:
- Use linked clones for virtual machines which are not generating I/O-intensive workloads;
- Use full clones for virtual machines generating I/O-intensive workloads. Linked clones add additional I/O processing for their delta disks, which results in decreased throughput for workloads that exceed 1,500 IOPS;
- In order to mitigate this problem, it's recommended to shift the I/O load from the sparse disk to another thickly provisioned virtual disk within the virtual machine. This exploits instant provisioning for the disks that contain the operating system, while taking advantage of the improved performance of thickly allocated virtual disks for I/O-intensive applications.
vSphere 5.0 has a limitation on the VMFS file system: only eight hosts may have a disk open at one time. So the virtual infrastructure can contain any number of virtual machines sharing a common disk, but the cluster size cannot exceed eight hosts. At the ESXi level only powered-on virtual machines count, but vCenter Server enforces this limit for powered-off virtual machines as well.
Performance tuning tips:
- When using fast provisioning (linked clones) and a VMFS datastore, do not exceed eight hosts in a cluster;
- For clusters larger than eight hosts that require fast provisioning (linked clones), use NFS datastores.
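The two tips above boil down to a single decision rule, which can be sketched as follows (the function name is my own; the eight-host limit is the VMFS constraint described above):

```python
def datastore_for_fast_provisioning(cluster_hosts: int) -> str:
    """Pick a datastore type for fast provisioning (linked clones):
    VMFS is fine while at most eight hosts share the disk chain;
    larger clusters require NFS datastores instead."""
    return "VMFS" if cluster_hosts <= 8 else "NFS"

# An 8-host cluster can stay on VMFS; a 16-host cluster needs NFS.
print(datastore_for_fast_provisioning(8), datastore_for_fast_provisioning(16))
```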