Virtual Infrastructure best practices
[Updated: 8-11-2009 10:00]
Lately I keep receiving questions from colleagues regarding virtual infrastructure design using VMware products, so I decided to sum up the best practices I use when designing a new virtual infrastructure. Some of them are based on numbers and calculations, others are pretty obvious. Nevertheless, you would be surprised how many environments I’ve encountered where the most basic best practices have NOT been met.
So hereby my list of best practices on:
- Virtual machines.
If you have additions or new insights please reply.
Hardware Compatibility List (HCL)
First of all, check all hardware against the VMware HCL (http://www.vmware.com/go/hcl). This is a very basic step which should be included in every implementation, but it is omitted on numerous occasions. You would be very surprised how many customers I encounter where the purchased hardware is not listed on the HCL. The most painful case was at a customer where we lost the deal because our price was €15.000,- too high. It turned out that the competition had offered two HP servers directly connected to NetApp storage, but a direct connection was not supported according to the VMware HCL. The competition had to make amends and offer a complete fiber network, worth €30.000,-, for free.
Before you start the ESX installation, remember to switch on Intel VT and the XD bit in the BIOS. If you switch Intel VT on after the VMware installation, you cannot run 64 bit operating systems unless you reinstall vSphere.
Another thing to remember before you start the VMware ESX installation is to disconnect the fiber, iSCSI or NFS storage, to prevent the installation from reformatting the existing VMFS datastores and losing all virtual machines.
If you’re using ESX, ensure your disk partitioning is correct and ready for the future. We regularly get questions from colleagues on how to extend ESX partitions, like swap, because they hadn’t taken future needs into account. Therefore I recommend the following partitioning (this is updated for vSphere 4):
|Mount point||Type||Size||Description|
|/||ext3||5120MB||The root partition stores the ESX system. If this partition has no free space, the ESX host will most likely crash. It is very important to prevent this.|
|(none)||swap||1600MB||The swap partition is used to swap memory pages if there is no more physical memory for the service console. Keep in mind future needs like service console agents for back-up or monitoring.|
|/var||ext3||4096MB||The var partition stores most system logs. Creating a separate var partition provides dedicated log storage space (/var/log) while protecting the root partition from being filled by log files. By default the var partition is part of the root partition.|
|/home||ext3||2048MB||The home partition is created to prevent the root partition from filling up. By default the home partition is part of the root partition. Service console accounts (not vCenter) each get a separate home folder.|
|/opt||ext3||2048MB||The opt partition stores HA log files and is created to avoid filling the root partition. By default the opt partition is part of the root partition.|
|/tmp||ext3||2048MB||The tmp partition is also created to prevent the root partition from filling up. Tmp can be used to extract and stage patches. By default the tmp partition is part of the root partition.|
|/boot||ext3||1100MB||The boot partition stores the files necessary to boot the service console. (*)|
|(none)||vmkcore||150MB||The vmkcore partition temporarily stores log and error information in case of a VMkernel crash. (*)|
* Automatically created by the installer but not displayed.
It is also recommended to rename the local datastore/VMFS partition during installation. By default the name is ‘Storage1’; to prevent mix-ups, rename it to [- Local Storage] or [Local Storage@].
Root SSH access
This is a recurring item in many virtual infrastructure designs and a great way to p*ss off Anne Jan. If you enable SSH root access, every administrator who ever added a host to vCenter can access your ESX host using Putty or any other SSH utility and cause mayhem. Why would you do this and not give all of your employees a master key to all the doors in your building? Another downside to SSH root access is that you cannot differentiate between people in the log files, so you can’t find out who did what on your ESX host. In a Windows environment Enterprise or Domain Administrator roles are handled with care, so why shouldn’t you act in a similar fashion with VMware ESX as your infrastructure foundation?
Instead of enabling SSH root access, I would recommend creating a local ESX user for everyone who needs SSH access and having them switch user as soon as they’re logged on, or having them use sudo. To avoid having to remember multiple passwords, I strongly advise using Active Directory integration.
Service console installs
Most of the time the reason to use ESX instead of ESXi is its ability to install all kinds of agents in the Service Console. VMware strongly recommends not installing any agents in the Service Console and nowadays there are other ways to handle this even when using ESXi. So as a best practice recommendation: Do not install software in the Service Console unless it is absolutely necessary.
A real life case: During a virtual infrastructure implementation last year I hadn’t taken the Navisphere agents into account during the design stage. When system specialists started building the virtual infrastructure they ran into problems connecting the EMC storage to the ESX hosts. Customer storage administrators quickly suggested installing the Navisphere agents in the service console. Luckily the System Specialist reported this to the Project Leader, who blocked the install just in time. Because it was a deviation from the design, an impact analysis was performed. The analysis report came out with a negative advice because a search came up with tons of problems regarding Service Console agent installs, and Navisphere in particular. Given that, the negative advice from VMware and the fact that the agent was only needed to register the LUNs once, the Navisphere agent was not installed and the storage administrator had to register the LUNs manually.
After we delivered the virtual infrastructure, the customer installed the Navisphere agent anyway and that’s when the trouble started. Despite the claims from EMC that no problems had been reported to them and that the customer could install the agent without issues, the installation resulted in all kinds of issues.
Correct time and time synchronization are essential in a virtual infrastructure because virtual machines can be ‘paused’ when no CPU cycles are needed. So configure an NTP time source in your network and synchronize all ESX hosts with it. (Thanks Marius Redelinghuys for the input)
Because VMware licenses VMware ESX by socket, it is recommended to use CPUs with a high core density. This way you will get the best performance for the lowest price.
Regarding the different vSphere flavors, Standard, Advanced, Enterprise, Enterprise plus, pick the version that best suits your needs.
Physical or virtual
If your virtual infrastructure is well designed and fully redundant, there is no reason not to run your vCenter on a virtual server; the only limitation is the number of ESX hosts you have to manage. It is fully supported, and by running vCenter on a virtual machine you can profit from all the benefits a virtual infrastructure can deliver. In a large environment a physical vCenter server is recommended; high availability can then be achieved by using vCenter Server Heartbeat. Use the sizing below to decide whether to go virtual or physical.
- fewer than 10 ESX hosts:
  - virtual server;
  - 1 vCPU;
  - 3GB of memory;
  - Windows 32 or 64 bit operating system.
- between 10 and 50 ESX hosts:
  - virtual server;
  - 2 vCPUs;
  - 4GB of memory;
  - Windows 32 or 64 bit operating system (64 bit preferred).
- between 50 and 200 ESX hosts:
  - physical or virtual server (virtual preferred);
  - 4 vCPUs;
  - 4GB of memory;
  - Windows 32 or 64 bit operating system (64 bit preferred).
- more than 200 ESX hosts:
  - physical server;
  - 4 vCPUs;
  - 8GB of memory;
  - Windows 64 bit operating system.
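The sizing tiers above can be captured in a small lookup, which is handy when you script a design checklist. This is only a sketch: the function and field names are hypothetical, and because the list does not say which tier exactly 50 hosts falls into, the sketch assumes it belongs to the 10 to 50 tier.

```python
from typing import NamedTuple


class VCenterSizing(NamedTuple):
    platform: str
    cpus: int
    memory_gb: int
    operating_system: str


def vcenter_sizing(esx_hosts: int) -> VCenterSizing:
    """Return the vCenter sizing tier from the list above for a host count."""
    if esx_hosts < 1:
        raise ValueError("need at least one ESX host")
    if esx_hosts < 10:
        return VCenterSizing("virtual", 1, 3, "Windows 32 or 64 bit")
    if esx_hosts <= 50:  # assumption: exactly 50 hosts falls in this tier
        return VCenterSizing("virtual", 2, 4,
                             "Windows 32 or 64 bit (64 bit preferred)")
    if esx_hosts <= 200:
        return VCenterSizing("physical or virtual (virtual preferred)", 4, 4,
                             "Windows 32 or 64 bit (64 bit preferred)")
    return VCenterSizing("physical", 4, 8, "Windows 64 bit")


if __name__ == "__main__":
    for hosts in (4, 25, 120, 300):
        print(hosts, vcenter_sizing(hosts))
```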
DRS and HA
DRS and HA are two techniques which need to be addressed when running your vCenter Server on a virtual machine. First of all it is recommended to exclude the vCenter Server from DRS. It’s not that DRS is not supported or that it has negative impact on performance but this way you always know on which ESX hosts your vCenter Server is running.
HA is the technique which makes sure the vCenter Server virtual machine restarts in case of hardware failure. Because vCenter Server is the primary management interface for your virtual infrastructure it is important to get this up and running as soon as possible so configure your virtual vCenter Server with restart priority high. Because vCenter Server is dependent on several supporting services like Active Directory, DNS and SQL, make sure these services are online at the same time or before vCenter Server is.
As said, vCenter Server depends on several services like Active Directory, DNS and SQL. Either make sure these services are up and running together with vCenter Server, or minimize the dependencies.
How do you ensure that SQL is online before the vCenter Server service is started? If vCenter and SQL are running on the same (virtual) server, configure a dependency on your SQL service in the vCenter Server service.
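On Windows a service dependency like this can be set with the built-in `sc config <service> depend= <list>` command. The sketch below only builds that command line so you can review it before running it. The service names `vpxd` (vCenter Server) and `MSSQLSERVER` (default SQL instance) are assumptions; check the real names on your server with `sc query` first. Keep in mind that `depend=` replaces the whole dependency list, so any dependencies the service already has must be passed in again.

```python
from typing import List, Sequence


def build_dependency_command(service: str,
                             depends_on: Sequence[str]) -> List[str]:
    """Build the 'sc config' argument list that sets service dependencies.

    'depend=' overwrites the existing list, so depends_on must contain
    every dependency the service should have, old and new.
    """
    if not depends_on:
        raise ValueError("at least one dependency is required")
    # sc expects multiple dependencies separated by forward slashes.
    return ["sc", "config", service, "depend=", "/".join(depends_on)]


if __name__ == "__main__":
    # Service names are assumptions, not verified against a real install.
    command = build_dependency_command("vpxd", ["MSSQLSERVER"])
    print(" ".join(command))
```

Printing the command instead of executing it is deliberate: changing service dependencies on your primary management server is something you want to read twice before pressing enter.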
How to minimize dependencies? If you’re running a fully virtual environment with no supporting physical servers and you need to boot ESX before you can start a DNS server, you can minimize the DNS dependency by configuring a hosts file on every ESX host. Because this is a manual action which requires additional maintenance, my advice is to only implement this in smaller environments where 100% virtualization can be achieved.
For security reasons I prefer to install VMware Update Manager on a separate virtual machine. I simply do not like the primary management platform to have internet access.
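To keep the hosts files identical on every ESX host, it helps to generate the entries from one list instead of typing them per host. A minimal sketch, in which the host names, addresses and domain are made-up examples:

```python
import ipaddress
from typing import Dict, List


def hosts_file_lines(hosts: Dict[str, str], domain: str) -> List[str]:
    """Return hosts-file entries: address, fully qualified name, short name."""
    lines = []
    for short_name, address in sorted(hosts.items()):
        ipaddress.ip_address(address)  # raises ValueError on a malformed address
        lines.append(f"{address}\t{short_name}.{domain}\t{short_name}")
    return lines


if __name__ == "__main__":
    # Hypothetical inventory; replace with your own hosts.
    inventory = {
        "esx01": "192.168.10.11",
        "esx02": "192.168.10.12",
        "vcenter": "192.168.10.5",
    }
    print("\n".join(hosts_file_lines(inventory, "example.local")))
```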
As your cluster is the boundary for the DRS and HA configuration this is an important design decision. Why?
First of all, in vSphere 4 the size of a HA cluster is limited to a maximum of 32 hosts and 1280 virtual machines.
Second, depending on the number of hosts in a cluster, there is a maximum number of supported virtual machines per ESX host:
- 100 virtual machines/host if there are eight or fewer ESX hosts in a cluster;
- 40 virtual machines/host if there are more than eight ESX hosts in a cluster.
Third, a VMware HA Cluster consists of primary and secondary nodes. Primary nodes hold cluster settings and all ‘node states’ which are synchronized between primaries. The first five hosts that join the VMware HA cluster are automatically selected as primary nodes, all the others are automatically selected as secondary nodes. The primary nodes are responsible for the HA failover process. Duncan Epping wrote a great section on his blog: Yellow-Bricks.com.
All three combined result in the following best practice: Take your hardware into account when designing VMware clusters.
For instance, when using blades it is important not to place all primary hosts in the same blade enclosure, because if all primary hosts fail simultaneously no HA-initiated restart of the VMs will take place. HA needs at least one primary host to restart VMs. With five primary nodes per cluster, keeping no more than four hosts of a cluster in one enclosure guarantees that at least one primary survives the loss of an enclosure. This results in a cluster of at most eight ESX hosts divided between two blade enclosures, four ESX hosts in each enclosure. Such a cluster can hold 100 virtual machines per ESX host, which totals 700 virtual machines with a failover capacity of one host (seven hosts carrying load, one spare).
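The arithmetic behind these numbers is easy to script. The sketch below uses only the vSphere 4 limits quoted in this section (32 hosts and 1280 VMs per HA cluster, 100 or 40 VMs per host, five primary nodes); the function names are my own.

```python
from typing import List

HA_MAX_HOSTS = 32       # vSphere 4 limit quoted above
HA_MAX_VMS = 1280       # vSphere 4 limit quoted above
HA_PRIMARY_NODES = 5    # first five hosts to join become primaries


def max_vms_per_host(hosts_in_cluster: int) -> int:
    """Supported VMs per host: 100 up to eight hosts, 40 above that."""
    return 100 if hosts_in_cluster <= 8 else 40


def cluster_vm_capacity(hosts_in_cluster: int, failover_hosts: int = 1) -> int:
    """VMs the cluster can hold while keeping failover_hosts as spare capacity."""
    if not 1 <= hosts_in_cluster <= HA_MAX_HOSTS:
        raise ValueError("an HA cluster holds between 1 and 32 hosts")
    if not 0 <= failover_hosts < hosts_in_cluster:
        raise ValueError("failover hosts must be fewer than the hosts in the cluster")
    usable_hosts = hosts_in_cluster - failover_hosts
    return min(usable_hosts * max_vms_per_host(hosts_in_cluster), HA_MAX_VMS)


def primaries_survive_enclosure_loss(hosts_per_enclosure: List[int]) -> bool:
    """True if losing any single enclosure always leaves at least one primary.

    Worst case all five primaries sit in the largest enclosure, so every
    enclosure must hold fewer than five hosts of this cluster, and there
    must be a second enclosure left to restart the VMs on.
    """
    return (len(hosts_per_enclosure) >= 2
            and max(hosts_per_enclosure) < HA_PRIMARY_NODES)


if __name__ == "__main__":
    print(cluster_vm_capacity(8, failover_hosts=1))   # 700
    print(primaries_survive_enclosure_loss([4, 4]))   # True
```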
Spindles and RAID levels
With regard to storage, spindles are key: more spindles equals more performance. The second item dictating performance is the RAID level. RAID levels are very important when designing a virtual infrastructure. Configuring storage is a compromise between capacity, performance and availability, and these choices can make or break storage performance. Slower SATA disks in RAID10 can outperform faster SAS disks in RAID5. So the bottom line is: make sure your VMFS storage gets the best performance and all other storage gets the performance, availability and capacity it needs. So know your I/O characteristics.
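A common rule of thumb for comparing such options is the RAID write penalty: every front-end write costs 2 back-end I/Os on RAID10, 4 on RAID5 and 6 on RAID6. The sketch below applies that model. It is a rough estimate, not a benchmark, and the per-disk IOPS figures in the example are illustrative assumptions rather than measured values.

```python
# Back-end I/Os needed per front-end write (common rule-of-thumb values).
WRITE_PENALTY = {"RAID0": 1, "RAID10": 2, "RAID5": 4, "RAID6": 6}


def effective_iops(spindles: int, iops_per_spindle: float,
                   raid_level: str, write_fraction: float) -> float:
    """Estimate front-end IOPS for a disk group with a given write mix."""
    if not 0.0 <= write_fraction <= 1.0:
        raise ValueError("write_fraction must be between 0 and 1")
    raw_iops = spindles * iops_per_spindle
    penalty = WRITE_PENALTY[raid_level]
    return raw_iops / ((1.0 - write_fraction) + write_fraction * penalty)


if __name__ == "__main__":
    # Assumed per-disk figures: 80 IOPS for SATA, 180 IOPS for SAS.
    sata_raid10 = effective_iops(12, 80, "RAID10", write_fraction=0.5)
    sas_raid5 = effective_iops(8, 180, "RAID5", write_fraction=0.5)
    print(f"12 x SATA in RAID10: {sata_raid10:.0f} IOPS")
    print(f" 8 x SAS in RAID5:   {sas_raid5:.0f} IOPS")
```

With these assumed figures and a 50% write mix, the twelve SATA spindles in RAID10 come out ahead of the eight SAS spindles in RAID5, which is exactly why you need to know your own I/O characteristics before choosing.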
Number of VMs/LUN
You’ll be surprised how many virtual infrastructures I encounter with only one extremely BIG LUN which contains all virtual machines. Most of the time, with this config, the end user is not satisfied with the performance. When I talk to them and propose to chop up their big LUNs into several smaller ones to improve performance, most of the time the reaction is one of disbelief. When I give them one smaller LUN and let them put a poorly performing virtual machine on it, the discussion is over 9 out of 10 times.
This is the reason VMware best practices advise not to put more than 16 to 20 server VMs or 30 to 40 desktop VMs on a LUN. Personally I like to keep to the lower values, so a maximum of 16 server VMs per LUN.
When you limit your design to 16 server VMs per LUN and obey the other VMware best practices, like space for snapshots and clones and +/-20% free space, the recommended LUN size is between 400 and 600 GB.
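A quick way to sanity-check a LUN size against these rules is sketched below. The average disk footprint per VM and the 15% allowance for snapshots and clones are assumptions you should replace with your own Capacity Planner figures; only the 16 VMs per LUN and the 20% free space come from the text above.

```python
def recommended_lun_size_gb(vm_count: int, avg_vm_disk_gb: float,
                            snapshot_fraction: float = 0.15,
                            free_fraction: float = 0.20) -> float:
    """Estimate the LUN size needed for vm_count VMs.

    snapshot_fraction: extra space for snapshots and clones (assumed value).
    free_fraction: share of the LUN that must stay free.
    """
    if vm_count < 1 or avg_vm_disk_gb <= 0:
        raise ValueError("need at least one VM with a positive disk size")
    if not 0.0 <= free_fraction < 1.0:
        raise ValueError("free_fraction must be between 0 and 1")
    used_gb = vm_count * avg_vm_disk_gb * (1.0 + snapshot_fraction)
    return used_gb / (1.0 - free_fraction)


if __name__ == "__main__":
    # 16 server VMs with an assumed average footprint of 20 GB each.
    print(f"{recommended_lun_size_gb(16, 20):.0f} GB")
```

With 16 VMs of 20 GB each this lands at roughly 460 GB, inside the 400 to 600 GB range mentioned above.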
VMDK or RDM
When designing a virtual infrastructure and determining LUN size it’s a waste to fill a datastore with one virtual machine. In almost every design I keep to the following personal best practice: For every virtual machine disk larger than 20 to 50GB use a Raw Device Mapping (RDM).
In the past there have been discussions stating that RDMs have better performance, but tests from VMware show that the performance difference is negligible.
Another reason to use RDMs over VMDK disks is the need for low level disk access/control and for SAN based features like snapshots, deduplication, etc. There are two compatibility modes, physical and virtual. The level of virtualization an application allows and the functional needs determine the compatibility mode. For instance, in physical compatibility mode it’s not possible to use VMware snapshotting.
The VMFS block size can be set when formatting the LUN. The block size determines the maximum size of files which can be created on the VMFS storage.
Below the block sizes and the related maximum file sizes:
|Block size||Maximum file size|
|1 MB||256 GB|
|2 MB||512 GB|
|4 MB||1024 GB|
|8 MB||2048 GB|
When using 400 to 600 GB LUNs and assigning RDMs for virtual disks over 20 to 50 GB, a 1 MB block size suffices because a disk file will never exceed 256 GB.
Smaller block sizes also complement thin provisioned disks, because thin provisioned disks grow in block size increments. In contrast, a larger block size results in fewer SCSI locks. So set the block size based on the desired performance, maximum file size and disk strategy. I usually go with a 1 MB block size and have never experienced a negative performance impact due to excessive SCSI locking.
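Picking the smallest block size that still fits your largest disk file can be expressed as a lookup on the table above. A minimal sketch with a hypothetical function name:

```python
# VMFS block size (MB) -> maximum file size (GB), from the table above.
MAX_FILE_SIZE_GB = {1: 256, 2: 512, 4: 1024, 8: 2048}


def smallest_block_size_mb(largest_file_gb: float) -> int:
    """Return the smallest VMFS block size that can hold the largest file."""
    for block_size_mb in sorted(MAX_FILE_SIZE_GB):
        if largest_file_gb <= MAX_FILE_SIZE_GB[block_size_mb]:
            return block_size_mb
    raise ValueError("file is larger than the 2048 GB VMFS maximum")


if __name__ == "__main__":
    print(smallest_block_size_mb(50))    # 1
    print(smallest_block_size_mb(300))   # 2
```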
A situation where I did experience a negative performance impact is when using thin-on-thin. So do not use thin provisioned virtual disks on thin provisioned LUNs.
Create an ISO store
For daily maintenance and use of a virtual infrastructure it is very convenient to create a central ISO store where you store images of all CDs and DVDs used. This way it’s very easy to mount an image to use in your virtual machine and it reduces the risk of version sprawl in your virtual infrastructure.
Disk alignment is something which can have a substantial negative impact on performance. This goes for both VMFS partitions and vmdk disk files.
When creating a VMFS partition using the Virtual Infrastructure client the alignment is automatically set correctly.
Disk alignment in vmdk disk files is a bit more complex. If you want to perform a manual alignment of the file system in the vmdk disk file, check out this VMware document, but I warn you it’s a very lengthy process.
It is much easier to use a tool which does all this work for you. Vizioncore has a great freeware tool called vOptimizer WasteFinder which scans through VMware vCenter Servers to locate over allocated virtual storage and misaligned virtual machines. Improperly aligned VMs experience decreased I/O throughput and higher latency. Optionally, vOptimizer Pro from Vizioncore can be purchased to quickly and easily reclaim wasted virtual storage and to align VMs to proper 64K partition boundaries. The freeware version includes two free alignment tasks.
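If you just want to know whether a partition is aligned, the check itself is simple: the partition’s starting offset in bytes must be a multiple of the boundary. The sketch below assumes 512-byte sectors and the 64K boundary mentioned above. The classic misaligned case is a partition starting at sector 63, the default on older Windows versions.

```python
SECTOR_BYTES = 512  # assumption: traditional 512-byte sectors


def is_aligned(start_sector: int, boundary_bytes: int = 64 * 1024) -> bool:
    """True if the partition start falls exactly on the boundary."""
    return (start_sector * SECTOR_BYTES) % boundary_bytes == 0


def next_aligned_sector(start_sector: int,
                        boundary_bytes: int = 64 * 1024) -> int:
    """Return the first sector at or after start_sector that is aligned."""
    sectors_per_boundary = boundary_bytes // SECTOR_BYTES
    # Ceiling division, then back to a sector number.
    return -(-start_sector // sectors_per_boundary) * sectors_per_boundary


if __name__ == "__main__":
    print(is_aligned(63))            # False
    print(is_aligned(128))           # True
    print(next_aligned_sector(63))   # 128
```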
Separate management, storage and VM traffic
To secure your virtual environment it’s important to separate your virtual machine traffic from your management traffic. Besides that, you need to ensure the desired 1 Gb bandwidth for your VMotion traffic.
Regarding IP storage it is important to create a separate network for IP storage traffic because this is a whole different data characteristic and IP storage network components require high performance network hardware. IP storage switches require very few ports per processor, preferably one on one, and require a fast backplane.
How to realize this? Best practice is to create separate vSwitches for virtual machine, storage and management traffic. Typically I use vSwitch0 for Management and VMotion, vSwitch1 for IP storage and vSwitch2 for virtual machine networks. When using VMware FT a fourth vSwitch is required to create a dedicated FT network.
I know there are people out there (and I even have colleagues who preach this) who put all traffic/portgroups on one vSwitch combining management, storage, VMotion and virtual machine traffic and claim they can present dedicated bandwidth and a secure connection to VMotion, FT, management and IP storage using QoS and VLAN tagging on the network layer. In my opinion this creates a chaotic situation where I do not have control over my network links and it’s not clear what they are used for. And if it’s not clear to me, how should a virtual infrastructure administrator be able to understand this? VLAN tagging on different portgroups, OK, but one vSwitch with 5-10 physical uplinks? Good luck troubleshooting this. Network designs like the one on the right are complex enough as it is.
My best practice: Keep it simple and clear! Use separate vSwitches for different ‘roles’ and separate Management from IP storage and virtual machine traffic.
Fully redundant networking
To ensure a highly available infrastructure it is very important to design a rock solid network infrastructure. How? Give every vSwitch a minimum of two physical network interfaces which are connected to separate switches. In case a network adapter or a switch fails, the virtual infrastructure keeps running, albeit with less capacity. But it’s always better to have a somewhat slower infrastructure than no infrastructure at all.
Keep in mind that multi port network interfaces present themselves as separate network interfaces in VMware ESX. So a 4x 1Gb NIC shows up in VMware ESX as 4 separate network adapters. Divide your physical uplinks over separate multi port network interfaces to ensure that not all connections are lost when a complete multi port network adapter fails.
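These redundancy rules are easy to verify on paper before you start cabling. The sketch below models a vSwitch layout as plain data and reports every vSwitch that breaks one of the rules: at least two uplinks, on separate physical switches, on separate multi port adapters. All names (vmnic numbers, card and switch labels) are hypothetical.

```python
from typing import Dict, List, NamedTuple


class Uplink(NamedTuple):
    vmnic: str          # adapter name as shown in ESX
    physical_card: str  # the multi port adapter this port belongs to
    switch: str         # the physical switch it is patched to


def redundancy_problems(vswitches: Dict[str, List[Uplink]]) -> List[str]:
    """Return a description of every redundancy rule a vSwitch violates."""
    problems = []
    for name, uplinks in vswitches.items():
        if len(uplinks) < 2:
            problems.append(f"{name}: fewer than two uplinks")
            continue
        if len({uplink.switch for uplink in uplinks}) < 2:
            problems.append(f"{name}: all uplinks on one physical switch")
        if len({uplink.physical_card for uplink in uplinks}) < 2:
            problems.append(f"{name}: all uplinks on one multi port adapter")
    return problems


if __name__ == "__main__":
    layout = {
        "vSwitch0": [Uplink("vmnic0", "card-a", "switch-1"),
                     Uplink("vmnic4", "card-b", "switch-2")],
        "vSwitch1": [Uplink("vmnic1", "card-a", "switch-1"),
                     Uplink("vmnic2", "card-a", "switch-2")],
    }
    for problem in redundancy_problems(layout):
        print(problem)
```

In this example vSwitch1 is flagged because both of its uplinks sit on the same multi port adapter, even though they go to different switches.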
Avoid Ether Channels/Link Aggregation
The use of ether channels/link aggregation only makes sense when there are virtual machines which require more than 1 Gb bandwidth, and this is rarely the case. Besides that, it is really difficult to configure and it has many dependencies. Most of the time network links are divided between switches for redundancy, and it’s impossible or very difficult to configure link aggregation across multiple switches. So hands off ether channels/link aggregation, and manage network load balancing using the standard VMware ESX policies (based on originating port ID, MAC address hash, etc).
Link speed and duplex setting
Many inexplicable network problems are caused by wrong network settings. Collisions, retransmits, slow links, these can all be caused by mismatched settings between network adapter and network switch. This usually happens when using different brand network adapters and switches. I’ve even come across slow performing network links which were caused by auto/auto settings on the network interface and switch.
There is however a downside to changing this setting and it is related to Distributed Power Management (DPM). DPM depends on Wake on LAN (WoL) and some network adapters support wake-on-LAN only at 10 or 100 Mb, not at 1 Gb. If such a network adapter is connected to a switch that supports 1 Gb (or higher), it will attempt to negotiate down to 100 Mb when the machine powers off. If the switch and network adapter are manually set to 1000 Mb/Full duplex the network adapter loses its connection to the switch when the machine powers off and wake-on-LAN fails to bring the ESX host back online when needed. Nowadays vSphere supports more techniques to wake up the host, besides WoL, so this shouldn’t be a huge issue. (Thanks Marius Redelinghuys for the input)
Best practice is to always set the speed and duplex settings manually to 1000 Mb/Full duplex on both ends, switch and network adapter, if the network adapter allows this, and then test whether DPM functions correctly. If not, try another technique to get the ESX host out of standby mode. If that doesn’t work, you will have to make do with the auto/auto setting on just a few of the network adapters.
When combining LAN and DMZ on ESX hosts you’re heading for a conflict with the network administrators. I found it very difficult to convince them that VMware ESX is very secure and it’s no problem to combine LAN and DMZ.
A combination of VMware security whitepapers including the famous NSA report and the promise of a physically separated network does the trick 8 out of 10 times.
So, create a separate vSwitch for the DMZ network with a minimum of two physical network adapters, preferably connected to different switches.
The risk involved in combining LAN and DMZ is human error. You need to inform the virtual infrastructure admins of the risks and check DMZ connections frequently.
Remove unused hardware
When creating a virtual machine or template, make sure the virtual machine has no unused, obsolete hardware. So remove floppy drives, serial and parallel ports when you don’t need them. It’s just like with physical hardware: devices use IRQs and need to be polled, using resources in the process and slowing down the system. With virtual machines the principle is the same, but now there are a lot of virtual machines consolidated on the same hardware. Imagine the extra unnecessary load on the ESX host, capacity which could be used for more important processes. The performance gain won’t be huge, but every little bit helps and maybe you can squeeze in an extra virtual machine when tuning your virtual infrastructure to the max.
Here again the principle is the same as with physical servers: disable unused services to achieve optimal system performance. In a virtual environment this is even more important because you can save resources on all virtual machines running on an ESX host.
The best way to achieve this in a Windows environment is to create an OU structure in Active Directory based on server roles and create policies which disable unused services. This way you do not have to configure every server separately and the disabled services can easily be managed centrally. In my 12-year IT career I’ve come across one customer who had this in place and running perfectly.
Start with minimum resources
In a physical environment we are used to sizing servers based on peak usage. With virtual machines it is recommended to start with a minimum amount of resources. Why?
First of all it is very easy to add resources at a later time if this turns out to be needed.
Second, assigned resources come with reservations, and those reservations must be available for the virtual machine to start. So assigning too many resources results in higher reservations, which in turn results in fewer virtual machines per ESX host.
Third, assigning more resources will not always mean that the virtual machine will perform faster. A real life scenario: at a customer site a colleague had installed an Exchange 2007 server on ESX 3.5 and assigned 4 vCPUs and a lot of memory, and despite the huge amount of resources the server didn’t perform well (understatement of the century). After removing two vCPUs the machine came to life and after removing another vCPU the server was racing. So the Exchange server performed much, much better with fewer vCPUs. The VMware ESX CPU scheduling was holding the virtual machine down: most likely, a virtual machine with several vCPUs has to wait until enough physical cores are free at roughly the same time, so on a busy host extra vCPUs can slow a virtual machine down instead of speeding it up.
It’s difficult to determine what is just enough and what is way too much. Most of the time I base this decision on the Capacity Planner information I gather at the start of the project. (Thanks Marius Redelinghuys for the input)