Virtualization is hot! A lot of companies are already virtualized or in the middle of a migration. The physical platform gets replaced by a virtual one. This directly introduces a new whipping boy.

In the past, users and administrators looked at all the things that could cause performance problems. These days it seems easier to just put the blame on the virtualization layer.

After a migration, a customer complained that the applications on the servers they had virtualized were running slower than they had on physical hardware. One of the first things they said was: “The virtual environment is slow. We want to put the system back on physical hardware.”

Since I’m one of the folks who put that infrastructure there, one of my first answers was: Impossible! (You know, I have a lot of confidence in my work and myself in these things :-) )

Of course performance can degrade when a server gets virtualized, but that shouldn’t be the first thing you think about. Just like with all problems, you should troubleshoot first and conclude later. In this case the order was a bit off.

We started to troubleshoot the problem. It quickly became apparent that there were a couple of things that didn’t add up.

  • The version of the software on the old system was different from the one on the new system
  • One of the servers in the chain was in another geographical location
  • X-Windows was still running on one of the guests
  • The JVMs (Java Virtual Machines) weren’t optimized
  • The tests performed on the old hardware didn’t give the same results from the database as the tests on the new one
  • Logging on took forever

For each of the issues above, changes have been proposed to solve the problem. We’re also working on a testing plan so that the performance tests on the virtual infrastructure can be repeated time after time after time. This includes all components in the chain, from virtual server to host to network and SAN switches to storage. The results will also be compared with those from the physical hardware. There won’t be any room for debate after I’m done... muahaha.

Of course the virtual infrastructure has been checked as well, but nothing out of the ordinary was found, only some missing patches.
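
To give an idea of what “repeatable” means for such a testing plan, here is a minimal sketch in Python. It is not the plan we actually used; the function names (measure, summarize, slowdown) and the placeholder workload are made up for illustration. The idea is simply to run the exact same workload several times on each platform and to compare medians instead of single runs.

```python
import statistics
import time
from typing import Callable, Dict, List


def measure(workload: Callable[[], None], runs: int = 10) -> List[float]:
    """Run the same workload several times; return wall-clock seconds per run."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        workload()
        samples.append(time.perf_counter() - start)
    return samples


def summarize(samples: List[float]) -> Dict[str, float]:
    """Median and spread of a series of timings."""
    return {
        "median": statistics.median(samples),
        "stdev": statistics.stdev(samples) if len(samples) > 1 else 0.0,
    }


def slowdown(physical: List[float], virtual: List[float]) -> float:
    """Ratio of median timings; a value above 1.0 means the virtual runs were slower."""
    return statistics.median(virtual) / statistics.median(physical)


if __name__ == "__main__":
    # Hypothetical placeholder workload; replace it with the real test,
    # for example a fixed query or an application login.
    def placeholder_workload() -> None:
        sum(range(100_000))

    timings = measure(placeholder_workload, runs=5)
    print(summarize(timings))
```

Run the same script on the physical and on the virtual system, keep the raw timings, and feed both lists to slowdown(). A large standard deviation is a hint that something else in the chain (network, storage, another guest) is interfering with the test.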

Another customer also complained about performance issues, and we went through the same cycle (it’s the virtual environment, it’s your unattended install of Windows, Service Pack 2 is the issue, etc.).

After a couple (!) of months the database administrator found out during a SQL debugging session that a field was missing in the database. Oracle creates its own ‘view’ when a field is missing. This caused the query to take minutes instead of seconds.

Moral of this story: Don’t blame the virtual infrastructure in advance, but troubleshoot the problem instead. And if it is the virtual infrastructure... admit it :-)