In this blog post, VDI Done Right! with NexentaConnect, I will dive deeper into VDI as a solution and how you can use NexentaConnect with local storage to achieve a great user experience. The most important part of a VDI project, in my opinion, is the user! I often see the most impressive technical environments being built without the user as the focal point. If you want your VDI project to succeed, think about the users and the experience they will have. Changing things in a user's surroundings makes them uncomfortable at first, so make sure their first impression of, and experience with, the new digital workspace is a good one. Swapping the screen on a desk for a new (bigger) one is highly visible to a user: they experience it as getting a whole new workplace.

So how are you going to accomplish that with your VDI implementation?

Centralizing a PC as a vDesktop, on the other hand, is mostly invisible to the user. User experience is subjective, context-dependent and dynamic over time. Many factors can influence a user's experience of and judgment about VDI, such as the user's current state (emotion), previous experience, system properties, and the usage situation.

Playing the VDI game

When playing a game you expect a uniform experience, with consistent speed every time you play. Your movements must be fluid and free of glitches. Reducing load times between levels improves the experience, and maybe you even get a turbo-boost button. VDI is not so different: users expect the same speed they had before, or better. A uniform response time for the desktop and the applications on it is also highly valued; if an application varies in response time, it impacts the user experience instantly. The usage situation can add to the experience as well, by following the user around and enabling any device, any place, any time. Last but not least, the smoothness of the desktop and how it reacts to mouse movement, text appearing in a document, or video playback is extremely important for a great user experience and therefore contributes heavily to acceptance of the VDI solution.

User Experience

I have been involved in several VDI implementations over the last 5+ years, and the key to a successful VDI rollout is that user experience and expectations are exceeded. High latency over extended periods of time kills the user experience and therefore the acceptance of the overall solution. Users just want their applications, data and communication channels; the sophisticated, high-end technology behind delivering applications, data and communication to any device 24/7 is of no interest to them. Users typically expect freedom of choice with regard to location, device and time. When it comes to measuring VDI user experience, the most important metric is clearly performance, which is tightly coupled with smoothness. Remember that turbo-boost button?

So if we, as IT, can deliver a great user experience that is fast and simple, you will be the champion! A great user experience, in my opinion, consists of:

  • Fast and reliable platform;
  • Snappy/fluent response;
  • An easy to use platform;
  • Uniform presentation;
  • Consistency of response.

Why is storage so important to the VDI user experience?

The operations a user experiences, like opening a local file, starting an application, or powering up the desktop at boot time, are all linked to storage. So how fast is the storage assigned to the vDesktop, measured in IOPS, throughput and latency? VDI deployments are sized by the number of IOPS required across all desktops in use rather than by capacity, and this workload is typically random I/O, which is harder to deliver with consistent performance. Capacity is less of an issue. VDI basically has the worst data pattern imaginable: roughly 80% writes, 100% random, at 4k block size, sustained. To speed up the vDesktops it is good practice to split the two data types (a rough sizing sketch follows the list below):

  1. User data – capacity matters
  2. Virtual Desktop Operating System – performance matters
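
A quick, hypothetical sizing sketch for the workload pattern described above; the per-desktop IOPS figure is an assumption for illustration, not a measurement:

    # Back-of-the-envelope IOPS demand for a pool of vDesktops.
    desktops         = 350    # concurrent vDesktops
    iops_per_desktop = 15     # assumed steady-state IOPS per desktop
    write_fraction   = 0.80   # VDI is write-heavy, mostly random 4k

    total_iops = desktops * iops_per_desktop
    write_iops = total_iops * write_fraction
    read_iops  = total_iops - write_iops
    print(f"total: {total_iops:,.0f} IOPS "
          f"(writes: {write_iops:,.0f}, reads: {read_iops:,.0f})")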

Latency

Latency is the time delay between cause and effect: how fast the required data is located and delivered to the vDesktop. Virtual desktop operating systems demand high I/O along with fast read and write bandwidth from storage. If we place the components on a time scale of how long it takes to access data, in latency terms:

  1. In memory: DDR3 RAM, about 9 ns;
  2. On SSD: an average SSD sits around 0.03 ms;
  3. On spinning disk (HDD): on average between 2-15 ms.

If you look at these latency figures, you will see that memory can deliver a tremendous number of IOPS (input/output operations per second) because of its very low latency.
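
As a rough illustration of why low latency translates directly into IOPS, here is a minimal sketch that treats a single outstanding request at queue depth 1, so IOPS is simply the inverse of the access latency (real devices overlap many requests, so this understates what parallel I/O can achieve):

    # IOPS upper bound for a single request stream: one operation per latency period.
    latencies_s = {
        "DDR3 RAM": 9e-9,      # ~9 ns
        "SSD":      0.03e-3,   # ~0.03 ms
        "HDD":      5e-3,      # 2-15 ms range; 5 ms as a midpoint
    }
    for device, seconds in latencies_s.items():
        print(f"{device:>8}: ~{1 / seconds:,.0f} IOPS at queue depth 1")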

Persistent vs Non-persistent

In non-persistent, or stateless, VDI, all user configuration and user data is stored on separate hardware that is accessed remotely, for instance as a network share via CIFS/SMB. The stateless approach has a number of advantages. By separating the OS from user data, each can be managed separately for capacity and speed, reducing the risk of user data growth affecting the whole VDI deployment. It also enables both data types to be treated independently from a performance perspective, which provides the opportunity to place user data on lower-cost storage hardware. Finally, the stateless option results in fewer backup requirements, as only the virtual machine golden image(s) and the user data need to be backed up. The recovery of stateless environments is therefore much quicker.

The Comfort Zone

Traditional VMware Horizon View deployments are normally very basic: some View Connection Servers connected to your Active Directory, a Security Server for remote access, and a couple of ESXi servers combined into a VMware cluster connected to a traditional SAN or NAS, with everything sitting on shared storage.

It is a very traditional model, but problems arise with it. With VDI you will see the servers run into CPU limitations that result in slow desktop performance, and the SAN or NAS gets overloaded, which increases latency until the user experience goes down the drain fast. Fixing that on traditional shared storage makes the storage cost skyrocket.

How does Nexenta fix this?

It really is pretty simple. As we have seen, RAM has very low latency and can deliver a tremendous amount of IOPS, over 1.000.000, with more than 12800 MB/s of throughput. Nexenta places an I/O engine in the form of a Virtual Storage Appliance (VSA) on each ESXi server. The VSA uses the RAM in the server to create an Adaptive Replacement Cache (ARC) and can combine this with local storage and/or shared storage if needed. The ARC dynamically balances between the most recently used (MRU) and most frequently used (MFU) data. This enables the use of high-capacity NL-SAS local disks for data without a performance impact, so you can get rid of shared storage for the vDesktops. If the ARC is nearly as big as your master image, you will get a very high cache hit ratio, from 95% up to even 99%+. A toy illustration of the MRU/MFU idea follows below.
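
To make the MRU/MFU balancing a bit more concrete, here is a toy Python cache, loosely inspired by ARC but deliberately simplified and not the real ZFS ARC algorithm; the block counts and access pattern are made up for illustration:

    from collections import OrderedDict
    import random

    class ArcLikeCache:
        """Toy read cache with a recently-used (MRU) and a frequently-used (MFU)
        list; loosely ARC-inspired, not the real ZFS implementation."""

        def __init__(self, capacity):
            self.capacity = capacity
            self.mru = OrderedDict()   # blocks seen once recently
            self.mfu = OrderedDict()   # blocks seen more than once
            self.hits = 0
            self.misses = 0

        def access(self, block):
            if block in self.mfu:
                self.mfu.move_to_end(block)      # keep hot blocks fresh
                self.hits += 1
            elif block in self.mru:
                del self.mru[block]              # second touch: promote to MFU
                self.mfu[block] = True
                self.hits += 1
            else:
                self.misses += 1                 # cold read, served from disk
                self.mru[block] = True
            while len(self.mru) + len(self.mfu) > self.capacity:
                victim = self.mru if len(self.mru) >= len(self.mfu) else self.mfu
                victim.popitem(last=False)       # evict the oldest entry

        def hit_ratio(self):
            total = self.hits + self.misses
            return self.hits / total if total else 0.0

    # 50 linked clones mostly re-reading the same golden-image blocks (assumed pattern).
    cache = ArcLikeCache(capacity=8000)          # cache sized close to the golden image
    golden_image = range(8000)
    for _ in range(200_000):
        if random.random() < 0.9:
            cache.access(random.choice(golden_image))     # shared OS blocks
        else:
            cache.access(random.randrange(1_000_000))     # per-desktop unique data
    print(f"simulated cache hit ratio: {cache.hit_ratio():.1%}")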

Under the hood

The I/O engine (VSA) is a virtualized NexentaStor, the flagship of Nexenta's Open Source-driven Software-Defined Storage (OpenSDS) platform. With NexentaStor you can evolve your storage infrastructure, increase flexibility and agility, simplify management and dramatically reduce costs without compromising on availability, reliability or functionality, but that is for another blog post. For now, let's focus on NexentaConnect and how it works. A few weeks ago I did an install for a healthcare environment with 1500 users on 350 concurrent vDesktops. The customer uses an HP cluster with some HP 360 G6 servers and a newer primary cluster of Dell R610 servers with 96GB of memory each. Each local VSA per ESXi host got 32GB of memory, and each ESXi server was expanded with two 600GB 10K SAS disks. With that, the hardware was ready for the install.

So how does this work? The VSA looks for unused disks and creates a vdev from every disk; exactly how depends on the VDI profile you choose and how you configure the storage for availability, capacity and/or performance. On each vdev a VMware VMFS datastore is deployed. The vdevs are pooled into a zpool and fronted by the ARC. The VSA then creates an NFS mount point in VMware and advertises it as the location where the VMware Horizon View pool(s) will reside. In this use case, stateless vDesktops were deployed with linked-clone technology, about 40-50 per ESXi host. From a performance perspective the 10K disks have a latency of around 3ms and combined give 1.1TB of capacity. By combining the two vdevs into a zpool in a software RAID-0 setup you get about 240 raw IOPS from disk. The magic happens when the ARC kicks in automatically: I have seen a 95%+ cache hit ratio, which resulted in more than 4800 IOPS. With 50 desktops per ESXi server that gives you 96 IOPS per vDesktop. Because of the very low latency and massive throughput from memory the vDesktops feel much snappier, which results in a great user experience. And by using inexpensive storage, the cost per user/vDesktop drops considerably. The arithmetic behind those numbers is sketched below.
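
Here is the back-of-the-envelope arithmetic behind those numbers; the per-disk IOPS figure is an assumption (roughly 120 IOPS per 10K SAS spindle), the rest are the values quoted above:

    # Effective IOPS when the ARC absorbs most I/O and only misses hit the disks.
    disks             = 2
    iops_per_disk     = 120            # assumed for a 10K SAS spindle
    raw_disk_iops     = disks * iops_per_disk   # ~240 IOPS from the RAID-0 zpool
    cache_hit_ratio   = 0.95           # observed ARC hit ratio
    desktops_per_host = 50

    # The disks only need to service the misses, so the pool as a whole can sustain:
    effective_iops = raw_disk_iops / (1 - cache_hit_ratio)
    print(f"effective pool IOPS: ~{effective_iops:,.0f}")
    print(f"per vDesktop:        ~{effective_iops / desktops_per_host:,.0f} IOPS")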

Creating a building block

You can, for example, create a building block from a rack server, e.g. a Dell R730 with 384GB of memory, and fill it with 2x HDD for the OS and the local VSA and 2x SSD for local storage to hold 150-175 linked-clone vDesktops. Your failure domain becomes that building block instead of the whole VDI domain. You can also expand the building block with, for instance, NVIDIA GRID/Quadro cards to accelerate graphics, so use cases can include AutoCAD, MicroStation, X-ray imaging, Esri GIS or other applications that are demanding on storage, CPU and graphics. By using building blocks you can easily scale from a few hundred to thousands of vDesktops, and your pilot can be a single building block. From the rack server with 384GB you take approximately 40GB of memory (if the golden image is near that size) to assign to the VSA, so you will see close to a 99% cache hit ratio, reducing latency tremendously and improving the end user experience up to 10 times. Because of the reduced latency and the high-performance storage, CPU cycles are freed up more often and more quickly, so the density per host can go up by 75%; more vDesktops can be held on a single server without a drawback on performance or user experience. This helps the TCO and the VDI business case within your organization. Also think of software such as Trend Micro anti-malware that is priced per CPU: the price per vDesktop will drop. A rough memory-based sizing sketch follows below.
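
As a rough check on those density numbers, here is a memory-only sizing sketch; the hypervisor overhead and per-desktop memory figures are assumptions for illustration:

    # Hypothetical building-block sizing, memory-bound only (CPU and storage ignored).
    host_ram_gb     = 384
    vsa_arc_gb      = 40   # ARC sized close to the golden image for ~99% hits
    hypervisor_gb   = 16   # assumed ESXi and management overhead
    ram_per_vdi_gb  = 2    # assumed average per linked-clone vDesktop

    usable_gb = host_ram_gb - vsa_arc_gb - hypervisor_gb
    print(f"memory-bound density: ~{usable_gb // ram_per_vdi_gb} vDesktops per host")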

NexentaConnect

NexentaConnect is designed to be simple, not only in design and implementation but above all in operation. That simplicity masks a sophisticated and powerful storage solution. It is a non-intrusive install, which makes it easy to do an evaluation in a current production environment. NexentaConnect fully automates VDI deployments and scales the environment to hundreds or thousands of virtual desktops.

With NexentaConnect for VMware Horizon you can:

  • Provision a new Desktop Pool and storage
  • Provision a multi-purpose GlobalVSA storage
  • Reconfigure an existing NexentaConnect Desktop Pool
  • Calibrate an existing Desktop Pool deployed using NexentaConnect to optimize resource utilization
  • Benchmark an existing NexentaConnect Desktop Pool

There are three versions of NexentaConnect:

NexentaConnect for VMware Horizon

NexentaConnect for VMware Horizon features complete GUI-based Wizard-driven automation and powerful I/O acceleration for a full range of storage options.

NexentaConnect for VMware Virtual SAN

NexentaConnect for VMware Virtual SAN adds file services to Virtual SAN environments, providing NFS and SMB access on top of existing Virtual SAN to complete the software hyper-convergence model. Update: link no longer working.

NexentaConnect for Citrix XenDesktop

Integrated with Citrix XenDesktop, NexentaConnect improves end user experience and desktop density, providing a reliable, automated VDI infrastructure through automation and I/O acceleration.

Putting it all together

NexentaConnect gives you higher performance for your desktops. You also end up with a higher density of VMs on your ESXi hosts, because the physical CPUs in the server can process storage requests much faster, freeing up CPU time for the applications running in the vDesktops. And by using local resources in the server you get a lower cost per desktop, which makes the ROI of the VDI project much more appealing and easier to sell internally. If your applications demand even better latency, you can also add SSDs as local storage and bring the latency down to 0.3ms with IOPS well over 200K, and that with local storage. If you combine NexentaConnect for Horizon with NexentaStor, it uses NAS VAAI with ZFS copy-on-write clone files to deploy persistent VM images much faster while saving on storage capacity (about 5.4x faster full-clone deployment and an 8.5x saving on storage capacity). The NAS VAAI integration also benefits from enhanced deduplication, resulting in a dedup ratio of almost 23x and saving tremendous storage capacity. (I will dive deeper into the NexentaConnect for VMware Horizon View components in a different blog post.) A quick capacity illustration follows below.
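
To put those ratios in perspective, here is a quick capacity illustration; the golden-image size is an assumption, the saving factor is the figure quoted above:

    # Rough capacity effect of copy-on-write clones and deduplication.
    desktops      = 350
    image_gb      = 40     # assumed golden-image size
    saving_factor = 8.5    # quoted storage saving vs. plain full clones

    logical_gb  = desktops * image_gb
    physical_gb = logical_gb / saving_factor
    print(f"logical: {logical_gb:,} GB, physical with CoW clones: ~{physical_gb:,.0f} GB")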

So if you want to do VDI right, consider using NexentaConnect, because it can:

Solve the problem of delivering I/O to the hypervisor from inside the hypervisor.

Benefits of VSA-based designs

What are the benefits of a VSA-based design?

  • Inexpensive storage, so the cost per user drops
  • High IOPS per user (30+), so users get a great user experience
  • Burst isolation: bursts are tied to the local ESXi host, so when one vDesktop bursts, vDesktops on other hosts are unaffected
  • Low latency, which gives users a snappy vDesktop and therefore a great user experience
  • Minimized network traffic, which lowers the risk of bottlenecks
  • Uniform IOPS per user at any scale, eliminating declining user experience during ramp-up
  • A scalable write cache, delivering fast writes at any scale
  • A much smaller failure domain (the ESXi host), so one demanding workload does not ruin it for everyone

Tuning

If you look at the applications in use, Microsoft Office is one that pops up very regularly. Do not forget to switch the accelerated graphics setting from hardware to software rendering inside the vDesktop, unless you are using NVIDIA accelerator cards. For some extra tuning of your golden images, see my other blog post: How to improve VMware View video performance.

It will always be a balance between what users are asking for and the money you can spend. By using local resources and tying them to an ESXi server, your failure domain is only the users on that server. And by building this up like Lego, you can add NVIDIA Quadro or GRID cards to increase the user experience even more.

PoC

If you are running a VDI environment, or planning a VDI rollout, and care about speed and/or cost, try out NexentaConnect for free. Do not take my word for it, try it! It is a non-intrusive install. If you need help with a PoC you can always contact Nexenta in your region. To download NexentaConnect follow this link.

Background on Time

Time, and the orders of magnitude it spans in an IT environment, is often confusing. See below for some background on time: orders of magnitude in perspective.

On a scale you will see that:

1 second (s) = 1.000 milliseconds (ms) = 1.000.000 microseconds (µs) = 1.000.000.000 nanoseconds (ns)

Millisecond

1 millisecond (ms) – One thousandth of one second

Typical seek time for a computer HDD is 4-8 ms, compared with an SSD, which averages around 0.03 ms.

Microsecond

1 microsecond (µs) – One millionth of one second

1 µs is the time to execute one machine cycle by an Intel 80186 microprocessor.

Nanosecond

1 nanosecond (ns) – One billionth of one second

1 ns is the time to execute one machine cycle by a 1 GHz microprocessor, while a DDR3-1600 memory module has a cycle time of 5 ns and a CAS latency of 10 ns.