WHITE PAPER

How to End Virtualization Administration Storage Frustration
Provisioning VMs is easy, and then the problems start…

Marc Staimer, President & CDS of Dragon Slayer Consulting
marcstaimer@me.com • 503-579-3763

Introduction

Virtualization has been an incredible IT operations godsend to the vast majority of IT organizations. It has tremendously simplified server implementations, management, operations, upgrades, patches, tech refresh, and most importantly of all, availability. Tasks that used to require scheduled downtime are now conducted online, during production hours, instead of on weekends, holidays, and late nights. Virtualization administrators can create, configure, and provision VMs in a matter of minutes. This makes virtualization admins smile.

It is what follows VM provisioning that makes virtualization admins scream in frustration. First there is the annoying wait for storage provisioning, which can take hours, days, even weeks. Then there is the exasperating, intermittently poor performance that somehow has something to do with storage and rarely matches the pristine performance experienced in the lab. Diagnosing and fixing the problem is maddeningly difficult; the tools rarely say, "Here's the root cause of the problem!" In addition, there is the tedious, manual, labor-intensive performance tuning required on an ongoing basis for virtualization's constantly changing, fluid environment. And finally (as if that weren't enough), virtualization admins have to coordinate their data protection policies and practices with those of the backup admin and storage admin.

This leads many virtualization admins to lament: "Why can't managing performance, storage, and data protection be as easy as creating, configuring, provisioning, and managing VMs?" It is a valid question. The root cause almost always comes down to storage. More specifically, it seems to revolve around storage-based performance barriers. Storage performance barriers are technical issues or processes that prevent virtual machines (VMs) and virtual desktops (VDI) from consistently achieving required response times. These barriers can be something as simple as the high latency inherent in hard disk drives (HDDs) that limits IOPS and throughput, or as convoluted as SAN or LUN oversubscription. Each barrier by itself will reduce virtualization performance. Combinations of these barriers can decimate it.

This document examines in detail the storage problems virtualization administrators must deal with on a day-to-day basis; the typical market workarounds deployed to solve those problems; how those workarounds ultimately break down and fail in one way or another; and finally, a better way to solve these vexing storage problems for the virtualization administrator – the Astute Networks ViSX Flash-based VM Storage Appliances.

Table of Contents

Introduction
Storage Barriers Plaguing Virtualization Performance
    HDD Limitations
        Adding More HDDs to the Storage System
        HDD Short Stroking
        PCIe Flash SSDs as Cache in the Physical Virtualization Servers
        Flash SSDs in the Storage System as Cache, Tier 0 Storage, or as Complete HDD Replacement
        Flash Cache Appliances in the Storage Network
    VM LUN Oversubscription
    Storage Network Configuration and Oversubscription
    Organizational Administrative Silos
    Data Protection, Business Continuity, and Disaster Recovery
There Must Be a Better Way to Eliminate or Mitigate Storage Barriers Plaguing Virtualization Performance
Astute Networks ViSX G4 Flash Virtual Machine Optimized Storage Appliances
    No HDD, 100% Flash SSD Appliances
    Unprecedented IOPS and Throughput
    Lower Than Expected TCO
    Purpose Built for the Virtualization Administrator
    Enterprise Class Reliability
Conclusion
Storage Barriers Plaguing Virtualization Performance

There are five major storage-based virtualization performance barriers:

1. Hard disk drive (HDD) limitations
2. VM LUN oversubscription
3. Storage network configuration/oversubscription
4. Organizational administrative silos
5. Data protection, business continuity, and disaster recovery

HDD Limitations

HDDs are electro-mechanical devices with spinning magnetic platters or disks and a moving read/write head. Performance is directly tied to the density of the platters (the largest being 4TB, but only 900GB for higher speed HDDs); the speed at which they spin (currently maxed out at 15,000 RPM); how fast the head can find a specific piece of data (commonly measured as seek time); how fast the head can write the data; and the delay when a request comes in while the head is already reading or writing. All of these factors add up to the HDD's latency. HDD latencies are high and by definition severely limit performance.
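As a rough illustration of how these mechanical factors cap performance, the back-of-the-envelope sketch below estimates per-drive latency and random IOPS, and how many drives an aggregate workload would need. All drive numbers are assumed, typical figures for a 15K RPM drive, not measurements of any specific product.

    import math

    # Back-of-the-envelope per-drive IOPS from HDD mechanical latencies.
    # All figures are illustrative assumptions for a 15,000 RPM drive.
    avg_seek_ms = 3.5                        # assumed average seek time
    rpm = 15000
    avg_rotational_ms = (60000.0 / rpm) / 2  # half a revolution on average = 2.0 ms
    transfer_ms = 0.1                        # assumed transfer time for a small block

    latency_ms = avg_seek_ms + avg_rotational_ms + transfer_ms
    iops_per_drive = 1000.0 / latency_ms     # roughly 180 random IOPS for these assumptions

    target_iops = 50000                      # assumed aggregate VM workload requirement
    drives_needed = math.ceil(target_iops / iops_per_drive)

    print(f"Per-drive latency: {latency_ms:.1f} ms (~{iops_per_drive:.0f} IOPS)")
    print(f"Drives needed for {target_iops} random IOPS: ~{drives_needed}")

Even with generous assumptions, a single drive delivers only a couple of hundred random IOPS, which is why the workarounds that follow all try to hide or bypass the mechanics.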
One of the biggest contributors to IO latency is the huge, well-known, and rapidly widening performance gap between storage processors and HDDs. Processors have been following Moore's Law for nearly 40 years. This has led to their power, IOPS, and bandwidth improving roughly 100X over a 10-year period (per Intel). HDD improvement over that same time period has been effectively flat. In other words, processors wait an eternity for HDDs to read or write data.

[Fig 1: Processor – HDD Performance Gap, per Intel]

There are several workarounds virtualization admins attempt in order to overcome HDD limitations, including:

• Adding more HDDs to the storage system
• HDD short stroking
• Adding PCIe-connected Flash SSDs as cache in the physical virtualization servers
• Putting Flash SSDs into the storage system as cache, Tier 0 storage, or as a complete HDD replacement
• Flash cache appliances in the storage network

Each of these workarounds manages to ameliorate a portion of the HDD performance issues, but does not necessarily improve the life of the virtualization admin. A look at each one shows why.

Adding More HDDs to the Storage System

Adding more HDDs is typically the first thing a storage administrator will do to try to improve virtualization storage performance, because it is the easiest. There is a noticeable, but ultimately limited, aggregate IOPS increase. However, it takes a large number of drives to even partially close the processor-to-HDD gap. Storage system HDD support is finite, which means that when that limit is reached and performance is still not enough, either more storage systems must be added (causing storage system sprawl) or the storage system must be replaced with a bigger, more expensive system. Both alternatives lead to problematic data migration and/or load balancing. More HDDs additionally mean higher capital expenses (CapEx) and much higher operating expenses (OpEx) in power, cooling, rack, floor space, and software licensing costs.

[Fig 2: Lots of HDDs]

Just because the workaround is easy does not mean it is either effective or efficient; the results are marginal at best.

HDD Short Stroking

HDD short stroking is another regularly utilized workaround storage admins try in order to improve virtualization performance. HDD short stroking reduces latency by restricting placement of data to the outer sectors of the platters, which results in faster seek times. The outer sectors deliver the lowest latencies because the head does not have to move as far when reading and writing data. Industry testing consistently shows HDD short stroking performance improvements ranging from 29 to 33%.

[Fig 3: HDD Short Stroking]

This workaround, too, has non-trivial drawbacks. It starts by throwing away as much as 67 to 90% of the usable HDD capacity. That wasted space still requires power, cooling, and rack/floor space overhead, increasing the OpEx cost per usable TB, in addition to the extra networking infrastructure required for the additional storage systems, because the HDD limit is reached so much sooner. The cost and complexity of HDD short stroking make it an interim solution at best.
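The capacity penalty of short stroking is easy to quantify. The sketch below compares cost per usable TB before and after short stroking; drive size, price, and the usable fraction are illustrative assumptions consistent with the ranges cited above.

    # Illustrative short-stroking trade-off; all inputs are assumptions.
    drive_tb = 0.9            # 900 GB 15K RPM drive
    drive_cost = 400.0        # assumed street price per drive
    usable_fraction = 0.20    # keep only the outer ~20% of each platter
    latency_gain = 0.30       # ~30% improvement, per the industry testing cited above

    full_cost_per_tb = drive_cost / drive_tb
    short_cost_per_tb = drive_cost / (drive_tb * usable_fraction)

    print(f"Cost per usable TB, full drive:     ${full_cost_per_tb:,.0f}")
    print(f"Cost per usable TB, short-stroked:  ${short_cost_per_tb:,.0f}")
    print(f"Capacity thrown away per drive:     {(1 - usable_fraction):.0%}")
    print(f"Latency improvement bought with it: ~{latency_gain:.0%}")

Under these assumptions, a roughly 30% latency gain is paid for with a roughly 5X increase in cost per usable TB, before the extra power, cooling, and floor space are counted.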
PCIe Flash SSDs as Cache in the Physical Virtualization Servers

[Fig 4: PCIe]

PCIe Flash SSDs in application servers, paired with caching software that moves the Flash SSD data to SAN storage, have become quite fashionable. They are most common in high performance computing (HPC) clusters. Having Flash SSDs in the physical server on the PCIe bus puts them very close to the application. This provides the lowest round-trip latency between the application and its very high performance storage. This approach would appear to solve several of the virtualization administrator's problems with control and performance. Regrettably, it does not solve all of them, and it creates several more:

• It places a heavy burden on the server's CPU cycle utilization, ranging as high as 20%. Virtualization oversubscribes physical server hardware to enable more virtual machines (VMs) or virtual desktops (VDs) to utilize the hardware effectively. Taking a big chunk of those resources away dramatically reduces that consolidation capability.
• The caching software increases the CPU burden and further reduces each server's ability to consolidate. That caching software is essential for hosted VM guests because it must keep the VM image synchronized with the shared storage sitting across the storage network; otherwise much of the virtual server's advanced functionality, including VM movement between machines, ceases to work properly.
• The caching software is not inexpensive and is licensed per server.
• The PCIe Flash SSD hardware and caching software do not help the virtualization administrator at all with the management, provisioning, or data protection of the shared external storage. It still requires a knowledgeable storage and SAN administrator.
• This workaround is an expensive, non-shared solution that only benefits the VMs or VDs residing on that particular physical server. Other virtualized servers and desktops are out of luck unless they too purchase and install the same hardware and software. Each implementation of the PCIe Flash SSD and its accompanying caching software must be implemented, configured, and managed separately on an ongoing basis.
• Local caching does not alleviate storage network requirements or issues.
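The benefit of any flash cache, server-side or otherwise, is bounded by its hit rate, and hit rates fall as data sets outgrow the cache. The minimal sketch below shows how effective read latency degrades as misses increase; the latency values are assumed for illustration, not measurements of any product.

    # Effective read latency behind a flash cache, as a function of hit rate.
    # Latency figures are illustrative assumptions, not measured values.
    flash_hit_us = 100.0    # assumed read served from local PCIe flash (microseconds)
    hdd_miss_us = 6000.0    # assumed miss: SAN round trip plus HDD seek (microseconds)

    def effective_latency_us(hit_rate):
        """Weighted-average read latency for a given cache hit rate."""
        return hit_rate * flash_hit_us + (1.0 - hit_rate) * hdd_miss_us

    for hit_rate in (0.99, 0.95, 0.90, 0.75):
        print(f"hit rate {hit_rate:.0%}: ~{effective_latency_us(hit_rate):,.0f} us effective read latency")

A drop from a 99% to a 90% hit rate multiplies effective read latency several times over, which is why capacity-constrained caches lose their value as data sets grow.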
Flash SSDs in the Storage System as Cache, Tier 0 Storage, or as Complete HDD Replacement

[Fig 5: Flash SSD]

Putting Flash SSDs into the storage system will greatly speed up read IOPS. The Tier 0 storage and complete HDD replacement approaches will also increase write performance. This approach should definitely solve HDD limitations for virtualization; however, once again there are significant downsides.

When using Flash SSDs as cache or Tier 0 (a.k.a. the hybrid storage approach), many storage systems have severe limits on the number of SSDs that can be supported. This limits cache or Tier 0 size. A smaller cache means that as data sets continue to grow, more cache misses are redirected to HDDs, with subsequently much lower performance. A smaller Tier 0 means reduced capability to support larger data sets, resulting in more active data residing on much lower performance Tier 1 or Tier 2 storage (HDDs). And the Tier 0 approach typically requires quite costly storage tiering software.

[Fig 6: Storage System w/several SSDs]

Storage systems with 100% Flash SSDs would seem to alleviate the problems of caching and tiering. Yet they too have issues and problems for the virtualization administrator. These storage systems have a significantly higher upfront CapEx (and TCO) than hybrid systems, and they still require a knowledgeable storage administrator. These systems will not be under the control of the virtualization administrator and will do nothing to alleviate the storage and storage networking issues of setup, provisioning, management, data protection, and operations. Furthermore, these systems tend to move the performance bottleneck to the storage system controller or the storage network.

[Fig 7: Storage System w/100% SSDs]

Flash Cache Appliances in the Storage Network

The Flash cache appliance is similar to putting the Flash cache in the storage system, except that it sits between the virtualization initiators and the storage system target. This conceptually allows the Flash cache appliance to provide caching for multiple storage systems. The reality is a bit different. Most Flash cache appliances are capacity constrained. As data sets continue to grow, so do cache misses, causing redirects to the backend storage systems and HDDs and reducing the Flash cache appliance's effectiveness. By sitting between the virtualization initiators and the target storage systems, it introduces another storage management layer, another variable in troubleshooting, and another system to tech refresh. Virtualization admins are seeking simpler control over storage, not more complexity, especially when troubleshooting performance issues.

[Fig 8: Flash Cache Appliance]

VM LUN Oversubscription

One of the bigger complaints from virtualization administrators comes when they move a virtualization environment from the lab to production. Far too often they see a noticeable drop-off in performance. What is even worse is that the performance drop is intermittent and unpredictably mystifying. Troubleshooting is an exercise in frustration. The issue is frequently attributed to VM LUN oversubscription.

VM LUN oversubscription is an indirect consequence of the virtualization hypervisor virtualizing storage LUNs. That virtualization enables the virtualization administrator to slice and dice a physical LUN into multiple virtual LUNs, giving each VM what it perceives as its own LUN. The storage system does not see or understand that multiple different virtual machines are accessing that LUN. It sees a data stream coming from the same physical server. It does not parse that data stream. It cannot provide higher levels of service or priority to one VM over another when accessing that LUN. It cannot even see the different VMs. This means multiple VMs can be attempting to hit the same HDDs at the same time, creating contention on the drives.

LUN IO is handled on a first-in, first-out (FIFO) basis. VM read/write IO requests are put in a queue. There are HDD queues and system queues. The HDD drive type (SAS/FC or SATA) determines the number of queued IO requests: SATA drives buffer at most 32 queue entries, whereas SAS/FC drives buffer 256. When the queue buffers are saturated, the VM's SCSI protocol times out and bad things happen (crashed VMs).

VMware has a quality of service (QoS) feature workaround that allows the VM administrator to prioritize different VMs based on IOPS or latency. Unfortunately, it robs Peter to pay Paul: it takes performance away from the lower prioritized VMs and gives it to the higher ones. It has limited usefulness because it only treats the symptoms of the problem and not the root cause.

[Fig 9: LUN Oversubscription]
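The queue arithmetic described above is easy to sketch. The snippet below shows how quickly a shared LUN's device queue saturates as VMs are stacked onto it; the per-VM outstanding IO count is an assumed illustrative value.

    # When many VMs share one physical LUN, their outstanding IOs funnel into a
    # single FIFO device queue. Queue depth and per-VM IO counts are assumptions.
    lun_queue_depth = 32         # e.g., a SATA-backed LUN; SAS/FC is commonly 256
    outstanding_io_per_vm = 4    # assumed average in-flight IOs per busy VM

    for vms_on_lun in (4, 8, 16, 32):
        in_flight = vms_on_lun * outstanding_io_per_vm
        status = "ok" if in_flight <= lun_queue_depth else "saturated -> IOs wait, latency spikes"
        print(f"{vms_on_lun:2d} VMs x {outstanding_io_per_vm} IOs = {in_flight:3d} in flight "
              f"(device queue depth {lun_queue_depth}): {status}")

With these assumptions, as few as a dozen busy VMs on one SATA-backed LUN is enough to saturate the queue, which is exactly the intermittent, hard-to-diagnose behavior described above.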
Another workaround is to limit the number of VMs that can access a LUN. This is a manual workaround that must be set up at the beginning. Obviously, if each VM has its own LUN there is no contention. The third workaround is to implement a storage solution that utilizes Flash SSDs, as previously discussed. All of these workarounds require cooperation with the storage admin and the SAN admin, as well as the facilities admin. Provisioning and changes must be scheduled in advance, leaving the virtualization admin frustrated with partial solutions.

Storage Network Configuration and Oversubscription

Storage area networks (SANs) are architected for oversubscription. SANs allow multiple physical servers to access the same target storage ports. This enabled more servers to utilize the same storage resources long before server virtualization became fashionable, and SAN oversubscription is today a common practice. But when virtual servers and desktops are added to an oversubscribed SAN, the SAN quickly becomes a performance bottleneck if it is not adjusted for the virtualization oversubscription. For example: a typical SAN oversubscription rate is 8:1, or 8 physical server initiator ports to 1 storage target port. If the average number of VMs on a virtual server is 10, a 10:1 ratio, then the total VM-to-target-storage-port ratio is 80:1. Obviously, an 80:1 oversubscription ratio is going to have performance problems.

Avoiding the SAN bottleneck requires coordination between the virtualization administrator, SAN administrator, storage administrator, and facilities administrator, because each discipline is its own administrative silo. The scheduled planning, coordination, and time (mostly the time) are the primary reasons virtualization administrators are so frustrated with storage.

[Fig 10: SAN Bottleneck]
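The arithmetic behind that 80:1 figure is worth revisiting whenever virtualized hosts are added to an existing fabric. A minimal sketch, using the same illustrative ratios as the example above:

    # Effective VM-to-storage-port fan-in on a shared SAN.
    # The ratios are the same illustrative ones used in the example above.
    initiators_per_target_port = 8   # physical server initiator ports per storage target port
    vms_per_host = 10                # average VM consolidation ratio per physical server

    effective_ratio = initiators_per_target_port * vms_per_host
    print(f"Physical fan-in:                   {initiators_per_target_port}:1")
    print(f"VMs per physical host:             {vms_per_host}:1")
    print(f"Effective VM-to-target-port ratio: {effective_ratio}:1")

The point is simply that the two ratios multiply: a fabric that was comfortably oversubscribed for physical servers can be badly oversubscribed once each of those servers hosts ten workloads.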
Organizational Administrative Silos

Most IT organizations have administrative silos for virtualization, applications, networks, storage, storage networks, facilities, etc., because the knowledge and experience requirements for each one are vast. Those that do not have such silos expect their admins to have skills they often do not have. As previously discussed, virtualization admins get exasperated by the constant planning, coordination, and time required to work with the other admins. But they quickly become discouraged even when they are given control of everything, including the storage and storage networking, because they lack the knowledge, skills, and experience required. It is very disheartening and begins to feel like a "no-win" scenario.

[Fig 11: Wasted Time]

The workaround most often deployed by virtualization administrators is NAS. VMware has NFS (Network File System) support built into the vSphere kernel. Microsoft Hyper-V has CIFS (Common Internet File System) built into its kernel. NAS allows the virtualization admin to forget about LUNs, SANs, pathing, oversubscription, etc., because it is file-based storage and does not need any of those things. Just set up the file store for each virtual machine or desktop, mount it, and it is done. The trade-off to this common, easy workaround is a significant performance loss. Latency is an order of magnitude higher. NFS or CIFS metadata can be as much as 90% of the NAS IOPS, causing serious CPU bottlenecks for application data. Virtualization admins love the simplicity and hate the performance of this workaround.

[Fig 12: File Storage (NAS)]

Data Protection, Business Continuity, and Disaster Recovery

Data protection, business continuity, and disaster recovery (DR) are essential to the vast majority of IT organizations, especially today in an increasingly regulatory climate. Yet these disciplines tend to be highly uncoordinated, with large amounts of duplicated functionality and effort. Virtual servers and desktops are snapped and replicated as well as backed up. Storage systems are snapped and replicated. Databases are mirrored (or continuously protected on every write) synchronously and/or asynchronously. Different administrators handle all of these data protection functions with little or no knowledge of what the other admins have done. This can, and often does, create chaos during recoveries and business continuity events. Data is recovered multiple times, wasting valuable time on far too much duplicated effort. The result is unacceptably long times to get back up and running. It is analogous to wearing 2 sets of underwear, pants, belt, suspenders, and coveralls, while having dysentery. Not a pretty picture.

[Fig 13: Different Admins & Tools]

One workaround has been to use a 3rd-party data protection software management catalogue consolidation, providing overall reporting on most of the data protection systems in place. This helps deliver the information required to minimize duplication of effort during recoveries, but it does nothing to minimize the duplication of effort in the protection itself. And it adds yet another layer of software requiring time-consuming management.

There Must Be a Better Way to Eliminate or Mitigate Storage Barriers Plaguing Virtualization Performance

The key to solving these problems is the storage. An effective solution must offer the virtualization administrator:

• Storage control without requiring any storage expertise;
• The consistent virtual machine and virtual desktop performance required, without high cost or the need for professional services;
• Intuitive, hypervisor-integrated operations and management;
• Eradication of knotty performance problems such as LUN or SAN oversubscription, file metadata IO latencies, and iSCSI processing latencies;
• Elimination, or at minimum mitigation, of data protection/business continuity/DR duplication;
• Cost effectiveness.

Astute Networks recognized these problems and is attacking them with a series of virtualization-optimized storage appliances specifically architected to do just this.

Astute Networks ViSX G4 Flash Virtual Machine Optimized Storage Appliances

[Fig 14: ViSX G4]

Astute Networks purpose built the ViSX G4 storage appliances from the ground up to specifically address each and every one of the storage barriers inhibiting virtualization and mission critical application performance. It started with a clean sheet of paper and looked at the problems with a fresh set of eyes. Solving only one aspect of the problems simply shifts the bottleneck elsewhere, leaving both the users and the admins frustrated. The problems must be solved holistically, as a package, and that is what the ViSX does.

No HDD, 100% Flash SSD Appliances

First, it eliminates all HDDs (i.e., there are no spinning disks). Each ViSX G4 is 100% Flash SSD (no LUN oversubscription issues). The Flash itself is high performance eMLC, which offers performance and write-cycle life similar to SLC Flash but at a cost much closer to MLC.
The ViSX G4 then combines those fast eMLC Flash SSDs with its patented, unique DataPump™ Engine (ASIC) that offloads and accelerates TCP/IP and iSCSI protocol processing. Utilizing 1G and 10G Ethernet iSCSI eliminates most SAN issues, as well as oversubscription. But it is the architecture and performance of the DataPump Engine that eliminates most of the iSCSI performance issues.

Unprecedented IOPS and Throughput

The DataPump Engine eliminates both network and storage IO bottlenecks. The DataPump Engine ASIC eliminates the iSCSI network IO bottleneck by processing TCP/IP and iSCSI packets, as well as commands, much faster than can be accomplished with software stacks running on commodity processors. The result is unequalled, extremely low round-trip network latencies on 1G or 10G standard Ethernet. That offloading, plus a highly optimized software suite that marshals data to a high performance RAID controller, has the additional benefit of allowing the CPU to focus on serving up storage for reads and writes. This combination maximizes sustainable Flash IOPS and throughput performance and enables unprecedented IOPS per dollar. It is Astute's holistic design that permits the ViSX G4 to outperform conventional Ethernet-deployed Flash or hybrid storage systems by as much as 5 to 10X.

[Fig 15: DataPump Engine]

Lower Than Expected TCO

Cost always seems to be a factor with 100% SSD-based storage systems. Astute makes SSD cost a non-issue with ViSX G4 appliances by utilizing both eMLC and a very advanced primary data deduplication algorithm. That algorithm delivers very high deduplication rates that dramatically increase effective storage capacity and, unlike nearly all other storage solutions (Flash, hybrid, or disk-based), has zero impact on performance. These factors put the ViSX G4 appliances' upfront CapEx on a par with hybrid or HDD-based storage systems. But on a total-cost-of-ownership basis, the ViSX G4 is typically considerably less expensive because of the very low amounts of power and cooling required, as well as minimal storage software costs.
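A rough sense of why deduplication changes the cost picture can be sketched with a few lines of arithmetic. The raw capacity, price, and deduplication ratios below are illustrative assumptions, not Astute Networks specifications or pricing.

    # Effective cost per TB of an all-flash appliance with inline deduplication.
    # Raw capacity, price, and dedup ratios are illustrative assumptions only.
    raw_flash_tb = 10.0
    appliance_price = 60000.0

    for dedup_ratio in (1.0, 2.0, 4.0, 6.0):
        effective_tb = raw_flash_tb * dedup_ratio
        cost_per_tb = appliance_price / effective_tb
        print(f"dedup {dedup_ratio:.0f}:1 -> {effective_tb:4.0f} TB effective capacity, "
              f"${cost_per_tb:,.0f} per effective TB")

Because virtual machine images tend to contain large amounts of duplicate data, even modest deduplication ratios can bring the effective cost per TB of flash into the range of hybrid or HDD-based systems.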
Which leads to that other virtualization administrator frustration: control.

Purpose Built for the Virtualization Administrator

The ViSX VM storage appliances are designed primarily for virtualization administrators. Right out of the box, each comes with pre-configured LUNs and RAID, so no expertise is required to provision storage for virtual machines or desktops. However, if the virtualization administrator has some storage system knowledge and wants to make changes, they can. Each ViSX appliance is intuitive and manageable through a VMware vCenter plug-in for vSphere environments, or through its FlashWRX™ GUI for other hypervisor platforms such as Microsoft Hyper-V, Citrix XenServer, and RHEV, making it feel like a virtualization feature.

Astute then did something quite smart by not duplicating the data protection, business continuity, and DR capabilities within the hypervisors. Each ViSX appliance takes advantage of the virtualization software's snapshots, backup, replication, HA, data migration, and DR. No duplication of effort in protecting or recovering data, no wasted cost on duplicated software, no wasted time recovering data. And the ViSX is not a "rip-out-and-replace" storage solution. It is complementary to the current SAN and NAS storage ecosystem.

Enterprise Class Reliability

It all sounds great, but in the end this is still storage. All storage systems, just like physicians, must follow the mantra: "first, do no harm." What are the system reliability, data durability, data resilience, and availability assurances? The ViSX G4 appliances are the only virtualization-optimized storage with four levels of reliability built into their DNA:

1. On-chip Flash ECC – Production-hardened NAND (Flash) chip-based error detection and correction.
2. eMLC – Enterprise-grade multi-level cell flash devices combine SLC-like reliability (extending the life of flash-based modules to 10 years or more) and SLC-like high performance with the low cost of MLC.
3. SSD RAID Levels – Multiple RAID choices, including 0, 1, 10, 5, and 6, are supported, with extremely fast rebuild times due to the solid state architecture.
4. Flash Module Chip Redundancy – ViSX goes beyond traditional ECC data protection by using redundant flash chips to improve both overall reliability and write performance by a factor of 100 over other devices.

The result is an enterprise class storage appliance, purpose built for the virtualization administrator, that eliminates the common virtualization and mission critical application performance barriers. And it does all of this without the sticker shock.

[Fig 16: ViSX G4]

Conclusion

There are many storage performance barriers facing virtualization admins. Some are architectural. Some are organizational. Some are absolutely perplexing. All are extraordinarily aggravating. Astute Networks, with its family of ViSX appliances, removes those barriers at an incredible IOPS/$. For more detailed information, please contact info@astutenetworks.com or go to http://www.astutenetworks.com.

About the author: Marc Staimer is the founder, senior analyst, and CDS of Dragon Slayer Consulting in Beaverton, OR. The consulting practice of nearly 14 years has focused in the areas of strategic planning, product development, and market development. With over 32 years of experience in infrastructure, storage, server, software, and virtualization, he is considered one of the industry's leading experts. Marc can be reached at marcstaimer@me.net.

Dragon Slayer Consulting © 2012 All Rights Reserved • Q3 2012