Quoting Processor.com
Server virtualization is one of the hottest trends in the IT industry, with the technology maturing and the market exploding. According to a recent IDC survey, the worldwide virtual machine software market grew 67% last year, on top of 64% growth in 2004, and the total virtualization market (hardware and software) is estimated to exceed $15 billion by 2009. IDC sees server consolidation as the primary driver fueling this growth, a conclusion that Greg Schulz, founder and principal of The StorageIO Group (www.storageio.com), validates from his own consulting work.
Software advances have undoubtedly made the task of deploying virtual machines running various operating systems and configurations much easier; however, processing power is but one system resource IT architects must consider when designing an application infrastructure. While virtualization and high-density servers have led to an abundance of CPU power, I/O--getting data in and out of the server cluster--is becoming a critical bottleneck. We will examine the potential I/O crisis wrought by server virtualization and look at some solutions--ways to ensure a virtual cluster maintains balanced performance.
I/O Performance Bottlenecks & Effects
Processor performance has been on an exponential growth path for decades; however, external interface speeds haven’t accelerated at nearly the same rate. According to Schulz, network and disk are the two critical I/O paths limiting system performance. He says, “You need to be cognizant of the whole environment” when designing a virtual cluster, noting that “when you concentrate servers, think about where you have created another bottleneck.”
The key parameters to watch when characterizing I/O performance are throughput (or bandwidth) and latency. Applications that handle few transactions but move large amounts of data (for example, streaming media servers) will manifest throughput bottlenecks most readily, while tasks that involve many transactions with small files (for example, email or file servers) will expose latency constraints.
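The distinction can be made concrete with a rough back-of-the-envelope model. The sketch below is illustrative only: it assumes each transaction pays a fixed latency and that data then moves at a constant bandwidth. The function names, the 0.5 ms latency, the nominal 1GbE rate of 125 MB/s, and both workload sizes are assumptions chosen for the example, not figures from the article.

```python
def io_costs(transactions, bytes_per_transaction, latency_s, bandwidth_bytes_per_s):
    """Split total I/O time into a latency part and a throughput part.

    Simplified model (an assumption): every transaction pays a fixed
    latency, and all data moves at a constant bandwidth.
    """
    latency_cost = transactions * latency_s
    throughput_cost = transactions * bytes_per_transaction / bandwidth_bytes_per_s
    return latency_cost, throughput_cost


def dominant_cost(transactions, bytes_per_transaction, latency_s, bandwidth_bytes_per_s):
    """Return which of the two costs is larger for the given workload."""
    latency_cost, throughput_cost = io_costs(
        transactions, bytes_per_transaction, latency_s, bandwidth_bytes_per_s
    )
    return "latency" if latency_cost > throughput_cost else "throughput"


# Hypothetical link: 0.5 ms per transaction, nominal 1GbE (125 MB/s).
LATENCY_S = 0.0005
BANDWIDTH = 125_000_000

# Streaming-style workload: a few very large transfers.
streaming = dominant_cost(10, 1_000_000_000, LATENCY_S, BANDWIDTH)

# Email-style workload: a great many small transfers.
email = dominant_cost(1_000_000, 4096, LATENCY_S, BANDWIDTH)

print("streaming workload is bound by:", streaming)
print("email workload is bound by:", email)
```

Under these assumed numbers the streaming workload spends almost all of its time moving bytes, while the small-file workload spends most of its time waiting on per-transaction latency. This is why adding bandwidth helps the first case far more than the second.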
Potential Solutions
According to Schulz, the common approaches to I/O performance issues have been either to do nothing or to overconfigure “by throwing more hardware and software at the problem.” Given the rapid march of information technology, such a brute force approach is easy and often effective. Storage I/O is always improving as vendors add more RAM cache on both disk drives and RAID controllers and adopt new interface standards that feature higher bandwidth. On the network side, host interface speeds have increased tenfold: 1GbE is now de rigueur on all servers, and 10GbE interfaces are regularly used to trunk between edge switches and the network core.
While over-provisioning hardware is obviously a simple solution to I/O woes on a virtual server cluster, it’s not always adequate or efficient. Intelligent caching appliances, such as SAN accelerators from Exavio (www.exavio.com) or Gear6 (www.gear6.com), can improve access to shared disk arrays, while WAN accelerators from HP (OEM from Riverbed; www.hp.com) and Juniper Networks (www.juniper.net) can offload external network access from high-traffic applications.
A completely new approach to the I/O problems of virtual server clusters is offered by Fabric7 Systems (www.fabric7.com), a new company that founder and CEO Sharad Mehrotra says is borrowing architectural concepts from the mainframe and supercomputer world. Fabric7 has developed a new class of server platform that combines elements of a high-density server farm with those of a powerful network switch. Its premier product, the Q160 server, links multiple processors with a high-speed switching fabric that allows complete virtualization of both the processing and I/O layers in a server cluster. I/O between processors flows through a dedicated switching fabric, while external LAN and SAN bandwidth can be partitioned among the various processor/memory complexes. While the Q160 resembles a large blade chassis, according to Mehrotra, the similarity is only skin-deep. “Blades are physical form-factor re-engineering,” he says, whereas the Fabric7 server has been designed from the ground up as a hybrid processor complex and custom switch.
Cisco (www.cisco.com) promotes an alternative technology, acquired from Topspin Communications, for virtualizing all server I/O that works with existing systems. In its scenario, servers are interconnected via a high-bandwidth, low-latency InfiniBand fabric controlled by a Server Fabric Switch. The switch acts as a gateway, relaying all SAN and LAN traffic to external storage pools or networks running their native Fibre Channel or Ethernet protocols. This topology has the advantage of amalgamating all server I/O onto a single interface and thus centralizing management of I/O resources using the server fabric switch as a control point.
Other vendors are banking on the eventual migration of 10GbE interfaces to the server as a means of combining all external I/O onto a single interface that can then be sliced and diced among various I/O paths using intelligent Ethernet switches.
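To illustrate what slicing a single converged interface among I/O paths might look like, here is a minimal sketch of proportional bandwidth partitioning. It is not any vendor's actual mechanism: the function name, the traffic classes ("san", "lan", "mgmt"), and the weights are hypothetical, and real switches apply far more elaborate policies.

```python
def partition_bandwidth(total_gbps, weights):
    """Divide a link's bandwidth among I/O paths in proportion to weights.

    `weights` maps a (hypothetical) traffic class name to a positive
    relative weight. Returns a dict of class name -> allocated Gb/s.
    """
    if not weights:
        raise ValueError("at least one traffic class is required")
    if any(w <= 0 for w in weights.values()):
        raise ValueError("weights must be positive")
    total_weight = sum(weights.values())
    return {name: total_gbps * w / total_weight for name, w in weights.items()}


# Assumed example: one 10GbE interface shared by storage, LAN, and
# management traffic in a 5:3:2 ratio.
shares = partition_bandwidth(10.0, {"san": 5, "lan": 3, "mgmt": 2})

for name, gbps in shares.items():
    print(f"{name}: {gbps} Gb/s")
```

The point of the sketch is that the allocation becomes a software setting on one shared pipe, so it can be changed without recabling or adding dedicated adapters for each traffic type.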
Virtual servers provide IT departments a powerful tool for consolidating and efficiently utilizing compute power; however, this very concentration leads to bottlenecks in moving data in and out of the cluster. Although over-provisioning I/O paths and bandwidth can be an effective short-term solution, it can negate some of the efficiency improvements gained through virtualization. A more comprehensive solution involves virtualizing the I/O resources themselves through the creation of a switched fabric hosting all network and storage traffic. Whether using fully integrated solutions, such as Fabric7’s server, or an external device, such as Cisco’s server switch, systems designers have some powerful new tools for provisioning and scaling I/O in today’s increasingly dense server farms.
Read the original, here.