Virtualization Technology News and Information
The Road to Cluster Virtualization

Quoting HPC Wire

As cluster use in enterprises grows, the need for commercial grade high performance computing that scales on-demand to adapt to ever-changing workload requirements and provide optimal system utilization is also growing. These needs in turn have driven many useful innovations. However, there has remained a fundamental assumption that a cluster or grid configuration is provisioned as a static, disk-based, full operating system installation on every single server.

This assumption leads to extensive scripting and middleware in an attempt to abstract out the complexity of managing hundreds or thousands of servers. In reality, this outdated approach only masks the complexity, without removing the underlying problem, and magnifies the operating costs of managing and maintaining large pools of servers. Rethinking these fundamental concepts can yield surprising results that can eliminate the very complexities many software "solutions" strive to merely camouflage. The key is Cluster Virtualization.

What is Cluster Virtualization? "Virtualization" is a heavily used term these days, but the most common understanding of it is "to abstract the complexities of many -- presenting the simplicity of one". The result of Cluster Virtualization is the vastly simplified deployment and management of large pools of servers, accomplished by making very large groups of servers appear and act like a single system, as easy to manage as a single workstation. The financial and efficiency benefits of this approach are extremely compelling, making Cluster Virtualization one of the most practical and cost-effective methodologies for reducing the complexity, cost and overall administrative burden of large scale computing, and enabling you to get the most out of your computing resources.

Today most clusters are based on the Beowulf design developed by Thomas Sterling and Donald Becker, chief technology officer at Penguin Computing, while the two were at NASA. A Beowulf cluster is a group of usually identical commercial off-the-shelf (COTS) computers running Linux and other open source software, which together create a straightforward, scalable platform at one tenth to one third the capital cost of a traditional supercomputer.

What Becker realized, and what led to the development of Cluster Virtualization architectures such as Scyld ClusterWare, was that while the original approach was straightforward and cost effective on the capital side, the complexity and operational costs grew in direct proportion to the size of the cluster. He found that by re-architecting the foundation of cluster software based on three basic principles, the complexity and thus the cost could be dramatically reduced. Those principles are:

  • Employing "stateless" (disk-less) provisioning.
  • Provisioning a lightweight compute operating environment.
  • Employing a single virtualized process space for the entire cluster.

Leveraging these architectural concepts has a tremendous ripple effect on rapid provisioning, manageability, scalability, security and reliability within the cluster. The result is an elegantly simple and powerful new paradigm for clustered computing, eliminating multiple levels of cost and support, while dramatically increasing efficiency and reducing operating costs to deliver a dependable HPC service to your organization.


Making Virtualization Real

The whole point of this architecture is to make large pools of servers act and feel as if they were a single, consistent, virtual system. For example, Scyld ClusterWare employs a powerful technique, built upon the standard, out-of-the-box enterprise Linux distributions, to create "single system image" behavior with the Linux you already know. It does this by extending the Linux configuration on the Master node to have a single unified process space. From both the administrator and the user point of view, a 100-node cluster with 400 processors appears very much like a 400-processor SMP machine at the cost of commodity Linux x86 cluster computing.

The compute servers are fully transparent and directly accessible if need be, but the entire compute capacity is presented at the single Master node. Consider the example of the everyday task of issuing the ubiquitous "ps" process list command. What you get back is a listing of all processes running on all machines as if it were just one machine. You can still tell which processes are running where in the cluster if needed. Other standard Linux commands work in the same intuitive way as on a single machine.
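As a rough illustration of the "single process space" idea described above, the following Python sketch merges per-node process tables into one listing that still records where each process runs. It is a toy model only and not how Scyld ClusterWare is implemented: the `Proc` type, the function names, the sample data, and the convention that node -1 stands for the Master are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Proc:
    pid: int       # cluster-wide process ID (assumed unique across nodes)
    node: int      # compute node number; -1 is used here for the Master (assumption)
    command: str


def unified_process_list(per_node):
    """Merge {node: [(pid, command), ...]} into one list sorted by PID."""
    merged = []
    for node, procs in per_node.items():
        for pid, command in procs:
            merged.append(Proc(pid, node, command))
    return sorted(merged, key=lambda p: p.pid)


def format_listing(procs):
    """Render the merged list as a ps-like table with a NODE column."""
    lines = [f"{'PID':>6} {'NODE':>4} COMMAND"]
    for p in procs:
        lines.append(f"{p.pid:>6} {p.node:>4} {p.command}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical process tables for a Master and two compute nodes.
    tables = {
        -1: [(1201, "bash")],
        0: [(1310, "solver"), (1311, "solver")],
        1: [(1312, "solver")],
    }
    print(format_listing(unified_process_list(tables)))
```

The point of the sketch is the shape of the result: one flat listing, as on a single machine, with the node retained as an extra column for the cases where location matters.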

You add users and set up passwords only on the Master. You submit jobs only on the Master and simply tell it how many processors you need (even for non-MPI jobs). You terminate jobs only on the Master, and they are automatically cleaned up on the compute nodes. Of course, you can run jobs or general commands on specific nodes if you need to. If you need to see the vital statistics of load, memory usage, disk usage, etc. on any or all nodes, a single command line or GUI invocation on the Master node gives them to you.
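To make "tell it how many processors you need" concrete, here is a minimal sketch of how a master could turn a processor count into a per-node placement. The article does not describe the actual scheduling policy, so the greedy "most free processors first" rule, the `allocate` function, and the sample numbers are all hypothetical.

```python
def allocate(requested, free_by_node):
    """Return {node: processors_to_use} covering `requested` processors.

    Greedy policy (an assumption for this sketch): fill the nodes with the
    most free processors first, breaking ties by lower node number.
    """
    if requested <= 0:
        raise ValueError("requested must be positive")
    if requested > sum(free_by_node.values()):
        raise RuntimeError("not enough free processors in the cluster")

    plan = {}
    remaining = requested
    ordered = sorted(free_by_node.items(), key=lambda kv: (-kv[1], kv[0]))
    for node, free in ordered:
        if remaining == 0:
            break
        take = min(free, remaining)
        if take > 0:
            plan[node] = take
            remaining -= take
    return plan


if __name__ == "__main__":
    # Hypothetical cluster: node number -> currently free processors.
    free = {0: 4, 1: 4, 2: 2}
    print(allocate(6, free))
```

The user supplies only the count; which nodes end up running the work is decided on the Master, which is the behavior the paragraph above describes.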

With a virtualized cluster, you focus on the work throughput you need to achieve, not the fact that you have a cluster that requires massive scripting to iterate commands over each and every node.

The Cluster Virtualization Advantage

We began this article by noting that the top challenges facing any enterprise managing a guaranteed high performance computing service are finding the right solutions for deploying and managing these resources, scaling on demand to ever-changing workload requirements, and achieving the highest utilization levels matched to business priorities.

Built upon industry standard Linux distributions, Cluster Virtualization extends the operating system platform to deliver an elegantly simple and powerful new paradigm of clustered computing. This new paradigm eliminates the need for multiple levels of cost and support and delivers everything needed for users and administrators to be productive immediately, running HPC applications out of the box. It also dramatically increases efficiency and reduces operating costs while delivering a dependable HPC service to your organization, thereby maximizing the return on investment for Linux clustering in your highly competitive business environment.

Read the entire original article, here.

Published Friday, September 08, 2006 6:42 AM by David Marshall