The recently held Virtualization Technology in Distributed Computing workshop provided a good illustration of the intensity of interest and the diversity of the communities involved. We were surprised - and thrilled - to see how popular the workshop was, especially since we gave very little lead time and many people simply did not find out about it until late. We received excellent submissions from all over the world, and participation exceeded our expectations: at the end of the day -- the last day of SC06 -- after the place had practically shut down, we still had ~50 die-hards arguing, shaking hands, and exchanging cards. Another striking feature was the breadth of the submissions, which ranged from fine-grained analysis of virtual machine management tools relevant to distributed computing all the way to virtual Grids. Electronic proceedings are available online at the workshop site.
The workshop drew strong research submissions from both industry and academia, as well as practical adoption stories from both communities. Among the highlights was an interesting paper from Intel dissecting the performance of Xen networking. A wonderful adoption scenario was represented in the work from the University of Marburg, where the suspend/resume properties of VMs are being used to improve backfill strategies in the local scheduler: computations running in VMs are simply suspended when a large parallel job is scheduled to run and resumed afterwards. The remarkable part of this work is that it was very much requirement-driven and has been voted into production by users. Another interesting talk came from the Australian Partnership for Advanced Computing (APAC), whose speakers described their experiences running virtual machines in production Grids for a couple of years now.
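The Marburg idea can be sketched in a few lines. This is a minimal illustration, not the actual scheduler code: the class and function names are hypothetical, and a real implementation would call the hypervisor's save/restore API rather than flip a state flag.

```python
# Sketch of suspend/resume backfill (hypothetical names): serial jobs run
# inside VMs; when a large parallel job needs the whole machine, the VMs
# are suspended rather than killed, and resumed once it finishes.

class VMJob:
    def __init__(self, name):
        self.name = name
        self.state = "running"

    def suspend(self):
        # A real scheduler would invoke the hypervisor's suspend/save call
        # here, preserving the VM's memory and disk state.
        self.state = "suspended"

    def resume(self):
        self.state = "running"


def run_parallel_job(vm_jobs, parallel_job):
    """Suspend all VM-hosted backfill jobs, run the parallel job, resume."""
    for vm in vm_jobs:
        vm.suspend()          # no work is lost: VM state is preserved
    result = parallel_job()  # the parallel job now owns all the nodes
    for vm in vm_jobs:
        vm.resume()           # backfilled jobs pick up where they left off
    return result


vms = [VMJob("sim-a"), VMJob("sim-b")]
out = run_parallel_job(vms, lambda: "parallel job done")
print(out, [vm.state for vm in vms])
```

The point of the design is that backfilled work need never be discarded when a higher-priority parallel job arrives, which is exactly the property that made the approach attractive enough for users to vote it into production.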
The adoption stories in particular are of timely interest - this is where the main battle for virtualization in Grid computing is being fought right now. In industry the idea seems to have been eagerly embraced: virtualization is widely used for hosting, datacenter management, and other applications. Tim Freeman, who works with me on the workspace project, recently published a list of some 20 companies providing Xen-based hosting. The interest in virtualization as a new mode of resource provisioning in the scientific community is less pronounced - the established methods of resource provisioning are hard to change, the incentives are not as clear, and without an established user base the providers are understandably hesitant to try this new mode of provisioning. However, here too things are changing -- at a recent LCG workshop, a roomful of site administrators agreed that less than five years from now there will be few (if any) non-virtualized systems. This workshop also brought out some exciting adoption stories, including Grid Ireland, which has been running a fully virtualized infrastructure for some time.
Another aspect of adoption is the need to provide content, that is, to build the virtual appliances - applications bundled with the environment they need (e.g., VM images) -- that will run on virtualized resources. Simply building appliance databases or marketplaces and encouraging sharing is not enough. Environments "age": security patches and other updates sometimes need to be applied as often as daily to ensure the integrity of a system. In other words, we need not only a scalable way of generating appliances but also of managing and updating them, preferably in a modular way that would allow sharing configuration components between images and finishing configuration dynamically if necessary. And then of course there is the issue of building trust in an image - making sure that deploying it won't subject your site to improper use - which typically means different things to different people. For that to happen, there must be ways of attesting and signing an appliance such that it can be verified on deployment. Fortunately, configuration management tools have been around for a while, and companies like rPath are applying those methods to generate and manage virtual appliances.
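The attest-and-verify step can be sketched with standard-library primitives. This is only an illustrative sketch: the key and image contents below are hypothetical, and a real attestation scheme would use public-key signatures (so a site can verify an appliance without sharing a secret with its producer) rather than the HMAC used here for simplicity.

```python
import hashlib
import hmac

def sign_image(image_bytes: bytes, key: bytes) -> str:
    """Producer side: compute an authenticated digest over the appliance.

    Real systems would sign the digest with the producer's private key;
    HMAC keeps this sketch dependency-free.
    """
    return hmac.new(key, image_bytes, hashlib.sha256).hexdigest()

def verify_image(image_bytes: bytes, key: bytes, signature: str) -> bool:
    """Deployment side: recompute and compare before booting the VM."""
    expected = sign_image(image_bytes, key)
    return hmac.compare_digest(expected, signature)

key = b"appliance-signing-key"          # hypothetical signing key
image = b"hypothetical appliance image bytes"
sig = sign_image(image, key)

ok = verify_image(image, key, sig)                  # untampered image
tampered = verify_image(image + b"rootkit", key, sig)  # modified image
print(ok, tampered)  # True False
```

Any modification to the image after signing - including the daily patching mentioned above - invalidates the signature, which is why attestation has to be integrated with the appliance update pipeline rather than bolted on afterwards.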
The ability to generate appliances reliably and automatically is a critical missing link to adoption. Without it, users who never had to create or maintain their own images before find virtualization a significant investment. Without multiple appliances and users, the providers find it equally hard to justify deployment. This creates a "chicken and egg" problem from which we are only now beginning to break out. In a wider context, we need to look ahead to challenges that will emerge once virtualization does gain widespread adoption - what will we face when we have potentially many VM images per physical resource? - and this is not the sort of question we can answer by running a simulation.
Read the original article here, written by Kate Keahey of Argonne National Laboratory.