Ilya Baimetov, Director of Technology at SWsoft, has posted a comparison of VMmark and the SPEC CPU benchmark on the SWsoft SaaS Blog. He writes:
- Workload mix – For some reason, VMware uses their Enterprise software (ESX) to run what seems to be an SMB scenario (mix of different workloads on the same machine). Typically, in the enterprise there is a special server configuration (CPU/RAM/Disk/Network) for each workload – web server, Exchange server, low-end DB server, high-end DB server and so on. It'd be only logical to consolidate similar workloads on the similarly configured servers – to maximize resource usage.
- Workload scores. SPEC CPU runs all the workloads sequentially, measuring the maximum performance of each workload on the machine, normalizing against the reference system and then averaging across the workloads. You get an aggregate score, but you can also find out how well the machine is suited for running a certain workload that is of the most interest for you. With VMmark, it's always a wild mix. VMmark is useless if you want to find out how well a server is suited for consolidating Exchange or SQL Server.
- Aggregate score. Because it's always a mix of workloads, each workload in a tile is throttled and never runs at the full speed. And, if the server is so powerful a single tile cannot load it fully, you are supposed to add another, which would double the number of VMs and probably skew the results. Bottom line, I'm not sure what exactly VMmark measures.
- Underlying platform – SPEC CPU runs on any hardware and any OS. The VMmark cannot be run on Virtuozzo because it executes Windows and Linux workloads in parallel on the same machine.
Ilya thinks that a better approach would be to use the SPEC CPU methodology – run multiple workloads serially and average out the results. Specifically:
- Measure maximum aggregate performance of 1, 2, 4, 8 and 16 virtual environments with the same workload on the same machine. Normalize and average.
- Repeat that for each workload and then average the results.
The benefits of this approach:
- You can easily see what the virtualization overhead is for each workload depending on the number of concurrent VMs.
- Virtuozzo will be able to run the benchmark because only one OS is used for a single benchmark run.
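To make the proposal a bit more concrete, here is a rough sketch of how such a score could be computed. It is not Ilya's code or anything from SPEC or VMware. The function names, the single per-workload `reference` value, and the `native` baseline are all assumptions for illustration. The post only says "average"; the sketch uses a geometric mean, which is what SPEC CPU uses for its ratios.

```python
from statistics import geometric_mean

# VM counts named in the proposal.
VM_COUNTS = (1, 2, 4, 8, 16)


def workload_score(aggregate, reference):
    """Score one workload across all VM counts.

    aggregate: dict mapping VM count -> measured aggregate throughput
    reference: throughput of the same workload on a reference system
               (hypothetical; the post does not say what to normalize against)
    """
    ratios = [aggregate[n] / reference for n in VM_COUNTS]
    return geometric_mean(ratios)


def overall_score(measurements, references):
    """Average the per-workload scores into a single number."""
    return geometric_mean(
        [workload_score(measurements[w], references[w]) for w in measurements]
    )


def overhead_by_vm_count(aggregate, native):
    """Fraction of native (non-virtualized) throughput lost at each VM count.

    native: throughput of the workload without virtualization (an assumed
            baseline, measured separately on the same machine).
    """
    return {n: 1 - aggregate[n] / native for n in VM_COUNTS}
```

With something like this, each workload keeps its own score and its own overhead curve, and the overall number is just a summary on top of them.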
What do you think? Do you have any experience running these two benchmarking tools?
When I first downloaded VMmark, I was a little surprised that it required SPEC tools. With all the free benchmarking tools available, and all the money and energy that VMware could have put behind creating a true virtualization benchmarking tool, I was disappointed that I would have to spend $1700 on two benchmarking tools to use VMmark fully.
** Update **
The original article was first posted on the SWsoft SaaS blog, but it is being moved to the Virtuozzo Blog. Stay tuned and watch for it. Sorry for the confusion.