Thursday, October 16, 2008

IBM's comparison of virtual machines...

IBM just posted a comparison of virtual machines that I find annoyingly flawed. Fortunately, discussion on OSnews showed that technical people don't buy this kind of "data". Still, I thought that there was some value in pointing out some of the problems with IBM's paper.

nPartitions are tougher than logical partitions

What HP calls 'nPartitions' are electrically isolated partitions. IBM correctly indicates that nPartitions offer true electrical isolation. The key benefit, in IBM's own words, is that nPartitions allow you to service one partition while others are online. You can, for example, extract a cell, replace or upgrade CPUs or memory, and re-insert the cell without downtime.

The problem is that IBM then states that this is similar to IBM logical partitioning. In reality, electrically isolated partitions are tougher. As on IBM systems, CPUs can be replaced in case of failure. But you can also replace entire cells without shutting down the system, contrary to IBM's claim that systems require a reboot when moving cells from one partition to another.

Finally, IBM remarks that "Another downside is that entry-level servers do not support this technology, only HP9000 and Integrity High End and Midrange servers." Highly redundant, electrically isolated partitions have a cost, so they don't make much sense on entry-level systems. But it's not a downside for HP that its entry-level systems only have partitioning technologies similar to IBM's; it's a downside for IBM not to have anything similar to nPartitions on its high-end servers.

Integrity servers support more operating systems

IBM writes: "It's important to note that while nPartitions support HP-UX, Windows®, VMS, and Linux, they only do so on the Itanium processor, not on the HP9000 PA-RISC architecture." It is true that HP (Microsoft, really) doesn't support Windows on PA-RISC, a long-obsolete architecture, but what is more relevant is that IBM doesn't support Windows on POWER even today. This is the same kind of spin as for electrically isolated partitions: IBM tries to divert attention from a missing feature in its own product line by pointing out that there was a time when HP didn't have the feature either.

IBM writes that "Partition scalability also depends on the operating system running in the nPartition." Again, this is true... on any platform. But the obvious innuendo here is that the scalability of Integrity servers is inconsistent across operating systems. A quick visit to the TPC-H and TPC-C result pages will demonstrate that this is false. HP posts world-class results with HP-UX and Windows, and other Itanium vendors demonstrate the scalability of Linux on Itanium.

By contrast, the Power systems from IBM or Bull all run AIX. And if you want to talk about scalability, IBM as of today has no 30TB TPC-H results, and it often takes several IBM machines to compete with a single HP system.

Regarding the scalability of other operating systems, HP engineers put a lot of early effort into Linux scalability. I would go as far as saying that 64-way Linux scalability is probably something HP engineers tested seriously before anybody else, and they got some recognition for this work.

HP offers both isolation and flexibility

IBM also tries to minimize the flexibility of hard partitions. If it is electrically isolated, it can't be flexible, right? That's exactly what IBM wants you to believe: "They also do not support moving resources to and from other partitions without a reboot."

However, HP has found a clever way to provide both flexibility and electrical isolation. With HP's Instant Capacity (iCap), you can move "right to use" CPUs around. On such a high-end machine, it is a good idea to have a few spare CPUs, either in case of CPU failure or in case of a sudden peak in demand. HP recognized this, so you can buy machines with CPUs installed but disabled: you pay less for these CPUs, you do not pay software licenses for them, and so on.

The neat trick that iCap enables is to transfer a CPU from one nPartition to another without breaking electrical isolation: you simply shut it down in one partition, which gives you the right to use one in another partition without paying anything for it. All this can be done without any downtime, and again, it preserves electrical isolation between the partitions.
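
To make this concrete, here is a minimal sketch of such a transfer using the Instant Capacity command-line tools. The option letters and counts are quoted from memory and may differ by iCap release, so treat this as an illustration and check icapmodify(1M) before relying on it.

    # On the nPartition giving up capacity: deactivate one active core,
    # which releases one usage right back to the complex.
    icapmodify -d 1

    # On the nPartition that needs the extra capacity: activate one core
    # against the usage right freed above.
    icapmodify -a 1

    # Verify the resulting counts of active and inactive cores.
    icapstatus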

Shared vs. dedicated

Regarding virtual partitions, IBM is quick to point out the "drawbacks": "What you can't do with vPars is share resources, because there is no virtualized layer in which to manage the interface between the hardware and the operating systems." Of course, you cannot do that with dedicated resources on IBM systems either; that's the point of having a partitioning system with dedicated resources! If you want shared resources, HP offers virtual machines, see below. So when IBM claims that "This is one reason why performance overhead is limited, a feature that HP will market without discussing its clear limitations", the same also holds for dedicated resources on IBM systems.

What is true is that IBM has attempted to deliver in a single product what HP offers in two different products, virtual partitions (vPars) and virtual machines (HPVM). The apparent benefit is that you can configure systems with a mix of dedicated and shared resources. For example, the same partition can have some disks directly attached, as they would be with HP's vPars, and other disks served through the VIO, in a way similar to HPVM. In reality, though, the difference between the way IBM partitions and HPVM virtual machines work is not that big, and if anything, it is going to diminish over time.

A more puzzling statement is that "The scalability is also restricted to the nPartition that the vPar is created on." Is IBM really presenting as a limitation the fact that you can't put more than 16 CPUs in a partition on a 16-CPU system? So I tested that idea. And to my surprise, IBM does indeed allow you to create, for example, a system with 4 virtual CPUs on a host with 2 physical CPUs. Interestingly, I know how to create such configurations with HPVM, but I also know that they invariably run like crap when put under heavy stress, so HPVM will refuse to run them if you don't twist its arm with a special "force" option. I believe that is the correct thing to do. My quick experiments with IBM systems in such configurations confirm my opinion.
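
For the curious, this is roughly what such an over-committed configuration looks like with HPVM. The option letters below (-c for the virtual CPU count, -r for memory, -F to force past the sanity check) are quoted from memory and may differ between HPVM releases, so take this as a sketch rather than a recipe.

    # On a host with 2 physical CPUs, ask for 4 virtual CPUs.
    # Without the force option, HPVM refuses this configuration.
    hpvmcreate -P testvm -c 4 -r 2G

    # Forcing it works, but expect it to behave badly under heavy load.
    hpvmcreate -P testvm -c 4 -r 2G -F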

IBM also comments that "There is also limited workload support; resources cannot be added or removed." Resources can be added to or removed from a vPar, or moved between vPars, so I'm not sure what IBM is referring to. Similarly, when IBM writes: "Finally, vPars don't allow you to share resources between partitions, nor can you dynamically allocate processing resources between partitions", I invite readers to check the transcript of a three-year-old demo to see for how long (at least) that statement has been false...
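
As a reminder of how simple this is, here is an illustrative vPars session; the partition names are made up, so check vparmodify(1M) for the exact resource syntax on your release.

    # Add one CPU to the running virtual partition vpar1.
    vparmodify -p vpar1 -a cpu::1

    # Remove one CPU from vpar2, freeing it for use elsewhere.
    vparmodify -p vpar2 -d cpu::1

    # Show the resulting CPU assignments.
    vparstatus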

Virtual Machines

Finally, IBM also gives false or outdated information regarding HP Integrity Virtual Machines, a.k.a. HPVM (not "IVM", which is an IBM product).

HPVM now supports 8 virtual CPUs, not 4, and it has always been able to take advantage of the much larger numbers of physical CPUs HP-UX supports (128 today). Please note that 8 virtual CPUs is what is supported, not the limit of what works (the red line in the graph shows 16-way scalability of an HPVM virtual machine under moderate competition using a Linux scalability benchmark; the top blue line is the scalability on native hardware; the bottom curves are HPVM under maximal competition and Xen under no competition).

IBM also states that "Reboots are also required to add processors or memory", but this is actually not very different from their own micro-partitions. In a Linux guest with a kernel supporting CPU hotplug, for example, you can disable CPU 3 under HPVM by typing echo 0 > /sys/devices/system/cpu/cpu3/online, just like you would on a real system (on HP-UX, you would use hpvmmgmt -c 3 to keep only 3 CPUs). HPVM also features dynamic memory in a virtual machine, meaning that you can add or remove memory while the guest is running. You need to reboot a virtual machine only to change the maximum number of CPUs or the maximum amount of memory, but that's also the case on IBM systems: to change the maximum number of CPUs, you need to modify a "profile", and to apply that profile, you need to reboot. However, IBM micro-partitions have a real advantage (probably not for long), which is that you can boot with less than the maximum, whereas HPVM requires that the maximum resources be available at boot time.
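
Put together, the guest-side operations look like this (the Linux sysfs path and the hpvmmgmt invocation are the ones quoted above; the re-enable step is just standard Linux CPU hotplug):

    # Linux guest: take CPU 3 offline, then bring it back online.
    echo 0 > /sys/devices/system/cpu/cpu3/online
    echo 1 > /sys/devices/system/cpu/cpu3/online

    # HP-UX guest: keep only 3 virtual CPUs active.
    hpvmmgmt -c 3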

IBM wants you to believe that "There is no support for features such as uncapped partitions or shared processor pools." In reality, HPVM has always worked in uncapped mode by default. Until recently, you needed HP's Global Workload Manager to enable capping, because we thought it was a little too cumbersome to configure manually. Our customers told us otherwise, so you can now directly create virtual machines with a capped CPU allocation. As for shared processor pools, I think this is an IBM configuration concept that is entirely unnecessary on HP systems, since HPVM computes CPU placement dynamically for you. Let me know if you think otherwise...
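
As an illustration, and with the caveat that I am quoting the entitlement option from memory (I assume -e sets the CPU entitlement percentage; check hpvmcreate(1M) for your release), a capped virtual machine can be created or adjusted like this:

    # Create a virtual machine with 2 virtual CPUs, each capped at 50%
    # of a physical CPU (assumed -e entitlement option).
    hpvmcreate -P capped_vm -c 2 -e 50

    # Raise the cap on the existing virtual machine to 75%.
    hpvmmodify -P capped_vm -e 75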

IBM's statement that "Virtual storage adapters also cannot be moved, unless the virtual machines are shut down" is also untrue. All you need is hpvmmodify to add or remove virtual I/O devices. There are limits. For example, HPVM only provisions a limited number of spare PCI slots and buses, based on how many devices you started with. So if you start with one disk controller, you might have an issue growing to the maximum supported configuration of 158 I/O devices without rebooting the partition. But if you know ahead of time that you will need that kind of growth, we even provide "null" backing stores that can be used for that purpose.
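
For example, adding a virtual disk to a running guest is a one-liner. The backing-store paths below are made up, and the exact resource strings (including the syntax for a "null" placeholder device) should be checked against the HPVM documentation for your release:

    # Add a virtual SCSI disk, backed by a file, to the running guest vm1.
    hpvmmodify -P vm1 -a disk:scsi::file:/vmstore/vm1_data.img

    # Reserve a slot for future growth with a "null" backing store
    # (assumed syntax), to be replaced by a real device later.
    hpvmmodify -P vm1 -a disk:scsi::null:/dev/null

    # Remove a virtual device that is no longer needed.
    hpvmmodify -P vm1 -d disk:scsi::file:/vmstore/vm1_data.img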

But the core argument IBM has against HP Virtual Machines is: "The downside here is scalability." Scalability is a complex enough topic that it deserves a separate post, with some actual data...

Preserving the investment matters

Let me conclude by pointing out an important fact: HP has a much better track record at preserving the investment of their customers. For example, on cell-based systems, HP allowed mixed PA-RISC and Itanium cells to facilitate the transition.

This is particularly true for virtual machines. HPVM is software that HP supports on any Integrity server running the required HP-UX release (including systems from other vendors). By contrast, IBM micro-partitions are firmware, with a lot of hardware dependencies. Why does it matter? Because firmware only applies to recent hardware, whereas software can be designed to run on older hardware. As an illustration, my development and test environment consists of severely outdated zx2000 and rx5670 machines (the old 900MHz version). These machines were introduced in 2002 and discontinued in 2003 or 2004. In other words, I develop HPVM on hardware that was discontinued before HPVM was even introduced...

By contrast, you cannot run IBM micro-partitions on any POWER-4 system, and you need new POWER-6 systems in order to take advantage of new features such as Live Partition Mobility (the ability to transfer virtual machines from one host to another). HP customers who purchased hardware five years ago can use it to evaluate the equivalent HPVM feature today, without having to purchase brand new hardware.

Update: About history

The IBM paper boasts about IBM's 40-plus-year history of virtualization, trying to suggest that they have been in the virtualization market longer than anybody else. In reality, the PowerVM solution is quite recent (less than 5 years old). Earlier partitioning solutions (LPARs) were comparable to HP's vPars, and both technologies were introduced in 2001.

But a friend and colleague at HP pointed out that a lot of these more modern partitioning technologies actually originated at Digital Equipment, under the name OpenVMS Galaxy. And the Galaxy implementation actually offered features that have not entirely been matched yet, like APIs to share memory between partitions.

2 comments:

Anonymous said...

Glad to see that someone is at least trying to stop the marketing FUD.

esxibackups said...

A virtual machine is also sometimes called "a computer within a computer": an environment that does not exist physically, but has been created inside another environment. The environment in which it is created is called the "host".