Virtualization improvements in Fedora 12
Fedora 12 includes a number of improvements in the field of virtualization. New tools let system administrators easily perform tasks that were nearly impossible until now. Imagine reconfiguring a virtual machine offline, adding new hardware to a VM without restarting it, migrating VMs to another host without restarting them, and many other exotic features. Let's hear what the developers have to say about those wonderful new options.
- Chris Wright (KVM Huge Page Backed Memory)
- John Cooper (KVM Huge Page Backed Memory)
- Mark McLoughlin (KVM Stable Guest ABI and KVM NIC Hotplug)
- Kevin Wolf (KVM qcow2 Performance)
- David Lutterkort (Network Interface Management)
- Daniel Berrange (VirtPrivileges)
- Glauber Costa (VirtgPXE)
- Dave Allan (VirtStorageManagement)
- Richard Jones (libguestfs)
Interviews were conducted online on October 22, 2009. The full IRC transcript from which this interview series was extracted is available here.
Richard Jones on guestfish and friends (libguestfs and libvirt)
Mel Chua: Why don't we start with everyone introducing themselves briefly, and giving a sentence or two about what they do, and what virt features they worked on for F12?
Richard Jones: I'm a software engineer at Red Hat, and I am working on http://libguestfs.org/. libguestfs is a set of tools which you can use to examine and modify virtual machine images from outside (i.e. from the host), so for example if you had an unbootable guest, you could try to fix it by doing: virt-edit myguest /boot/grub/grub.conf
Mel Chua: What would sysadmins have to do to fix that before libguestfs arrived?
Richard Jones: That's really tricky ... it was sort of possible using tools like kpartx and loopback mounts, but it was hard, dangerous stuff and you had to be root. Now no root commands are needed, and it's organized as nice little command line tools for each task with proper manual pages. I'd point people to the home page -- http://libguestfs.org/ -- to see lots of examples and documentation.
Mel Chua: How do libguestfs capabilities in Fedora compare with how a sysadmin might do the same thing on other, non-Linux (or linux-but-on-another-distribution) platforms? Are there other similar tools?
Richard Jones: We've worked with Guido Gunther from Debian on getting parts of libguestfs packaged up for Debian. On Windows, Microsoft offers something called DiscUtils.Net which is similar but not nearly as powerful. So I'm confident Fedora is well ahead of everyone here.
Mel Chua: Do you want to talk about the guestfish interface a bit?
Richard Jones: Sure. guestfish is one of the ways to get access to the libguestfs features, for use from shell scripts. The basic usage is to do:
guestfish -i yourguest
...where yourguest is some guest name known by libvirt, and that gives you a shell where you can list files in the guest, edit them, look in directories, find out what LVs the guest has (or create new ones) ... literally 200 commands. That's all documented here: http://libguestfs.org/guestfish.1.html
Mel Chua: Wow. That documentation is gorgeous.
Richard Jones: and if you run out of ideas, we have some "recipes" you can try out with guestfish: http://libguestfs.org/recipes.html
Mark McLoughlin: We've certainly all been put to shame by Richard's docs. :)
David Lutterkort: The power of OCaml. ;)
Mark McLoughlin on virtual upgrades to your virtual machine
Mark McLoughlin: I'm an engineer at Red Hat, joined from Sun nearly 6 years ago. Previously worked on GNOME desktop related stuff, but have been working on virtualization for the past few years. For Fedora 12, I worked on the NIC Hotplug and Stable Guest ABI features, along with packaging, bug triaging and general shepherding of all the other virt bits. I work upstream on both qemu and libvirt, but a lot of my time is taken up by Fedora work these days.
Okay, the NIC hotplug feature - the ability to add a new virtual NIC while the guest is running - was a pretty obviously missing feature from our KVM support previously. The problem we had with implementing it is that libvirt is responsible for configuring the virtual NIC and passes a file descriptor to the qemu process when it starts it.
That's much harder to do when the guest is already running. So, most of the work involved some scary UNIX voodoo to allow passing that file descriptor between two running processes. As for use cases, people often want to add and remove hardware from their guests without re-starting them. You might want to add a guest to a new network, for example.
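The descriptor passing Mark describes can be illustrated with a minimal Python sketch. It is not the libvirt or qemu implementation: it only shows the general UNIX mechanism (SCM_RIGHTS ancillary data on a UNIX domain socket) by which one process can hand an open file descriptor to another process that is already running. The "manager" and "emulator" roles, the function names, and the pipe standing in for a NIC descriptor are all assumptions made for illustration. It needs Python 3.9 or later on a UNIX-like system.

```python
import os
import socket


def send_descriptor(sock, fd):
    """Send one open file descriptor over a UNIX socket (SCM_RIGHTS)."""
    # At least one byte of ordinary payload must accompany the descriptor.
    socket.send_fds(sock, [b"x"], [fd])


def receive_descriptor(sock):
    """Receive one file descriptor sent with send_descriptor()."""
    _payload, fds, _flags, _addr = socket.recv_fds(sock, 1, 1)
    return fds[0]


def demo():
    # Hypothetical roles: the parent plays the manager (libvirt's part),
    # the child plays the already-running emulator (qemu's part).
    manager_end, emulator_end = socket.socketpair(
        socket.AF_UNIX, socket.SOCK_STREAM
    )
    pid = os.fork()
    if pid == 0:
        try:
            # Child: it never opened the pipe itself; it only gets the
            # descriptor through the socket.
            manager_end.close()
            fd = receive_descriptor(emulator_end)
            os.write(fd, b"hello from the child")
            os.close(fd)
        finally:
            os._exit(0)

    # Parent: create the resource only after the child is running, so the
    # child cannot have inherited it across fork().
    emulator_end.close()
    read_fd, write_fd = os.pipe()  # stands in for a configured NIC descriptor
    send_descriptor(manager_end, write_fd)
    os.close(write_fd)  # the receiver ends up with its own reference
    message = os.read(read_fd, 64)
    os.close(read_fd)
    os.waitpid(pid, 0)
    manager_end.close()
    return message


if __name__ == "__main__":
    print(demo())
```

The pipe is created after the fork on purpose: the child can only write to it because the descriptor was passed over the socket, which is the situation a running guest process is in when new hardware is added.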
Now, the Stable Guest ABI feature is really quite boring, but is about preparing KVM so that we can maintain compatibility across new releases. The idea is that if you are running a Fedora 12 KVM host and you install a new host with Fedora 13, you might like to migrate your running guests from the Fedora 12 host to the Fedora 13 host, without re-starting them.
Now, as we add new features to qemu in Fedora 13, we might end up 'upgrading' the virtual machine's hardware. We might, for example, emulate a new chipset by default or add a new default NIC. The Stable Guest ABI feature means that when you migrate to the Fedora 13 host, the hardware emulated by qemu will remain the same for that guest.
As you can imagine, if you change around the hardware under a running guest, the guest may get seriously confused. But it's not just about live migration - if you upgrade your host and restart your guest, not all guest OSes will like it if you've changed around the hardware. Windows, for example, with significant enough changes to the hardware, will require you to re-validate your license. We want to avoid that happening when you upgrade your Fedora host.
David Lutterkort on "Network scripts: complex no more!"
David Lutterkort: I'm a software engineer at Red Hat. I worked on http://fedorahosted.org/netcf (for the Network Interface Management feature), and in the past worked on ovirt and some of the virt-install tools. Besides that, I do some work on http://deltacloud.org/ and http://augeas.net/
Network Interface Management lets sysadmins set up fairly complex network configurations (e.g. a bridge with a bond enslaved) through a simple description of the config, using the libvirt API; in the past, that required intimate knowledge of ifcfg-* files and a lot of nailbiting. Having an API also means that such setups can be done by programs (e.g., centralized virt mgmt software or virt-manager).
Mel Chua: Awesome. If I'm understanding you right, this means that now sysadmins can automate complex custom network configurations for VMs?
David Lutterkort: Complex network configs on the host, generally ... a common request is 'how do I share a physical NIC between various VMs'; in the past, you had to manually go and edit ifcfg-* files. libvirt now has an API and XML description to make that setup much easier. The backend for the libvirt interface API is netcf, which is independent of virtualization, so you could use that to set up network configs in your VMs.
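To give a feel for what such an XML description might look like, here is a small Python sketch that builds one for the "share a physical NIC between VMs" case: a bridge with one physical NIC enslaved. The element and attribute names follow the netcf/libvirt interface XML format as commonly documented, but treat them as an assumption and check them against the netcf and libvirt docs. The helper name bridge_description and the device names br0 and eth0 are hypothetical.

```python
import xml.etree.ElementTree as ET


def bridge_description(bridge_name, nic_name):
    """Build an interface description for a bridge enslaving one NIC.

    The XML layout is an assumption based on the netcf/libvirt
    interface format; verify it before using it on a real host.
    """
    iface = ET.Element("interface", {"type": "bridge", "name": bridge_name})
    # Bring the bridge up at boot and configure IPv4 via DHCP.
    ET.SubElement(iface, "start", {"mode": "onboot"})
    protocol = ET.SubElement(iface, "protocol", {"family": "ipv4"})
    ET.SubElement(protocol, "dhcp")
    # The physical NIC that the bridge (and so the VMs) will share.
    bridge = ET.SubElement(iface, "bridge", {"stp": "off"})
    ET.SubElement(bridge, "interface", {"type": "ethernet", "name": nic_name})
    return ET.tostring(iface, encoding="unicode")


if __name__ == "__main__":
    # br0 and eth0 are placeholder device names.
    print(bridge_description("br0", "eth0"))
```

A string like this is what a program would hand to libvirt's interface-definition call (one of the virInterface* calls) instead of editing ifcfg-* files by hand. The sketch stops at generating the XML so that it runs without a libvirt installation.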
Mel Chua: Ahhh, okay - thanks for the clarification. How does this compare to how people would set up host network configs on other platforms?
David Lutterkort: Right now this is exposed in the libvirt API; we're working (well, Cole Robinson is working) on exposing that in virt-manager so that people can say 'use this physical NIC for all my VMs' with one click. On other platforms, you either have to manually edit the network configs, which generally is only really possible for humans, not programs, or rely on the very dodgy, never-quite-right Xen networking scripts.
Mel Chua: Is there a place where our readers can go to find out more about how to use the libvirt API? How do folks try these features out?
David Lutterkort: Besides Bugzilla? ;) There's a small amount of docs on the netcf site (I have to add more) and libvirt.org has API docs for the various virInterface* calls.
Mel Chua: I see instructions on how to test at https://fedoraproject.org/wiki/Features/Network_Interface_Management#How_To_Test
David Lutterkort: There's also a blog post somebody else wrote on netcf: http://linux-kvm.com/content/netcf-silver-bullet-network-configuration. I don't know of a good central place where this gets summarized, though FWN has been pretty good reporting about virt features. Besides that, watching the individual projects is everybody's best bet: libvirt, libguestfs, virt-install, virt-manager are the most important ones from a user's point of view.
Mel Chua: The user typically being a sysadmin?
David Lutterkort: virt-manager is definitely for end users, not just sysadmins; virt-install somewhere in the middle, the others get fairly technical.
Mel Chua: What would be a use-case for an end-user using virt-manager? (I'm guessing there will be users reading this interview who may not have tried out virt stuff before, but who might read this and go "ooh, hey..." and try it out.)
David Lutterkort: Try out rawhide without the risk of breaking your current system. Of course, that goes for any $OS ... in general, virt-manager is a graphical user interface to most/all virt features.
How to try out virtualization
Mel Chua: Ok - imagine I'm a new Fedora user, I've just installed F12, love it, want to get a preview of rawhide so I can see what's coming for F13. What do I need to install/run to get rawhide running in a VM?
lvcreate -n F13Rawhide -L 10G vg_yourhost; virt-install -v -n F13Rawhide --accelerate -r 512 -f /dev/vg_yourhost/F13Rawhide -c /tmp/Fedora-13-netinst.iso
Mark McLoughlin: Hmm, no - I'd point people at virt-manager. Install the 'Virtualization' group in Add/Remove Software, go to Applications -> System Tools -> Virtual Machine Manager, then click on New VM. Choose a name for the guest, choose network install, and then add a URL like http://download.fedoraproject.org/pub/fedora/linux/releases/12/Fedora/x86_64/os/ - after that, the instructions in the wizard should be fairly self-explanatory.
Some history about PXE
<Q> mchua: lutter, rwmjones, markmc, in a moment, I'd like to pull back and have the three of you talk with each other about how virt in Fedora has progressed in the past few releases.
<A> markmc: one sec - I'll cover gpxe and qcow2 features
The feature owners aren't here (in this case), okay. The gPXE feature is about replacing the boot ROMs used by qemu for PXE booting with newer versions. Basically, etherboot was the name of the project previously, but it's now called gPXE.
It's important that we made the switch to gPXE because all future upstream development (new features, bug fixes) will go into gPXE instead of etherboot.
The qcow2 performance feature was about taking a cold hard look at the qcow2 file format and fixing any major bottlenecks. Basically, we see qcow2 as a very useful format for virtual machine images. For example, the size of a qcow2 file is determined by the amount of disk space used by the guest, not the entire size of the virtual disk we're presenting to the guest. The images should be smaller on disk, even if you copy them between hosts. Also, qcow2 supports a "copy on write" feature whereby you can base multiple guest images on the one base image. So you can reduce disk space further by installing one guest image and creating multiple qcow2 images backed by the first image, and yet the guests can still write to their disks! So, in summary, we want more people to use qcow2, but they couldn't because the performance was poor. Kevin Wolf put serious effort in upstream to iron out those kinks and obtain a serious speedup. Figures are in a table on the feature page.
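The two space-saving ideas above (allocate only what the guest has written, and share one base image between several guests) can be shown with a toy model in Python. This is not the qcow2 on-disk format, which is far more involved; the Image class, its methods, and the cluster numbering are hypothetical and exist only to illustrate the copy-on-write idea.

```python
class Image:
    """Toy sparse image: it stores only the clusters that were written."""

    def __init__(self, backing=None):
        self.backing = backing  # optional base image, never modified here
        self.clusters = {}      # cluster index -> bytes

    def write(self, index, data):
        # Writes always land in this image, never in the backing image.
        self.clusters[index] = data

    def read(self, index):
        # Prefer our own data, then fall back to the backing image.
        if index in self.clusters:
            return self.clusters[index]
        if self.backing is not None:
            return self.backing.read(index)
        return b""  # unallocated clusters read as empty in this toy

    def allocated(self):
        """Number of clusters this image itself has to store."""
        return len(self.clusters)


def demo():
    base = Image()
    for i in range(3):  # pretend this is the installed guest
        base.write(i, b"base-%d" % i)

    guest_a = Image(backing=base)
    guest_b = Image(backing=base)
    guest_a.write(1, b"changed-by-a")  # only guest_a sees this change

    return {
        "a": [guest_a.read(i) for i in range(3)],
        "b": [guest_b.read(i) for i in range(3)],
        "a_allocated": guest_a.allocated(),
        "b_allocated": guest_b.allocated(),
        "base_allocated": base.allocated(),
    }


if __name__ == "__main__":
    print(demo())
```

In this model both guests start out storing nothing of their own, and a guest that writes one cluster stores exactly one, while the base image stays untouched and shared.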
<Q> mchua: markmc, to backtrack a bit, why the switch from etherboot? - From what I've read, it sounds like the switch was actually requested by the etherboot upstream, in part.
<A> markmc: Yes, the etherboot project is no more; it is deprecated in favor of gPXE, but they're not completely identical, so there was some significant work involved ... done by Glauber Costa (our Brazilian joker) and Matt Domsch from Dell (AFAIR)
<Q> mchua: markmc, is gPXE being used by other OSes and distros too?
<A> markmc: yeah, it was Matt Domsch. It may be used by other distros, I'm not 100% sure about that. I'd imagine we're slightly ahead of the curve on this - upstream qemu is still using etherboot images
Some history about virt-manager
rwmjones: I would say that in Fedora 6 which is where I really started off with Fedora, it was quite primitive and unfriendly, although we did have virt-manager which has always been a nice tool
<Q> mchua: What was the F6 virt experience like?
<A> rwmjones: Here's a guestfish example ... making a backup of /home from a Debian guest:
# guestfish -i --ro Debian5x64
Welcome to guestfish, the libguestfs filesystem interactive shell for
editing virtual machine filesystems.
Type: 'help' for help with commands
      'quit' to quit the shell
><fs> cat /etc/debian_version
squeeze/sid
><fs> tgz-out /home home.tar.gz
Fedora 6 -> 12 ... it's a story of everything improving dramatically. It's not really that there are big new features, e.g. we had virt-manager back in 6, but modern virt-manager is just far better. I've been trying to work on making it better for sysadmins who want to automate things, hence libguestfs is very shell-script / automation-friendly
<Q> mchua: So one area of improvement between F6 virt and F12 virt is "F12 virt is far more automatable and shell-script friendly."
<A> rwmjones: yeah I'd say that's true
<Q> mchua: "It's not really that there are big new features... but [features are] just far better" - so you can do the same things, more or less, just much faster (in terms of sysadmin-headache-time needed)?
<A> rwmjones: Well there are a lot of big new features behind the scenes (KVM, KSM, virtio ...). It's not clear how apparent they'll be to end users, but it will just all work better and faster. There's a story behind virt-df (http://libguestfs.org/virt-df.1.html). When I used to manage a bunch of virtual machines at my previous job, it was the tool that I wanted. It didn't exist, so at Red Hat, I wrote it.
markmc: The big change between F6 and F12 is that we've switched from Xen to KVM, but because all our work is based on the libvirt abstraction layer, the tools used in F6 for using Xen should be familiar to people using KVM in F-12. We've also put a significant emphasis on improving security over the last number of releases. Danpb has more details on the security efforts in his F-11 interview.
rwmjones: Yeah ... someone on F6 who was using virt-manager or "virsh list", will be using exactly the same commands in F12, even though the hypervisor is completely different.
markmc: And he'll also have more details wrt. the VirtPrivileges feature
lutter: libvirt, and therefore the whole virt tool stack now manages a much broader area of virt related aspects, not just VM lifecycles
markmc: Lutter has a good point - we now have tools for e.g. managing networking and storage, we also have much better support for remotely managing virtualization hosts, e.g. you can point virt-manager at a host, create a guest on that host, create storage for the guest, configure the network etc.
lutter: The tools are now a pretty solid basis for datacenter virt management software like ovirt and RHEV-M
markmc: wrt. Fedora virt changing over the years, we're also pushing very hard to adopt new virtualization hardware features introduced by vendors. So, for example, in F-11 we introduced VT-d support and in F-12 we're introducing SR-IOV support. KVM itself is based on Intel and AMD hardware virtualization, and also supports EPT/NPT. So yeah, we're definitely leading the field in terms of shipping support for new hardware features, e.g. AFAIK no-one else (not even other hypervisor vendors) is yet shipping SR-IOV support...
lutter: yeah, Fedora is very likely the first place where you see a lot of new hardware virt features supported in OSS, mostly since so many upstream maintainers/developers for virt-related stuff work at RH and generally push their work to Fedora 'by default' .. spin that any way you want to avoid a distro war ;)
<Q> mchua: All while maintaining a consistent, familiar interface - as rwmjones pointed out, folks using virt-manager and virsh on F6 are still using the same commands. Though now they also have the option to use additional tools like guestfish to script the process (so, alternative-but-even-easier interface).
<A> lutter: we also added the capability to deploy and build appliances (through virt-install/virt-image and the thincrust project)