From Fedora Project Wiki
Latest revision as of 11:12, 21 September 2011

Virtual Machine Lock Manager

Summary

The virtual machine lock manager is a daemon which will ensure that a virtual machine's disk image cannot be written to by two QEMU/KVM processes at the same time. It provides protection against starting the same virtual machine twice, or adding the same disk to two different virtual machines.

Owner

Current status

  • Targeted release: Fedora 16
  • Last updated: 21-09-2011
  • Percentage of completion: 100%

Detailed Description

Virtual machines running via the QEMU/KVM platform do not currently acquire any kind of lock when starting up. This means it is possible for the same virtual machine to be accidentally started more than once, or for the same disk image to be accidentally added to two different virtual machines. The result of such a mistake is likely to be catastrophic destruction of the virtual machine's filesystem.

The virtual machine lock manager is a framework embedded in the libvirtd daemon that allows for pluggable locking mechanisms. The first available plugin, introduced in F16, integrates with the 'sanlock' program. This will protect against adding the same disk to two different virtual machines, and protect against libvirtd bugs where it might "forget" about a previously running virtual machine. If the administrator mounts a suitable shared filesystem (eg, NFS) at the sanlock lease directory, /var/lib/libvirt/sanlock, then the lock manager protection will be extended to all hosts sharing that filesystem.

Later Fedora releases will introduce alternative lock manager implementations.

Benefit to Fedora

Hosts running virtual machines with QEMU/KVM will have much stronger protection against administrator host/cluster configuration mistakes. This will reduce the risk that a virtual machine's disk image will become corrupted as a result.

Scope

The changes are confined to the libvirt and sanlock packages:

- The new 'sanlock' RPM is introduced to Fedora
- The new 'libvirt-lock-sanlock' sub-RPM is introduced to the libvirt.spec file
- The /etc/libvirt/qemu.conf file will gain a configuration parameter to set the lock manager implementation
- A new /etc/libvirt/qemu-sanlock.conf file is introduced for sanlock lock manager configuration

How To Test

There are no special hardware requirements for testing this feature, beyond those already required for running QEMU/KVM virtual machines.

General host setup

Install libvirt, KVM, etc as per normal practice. Additionally install the 'augeas' (which provides 'augtool'), 'libvirt-lock-sanlock' and 'sanlock' RPMs using yum.

The sanlock plugin requires a directory in which it will store leases. For single host protection, this directory can be a local filesystem, but for cross-host protection it needs to be a network filesystem like NFS, or cluster filesystem like GFS. By convention the directory should be '/var/lib/libvirt/sanlock'.

Each host that shares the same filesystem for leases needs to be allocated a *unique* host ID, between 1 and 512.

With this in mind the basic configuration for sanlock can be done with the following augeas commands:

 $ augtool
 augtool> set /files/etc/libvirt/qemu.conf/lock_manager "sanlock"
 augtool> set /files/etc/libvirt/qemu-sanlock.conf/host_id 1
 augtool> set /files/etc/libvirt/qemu-sanlock.conf/auto_disk_leases 1
 augtool> set /files/etc/libvirt/qemu-sanlock.conf/disk_lease_dir "/var/lib/libvirt/sanlock"
 augtool> save
 Saved 1 file(s)
 augtool> quit

Change the 'host_id' value so that every host sharing the lease directory has its own unique ID.
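
Since a duplicate or out-of-range host ID is an easy mistake to make when configuring several hosts, a small sanity check can be run before the value is written. The following is a minimal sketch, assuming only the 1 to 512 range stated above; `valid_host_id` is a hypothetical helper name and is not part of libvirt or sanlock. It cannot detect duplicates across hosts, only malformed values.

```shell
# Hypothetical helper (not part of libvirt or sanlock):
# succeed only if $1 is a whole number in the range 1..512.
valid_host_id() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit character
    esac
    [ "$1" -ge 1 ] && [ "$1" -le 512 ]
}

# Example use in a setup script:
#   valid_host_id "$HOST_ID" || echo "host_id must be between 1 and 512" >&2
```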

By default sanlock uses a software watchdog to ensure that the host is automatically hard rebooted if something goes wrong. This is disruptive during testing, so disable the sanlock watchdog and then start the sanlock daemon:

 $ echo 'SANLOCKOPTS="-w 0"' > /etc/sysconfig/sanlock
 $ service sanlock start

Single host testing

- Follow the 'General host setup' instructions
- Restart the libvirtd daemon
- Provision two virtual machines
- Create a third disk image  (eg dd if=/dev/zero of=/var/lib/libvirt/images/extra.img bs=1M count=100)
- Add the following XML to the configuration of both virtual machines
     <disk type='file' device='disk'>
       <source file='/var/lib/libvirt/images/extra.img'/>
       <target dev='vdb' bus='virtio'/>
     </disk>
- Start the first virtual machine
- Attempt to start the second virtual machine

The last step should fail, with a message that the disk image is already in use.

- Stop the first virtual machine
- Attempt to start the second virtual machine

The second VM should now successfully run.
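
When repeating this test with different image paths, it may be convenient to generate the shared-disk XML rather than typing it by hand. This is a sketch only: `extra_disk_xml` is a hypothetical helper, and the element layout simply mirrors the snippet shown in the steps above.

```shell
# Hypothetical helper: print a virtio <disk> element for the image file $1,
# exposed to the guest as target device $2 (defaults to vdb, as in the test above).
extra_disk_xml() {
    cat <<EOF
<disk type='file' device='disk'>
  <source file='$1'/>
  <target dev='${2:-vdb}' bus='virtio'/>
</disk>
EOF
}

# Example: extra_disk_xml /var/lib/libvirt/images/extra.img > extra-disk.xml
```

The resulting element can then be pasted into each guest's configuration, for example with `virsh edit`.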


Dual host testing

- Follow the 'General host setup' instructions, on both hosts
- Mount an NFS volume at /var/lib/libvirt/sanlock on both hosts
- Restart the libvirtd daemon on both hosts
- Provision a virtual machine
- Copy the virtual machine configuration to the second host
        virsh dumpxml myguest > myguest.xml
        virsh -c qemu+ssh://otherhost/system define myguest.xml
- Start the virtual machine on the first host
- Attempt to start the virtual machine on the second host

The last step should fail, with a message that the disk image is already in use.

- Stop the virtual machine on the first host
- Attempt to start the virtual machine on the second host

The VM should now successfully run on the second host.
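
Both the single host and dual host tests hinge on a start attempt that is expected to be refused. If the steps are scripted, a wrapper that treats failure as the passing outcome makes that expectation explicit. The sketch below is an illustration only; `expect_failure` is a hypothetical helper name, and it does not inspect the error message, so the "already in use" text should still be checked by hand.

```shell
# Hypothetical helper: run the given command and succeed only if it FAILS.
expect_failure() {
    if "$@" >/dev/null 2>&1; then
        echo "UNEXPECTED SUCCESS: $*" >&2
        return 1
    fi
    echo "failed as expected: $*"
}

# Example, using the names from the dual host test:
#   expect_failure virsh -c qemu+ssh://otherhost/system start myguest
```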

Migration testing

- As per "Dual host testing"
- Attempt to migrate the running VM from the first host to the second host

Libvirtd failure testing

- As per "Single host testing"
- Start the first virtual machine
- Stop the libvirtd daemon, without stopping the VM
- Delete the files /var/run/libvirt/qemu/myguest.{pid,xml}  (this orphans the VM from libvirtd)
- Start the libvirtd daemon
- Attempt to start the first virtual machine again

The last step should fail, with a message that the disk image is already in use.

- Find the orphaned QEMU process and manually kill it
- Attempt to start the first virtual machine again

The VM should now once again run successfully
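
Finding the orphaned QEMU process usually means searching the process list for the guest name. The sketch below assumes that libvirt started QEMU with a `-name myguest` argument pair, which should be verified against the actual command line on the host; `find_guest_pids` is a hypothetical helper that filters `PID COMMAND...` lines read from standard input.

```shell
# Hypothetical helper: read "PID COMMAND ARGS..." lines on stdin and print the
# PID of every line whose arguments contain "-name GUEST" (GUEST is $1).
# Assumption: the guest name appears as a separate word right after -name.
find_guest_pids() {
    awk -v guest="$1" '{
        for (i = 2; i < NF; i++)
            if ($i == "-name" && $(i + 1) == guest) { print $1; break }
    }'
}

# Example: ps -eo pid,args | find_guest_pids myguest
```

Review the printed PIDs before passing any of them to `kill`.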

User Experience

End users should see no difference in behaviour of QEMU/KVM virtualization during normal operation.

They will be prevented from making certain configuration/operational mistakes which would otherwise result in the same disk image being run twice.


Dependencies

The feature is confined to the 'libvirt' package

Contingency Plan

The use of 'sanlock' is an explicit administrator 'opt in', thus no contingency plan is required. The user can simply run without a lock manager, in which case the behaviour will be identical to previous Fedora releases.

Documentation

The primary upstream documentation is at:

  • http://libvirt.org/locking.html

Release Notes

  • The QEMU/KVM virtualization driver in libvirt includes an optional lock manager plugin to enforce exclusive access to the virtual machine disk images on a single host. This prevents multiple guests being started with the same disk image, unless the <shareable/> flag is set for the disk
  • If a shared filesystem (eg NFS) is mounted at the sanlock lease directory, /var/lib/libvirt/sanlock, the protection extends across multiple hosts in the network
  • If configuring locking across multiple hosts it is important to ensure that all disk image paths are globally unique across all hosts sharing the same NFS mount, and that block devices use the stable unique names under /dev/disk/by-path/ and not the unstable /dev/sdNN names
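
To spot guests that still reference unstable /dev/sdNN names, the domain XML can be scanned for block device sources. This is a rough sketch under the assumption that block devices appear as `<source dev='...'/>` attributes in the XML; `find_unstable_block_sources` is a hypothetical helper and only flags /dev/sd* paths, it does not judge any other naming scheme.

```shell
# Hypothetical helper: read domain XML on stdin and print every <source>
# line that points at an unstable /dev/sdNN block device name.
# The exit status is 0 if at least one such line was found, non-zero otherwise.
find_unstable_block_sources() {
    grep -E "<source dev=['\"]/dev/sd"
}

# Example: virsh dumpxml myguest | find_unstable_block_sources
```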

Comments and Discussion