How to debug Virtualization problems

From FedoraProject

Revision as of 21:26, 12 July 2010 by Crobinso (Talk | contribs)

Effective bug reporting

Reporting bugs effectively is an important skill for any Fedora user or developer.

Narrowing down the possible causes of the bug and providing the right information in the bug report allows a bug to be resolved quickly. A bug report with little useful information may lie unresolved, possibly until it is closed automatically when the distribution version reaches "end of life".

See BugsAndFeatureRequests and "how to file a bug report" for generic information on filing bugs. This page contains information specific to virtualization bugs.

Note: if you're filing a virtualization-related bug against a package which isn't on this list, then please CC the fedora-virt-maint@redhat.com alias in Bugzilla to ensure virt developers see the bug.

Version Information

Once you've ensured you have the latest updates installed for the relevant packages, gather details of the version numbers of those packages e.g.

$> rpm -q qemu-kvm qemu-common python-virtinst virt-viewer virt-manager

To find out what kernel version you are currently running, and what machine architecture you're using:

$> uname -a

Of course, you should also make sure to file the bug using the appropriate version of Fedora. Rawhide users should file bugs using the "rawhide" version.
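
The version details above can be captured in one go and attached to the bug. The following is a minimal sketch, not an official tool: the function name collect_versions is a hypothetical helper, and the package list is simply the one from the rpm example above.

```shell
# Sketch only: collect_versions is a hypothetical helper, not a Fedora tool.
collect_versions() {
    echo "== packages =="
    if command -v rpm >/dev/null 2>&1; then
        # rpm -q reports "package X is not installed" for anything missing,
        # which is worth keeping in the report, so ignore its exit status.
        rpm -q qemu-kvm qemu-common python-virtinst virt-viewer virt-manager || true
    else
        echo "rpm not found"
    fi
    echo "== kernel and architecture =="
    uname -a
}
```

After defining the function, running collect_versions > virt-versions.txt 2>&1 writes the report to a file (the file name is only an example).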

Hardware Information

Fedora's virtualization capabilities rely heavily on hardware capabilities, so when filing bugs please include copious information on your hardware platform including:

$> cat /proc/cpuinfo
$> lspci -vvv

You can also check what virtualization capabilities are available on your machine by running:

$> virsh capabilities

Guest Configuration

When filing a bug related to problems seen in the guest, include full details on the guest configuration including CPU architecture, RAM size, devices etc. This is most easily done by including the output of virsh dumpxml MyGuest or, in the case of qemu, the full qemu command line.

Virt Manager

Virt Manager stores a logfile in ~/.virt-manager/virt-manager.log.

Examine the log file and include any pieces that look like they might be useful in the bug report. If in doubt, attach the whole file to the bug.

You can also run virt-manager from the command line using virt-manager --no-fork and check whether any relevant messages were printed there.

virt-install

virt-install stores a log file in ~/.virtinst/virt-install.log.

Run virt-install using the --debug option to get detailed debug spew.

To gain access to a serial console during the install, you can use -x "console=ttyS0". Combining a serial console with a VNC install can be very useful for debugging, e.g. --nographics -x "console=ttyS0 vnc"

libvirt

Any program using libvirt can be debugged using the LIBVIRT_DEBUG=1 environment variable e.g.

$> LIBVIRT_DEBUG=1 virt-manager --no-fork
$> LIBVIRT_DEBUG=1 virsh list --all

If your issue looks like it might be related to libvirtd, try looking in /var/log/messages for any error messages.

You can also use /etc/libvirt/libvirtd.conf logging configuration to e.g. log debug spew to a file:

log_level = 1
log_outputs = "0:file:/tmp/libvirtd.log"

Alternatively, you could try running libvirtd from the command line with debugging options enabled:

$> service libvirtd stop
$> LIBVIRT_DEBUG=1 libvirtd --verbose

Networking

If you are having trouble with guests connected to a libvirt virtual network, shared physical interface or bridge, try these commands:

$> virsh net-list --all
$> brctl show
$> sysctl net.bridge.bridge-nf-call-iptables
$> iptables -L -v -n
$> ps -ef | grep dnsmasq
$> ifconfig -a
$> cat /proc/sys/net/ipv4/ip_forward
$> service libvirtd reload
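
The read-only commands above can be gathered into a single report. This is a rough sketch under a few assumptions: run_if_present and collect_net_info are hypothetical helper names, most of the commands need root to give complete output, and service libvirtd reload is deliberately left out because it changes state rather than reporting it.

```shell
# Sketch only: both function names are hypothetical helpers.

# Print a header, then run the command if it is installed.
run_if_present() {
    echo "== $* =="
    if command -v "$1" >/dev/null 2>&1; then
        "$@" 2>&1 || echo "(command exited with status $?)"
    else
        echo "(command $1 not found)"
    fi
}

# Run each diagnostic command from the list above in turn.
collect_net_info() {
    run_if_present virsh net-list --all
    run_if_present brctl show
    run_if_present sysctl net.bridge.bridge-nf-call-iptables
    run_if_present iptables -L -v -n
    run_if_present ifconfig -a
    run_if_present cat /proc/sys/net/ipv4/ip_forward
    echo "== dnsmasq processes =="
    # The [d] trick stops grep from matching its own process entry.
    ps -ef | grep '[d]nsmasq' || echo "(no dnsmasq process found)"
}
```

Running collect_net_info > virt-network.txt as root produces a file that can be attached to the bug report (the file name is only an example).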

If you find that /proc/sys/net/ipv4/ip_forward is not being set to 1 at boot time, try looking at the ordering of the libvirtd and NetworkManager services:

$> find /etc/rc.d -regex '.*rc[35].d/S.*\(libvirtd\|NetworkManager\)'
$> rm -f /etc/chkconfig.d/libvirtd /etc/chkconfig.d/NetworkManager
$> chkconfig libvirtd resetpriorities
$> chkconfig NetworkManager resetpriorities
$> find /etc/rc.d -regex '.*rc[35].d/S.*\(libvirtd\|NetworkManager\)'

kvm

See also the KVM wiki page on reporting bugs.

The output of any qemu-kvm command run by libvirtd is stored in /var/log/libvirt/qemu/GuestName.log.

kvm-autotest is an excellent way of testing basic KVM functionality.

xen

If a guest is crashing you can obtain a stack trace by doing the following:

  • Set "on_crash=preserve" in your domain config
  • Copy the guest kernel's System.map to the host
  • Once the guest has crashed, run /usr/lib/xen/bin/xenctx -s System.map <domid>

General Tips

System Log Files

Always look in dmesg, /var/log/messages etc. for any useful information.

strace

strace can often shed light on a bug - e.g. if you run virt-manager, or libvirtd or qemu-kvm under strace you can see what files they accessed, what commands they executed, what system calls they invoked etc.:

$> strace -ttt -f libvirtd

If the program in question is already running, you can attach to it using strace -p.

gdb

gdb can often be useful to trace the execution of a program. However, in order to get usable information, you will need to install "debuginfo" packages. See the StackTraces page for more information.

SELinux

If you see "AVC denied" or "setroubleshoot" messages in /var/log/messages, your bug might be caused by an SELinux policy issue. Try temporarily putting SELinux into "permissive" mode with:

$> setenforce 0

If this makes your bug go away, that doesn't mean your bug is fixed; it just narrows down the cause! You should include the AVC details from ausearch -m AVC -ts recent in the bug report, or, if the message includes a sealert -l command, include the details printed by that command.
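
To check quickly whether a log file contains the kind of messages mentioned above, something like the following can help. It is only a sketch: count_selinux_hints is a hypothetical helper name, and the pattern is a loose guess at how such lines usually look rather than an exact match for every audit message format.

```shell
# Sketch only: count_selinux_hints is a hypothetical helper.
# Prints the number of lines in the given log file that mention
# an AVC denial or setroubleshoot (case-insensitive).
count_selinux_hints() {
    grep -c -i -E 'avc:?[[:space:]]*denied|setroubleshoot' "$1"
}
```

For example, count_selinux_hints /var/log/messages (run as root). A count above zero suggests SELinux is worth investigating; a count of zero does not rule it out, since denials may only be recorded in the audit log.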

One common cause of SELinux problems is mis-labelled files. Try:

$> restorecon /path/to/file/in/selinux/message

If you are installing using an ISO on an NFS mount, you need to ensure that it is mounted using the virt_content_t label:

$> mount -o context="system_u:object_r:virt_content_t:s0" ...

If you are using libvirt storage pools (such as NFS) or USB pass-through, you might want to check or toggle one of the following SELinux booleans: virt_use_comm, virt_use_fusefs, virt_use_nfs, virt_use_samba, virt_use_usb.

$> getsebool virt_use_nfs
virt_use_nfs --> off
$> setsebool -P virt_use_nfs on
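
To see the state of all the booleans listed above at once, a small loop is enough. This is a sketch: show_virt_booleans is a hypothetical helper name, and it only reads the values, it does not change them.

```shell
# Sketch only: show_virt_booleans is a hypothetical helper.
# Prints the current value of each virt_use_* boolean named above.
show_virt_booleans() {
    if ! command -v getsebool >/dev/null 2>&1; then
        echo "getsebool not found"
        return 0
    fi
    for b in virt_use_comm virt_use_fusefs virt_use_nfs virt_use_samba virt_use_usb; do
        # Some booleans may not exist in every policy version; keep going.
        getsebool "$b" 2>&1 || true
    done
}
```

Including the output of show_virt_booleans in the bug report saves a round trip when the problem involves NFS, Samba or USB devices.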

Troubleshooting

Permission issues

Prior to Fedora 11/libvirt 0.6.1, all virtual machines run through libvirt were run as root, giving full administrator capabilities. While this simplified VM management, it was not very security conscious: a compromised virtual machine could possibly have administrator privileges on the host machine.

In Fedora 11/libvirt-0.6.1, security started to improve with the addition of SVirt. In a nutshell, libvirt attempts to automatically apply SELinux labels to every file a VM needs to use, like disk images. If a VM tries to open a file that libvirt didn't label, permission will be denied.

Fedora 12 saw things improve even more. As of libvirt-0.6.5, VMs were launched with reduced process capabilities. This prevented the VM from doing things like altering host network configuration (something it shouldn't typically need to do). And as of libvirt-0.7.0, the VM emulator process was no longer run as 'root' by default, instead being run as an unprivileged 'qemu' user.

While all these changes are great for security, they broke previously working setups which depended on the relaxed VM permissions. Most issues have workarounds that come at the expense of security. Over time, many of these issues should be made to 'just work', but we aren't there yet.

Changing the QEMU/KVM process user

Warning: Changing the QEMU/KVM process user has security implications.

To change the user that libvirt will run the QEMU/KVM process as, edit /etc/libvirt/qemu.conf, then uncomment and change the user= and group= fields. For example, to run KVM as the user 'foobar', set the fields to:

...
user='foobar'
group='foobar'
...

Then restart libvirtd with:

service libvirtd restart

Changing SVirt/SELinux configuration

Warning: Changing the SVirt/SELinux settings may have security implications.

SVirt can be disabled for the libvirt QEMU driver by editing /etc/libvirt/qemu.conf, then uncommenting and setting:

security_driver='none'

Then restart libvirtd with:

service libvirtd restart

Changing QEMU/KVM process capabilities

Warning: Changing this setting has security implications.

Libvirt by default launches QEMU/KVM guests with reduced process capabilities. To disable this feature, edit /etc/libvirt/qemu.conf, then uncomment and set:

clear_emulator_capabilities=0

Then restart libvirtd with:

service libvirtd restart
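
When reporting a permission problem, it helps to state which of the qemu.conf settings from the preceding subsections have been changed from their defaults. The following sketch lists only the uncommented assignments; show_qemu_conf_overrides is a hypothetical helper name, and the match is a simple pattern rather than a real parser for the file format.

```shell
# Sketch only: show_qemu_conf_overrides is a hypothetical helper.
# Prints uncommented assignments of the settings discussed above
# (user, group, security_driver, clear_emulator_capabilities).
show_qemu_conf_overrides() {
    grep -E '^[[:space:]]*(user|group|security_driver|clear_emulator_capabilities)[[:space:]]*=' "$1"
}
```

For example, show_qemu_conf_overrides /etc/libvirt/qemu.conf. No output means none of these settings has been uncommented, so libvirt is using its built-in defaults for them.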

KVM performance issues

VM slowness is often caused by the VM using plain QEMU emulation instead of KVM.

Ensuring system is KVM capable

Verify that the KVM kernel modules are properly loaded:

$ lsmod | grep kvm
kvm
kvm_intel

If that command did not list kvm_intel or kvm_amd, KVM is not properly configured. See this KVM wiki page to ensure your hardware supports virtualization extensions. If it doesn't, you cannot use KVM acceleration; plain QEMU is the only option.
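
One quick local check for virtualization extensions is to look for the vmx (Intel VT-x) or svm (AMD-V) flag in /proc/cpuinfo. The following is a minimal sketch; cpu_virt_flags is a hypothetical helper name, and the optional file argument exists only so the function can be tried against a saved copy of cpuinfo.

```shell
# Sketch only: cpu_virt_flags is a hypothetical helper.
# Prints "vmx" and/or "svm" if the CPU flags include them, nothing otherwise.
cpu_virt_flags() {
    grep -E '^flags' "${1:-/proc/cpuinfo}" | tr ' \t' '\n\n' | grep -E '^(vmx|svm)$' | sort -u
}
```

An empty result suggests that the CPU lacks the extensions or that they are hidden from this system. The flag can still appear when the feature is disabled in the BIOS, so a positive result does not replace the dmesg check.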

If your hardware does support virtualization extensions, try to reload the kernel modules with:

su -c 'bash /etc/sysconfig/modules/kvm.modules'

Retry the above lsmod command and see if you get the desired output. If not, or if the kvm.modules command produces an error, check the output of:

dmesg | grep -i kvm

If you see 'KVM: disabled by BIOS', please see the relevant KVM wiki page. Any other error message is probably a bug, and should be reported.

If all that works out fine, make sure that your VMs are actually using KVM.

Is My Guest Using KVM?

Often people are unsure whether their qemu guest is actually using hardware virtualization via KVM.

Firstly, check that libvirt thinks KVM is available:

 $> virsh capabilities  | grep kvm
     <domain type='kvm'>
       <emulator>/usr/bin/qemu-kvm</emulator>

and that the guest is configured to use KVM:

  $> virsh dumpxml ${guest} | grep kvm
  <domain type='kvm' id='18'>
      <emulator>/usr/bin/qemu-kvm</emulator>

If that does not return anything, edit the guest configuration so that it uses <domain type='kvm'> and <emulator>/usr/bin/qemu-kvm</emulator>, using the command:

virsh edit ${guest}

Next, look in /var/log/libvirt/qemu/${guest}.log to check that /usr/bin/qemu-kvm is the emulator that was executed by libvirt and that there are no error messages about /dev/kvm.
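
The log check described above can be scripted roughly as follows. This is a sketch under stated assumptions: check_guest_log is a hypothetical helper name, and because the exact wording of KVM error messages varies, it simply prints every line that mentions /dev/kvm for you to read.

```shell
# Sketch only: check_guest_log is a hypothetical helper.
# Reports whether qemu-kvm appears in the guest log and shows any
# lines that mention /dev/kvm (with their line numbers).
check_guest_log() (
    log=$1
    if grep -q '/usr/bin/qemu-kvm' "$log"; then
        echo "emulator: qemu-kvm was executed"
    else
        echo "emulator: qemu-kvm not found in log"
    fi
    grep -n '/dev/kvm' "$log" || echo "no /dev/kvm messages"
)
```

For example, check_guest_log /var/log/libvirt/qemu/MyGuest.log, where MyGuest stands for the name of your guest. Reading the log usually requires root.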

If you want to get really funky, you can check whether qemu-kvm has /dev/kvm open:

  $> for pid in $(pgrep -x qemu-kvm); do for fd in /proc/$pid/fd/*; do readlink $fd; done; done | grep kvm
  anon_inode:kvm-vcpu
  /dev/kvm
  anon_inode:kvm-vm

Serial console access for troubleshooting and management

Serial console access is useful for debugging kernel crashes, and it can also be very helpful for remote management.

A fully virtualized guest OS automatically has a serial console configured, but the guest kernel is not configured to use it out of the box. To enable the guest console in a fully virtualized Linux guest, edit /etc/grub.conf in the guest and add 'console=ttyS0 console=tty0' to the kernel line. This ensures that all kernel messages are sent to both the serial console and the regular graphical console. The serial console can then be accessed in the same way as for paravirt guests:

su -c "virsh console <domain name>"
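
The grub.conf change described above, adding the console arguments to each kernel line, can be previewed with sed before editing the file by hand. This is a sketch that assumes a GRUB legacy style config where kernel lines start with the word kernel; add_serial_console is a hypothetical helper name, and it prints the modified copy rather than touching the original.

```shell
# Sketch only: add_serial_console is a hypothetical helper.
# Prints a copy of the given grub.conf with the console arguments appended
# to every kernel line that does not already contain console=ttyS0.
# The input file itself is left unmodified.
add_serial_console() {
    sed -e '/^[[:space:]]*kernel[[:space:]]/{' \
        -e '/console=ttyS0/!s/$/ console=ttyS0 console=tty0/' \
        -e '}' "$1"
}
```

Inside the guest, add_serial_console /etc/grub.conf shows what the result would look like. Once it looks right, make the same change in the real file and reboot the guest.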

Alternatively, the graphical virt-manager program can display the serial console. Simply display the 'console' or 'details' window for the guest and select 'View -> Serial console' from the menu bar. virt-manager may need to be run as root to have sufficient privileges to access the serial console.

Graphical console access

In order to get a graphical console on your guest you can either use 'virt-manager' and select the console icon for the guest, or you can use the 'virt-viewer' tool to just directly connect to the console:

virt-viewer guestname 

Accessing data on guest disk images

Warning: Remember never to do this while the guest is up and running, as it could corrupt the filesystem.

The 'guestfish' package allows you to use a simple shell interface to manipulate guest disk images without needing to run the guest.

su -c 'yum install guestfish'

See 'man guestfish' and guestfish recipes for information and some common recipes. guestfish can also be scripted to modify a whole batch of guest disk images in one run.

Known issues

Audio output

Audio has always been difficult to get working with libvirt, but the recent security changes have actually provided the mechanisms to make it work. The primary problem is that the VM is not sending sound output to your user's pulseaudio session. There may be a pulseaudio option to work around this issue, but I've managed to make it work with:

This will eventually be solved out of the box by having the VNC graphical client receive audio from the VM and play it as the current user. Some code exists to handle this for virt-viewer/virt-manager, but it isn't 100% complete yet. For more info, see bug 595880, SDL audio bug 536692, and VNC audio bug 508317.

SDL Graphics

QEMU needs access to your $XAUTHORITY file in order to use SDL graphics.

  • Configure SDL graphics for your VM. Easiest way to do this:
$> echo "<graphics type='sdl' display='$DISPLAY' xauth='$XAUTHORITY'/>"
<graphics type='sdl' display=':0.0' xauth='/home/cole/.Xauthority'/>
(copy that string)
$> su -c "virsh edit $vmname"
(stick that string somewhere in the <devices> block, remove any other <graphics> devices)
  • Give VM user access to your $XAUTHORITY file. The default VM user in Fedora 12+ is 'qemu', so you can provide read access with
    setfacl -m u:qemu:r $XAUTHORITY
    If you get an 'operation not supported' error, you can optionally provide less discerning read access with
    chmod +r $XAUTHORITY
    Beware, this probably has security implications. If that does not work, you can optionally change the VM user to either root (the behavior of older Fedora versions) or to your own regular user.

Errors using <interface type='ethernet'/>

Libvirt's default behavior of dropping QEMU/KVM process capabilities prevents <interface type='ethernet'/> from working correctly. You can try:

If that isn't sufficient, you may want to try the following:

PCI device assignment

Libvirt's default behavior of dropping QEMU/KVM process capabilities prevents PCI device assignment from working correctly. See bug 573850 for more info. I only managed to get this working with the following steps: