Tools/XenoprofGuide

What's Xenoprof?
OProfile is a system-wide, statistical, low-overhead profiler for Linux. Xenoprof is its adaptation to Xen.

Xen doesn't virtualize performance counters, because that's too expensive with current hardware. Therefore, you can't just run OProfile in a guest as if it were running on bare metal. Instead, you run Xenoprof in the host (hypervisor and dom0) and in the guests.

Why this guide?
The current version of Xenoprof is somewhat awkward to use: starting and stopping the profiler involves several steps, and when you misstep, the resulting diagnostics can be rather confusing, especially when you don't know how Xenoprof works.

This guide attempts to help you understand how Xenoprof works, so that the steps to start and stop it become logical rather than just voodoo. It does not give you a recipe to follow, or even a fully worked-out example. Perhaps it should.

How does Xenoprof work?
Like stock OProfile, Xenoprof needs to take samples, write them to disk, and connect samples to programs and ultimately source code.

Sampling is driven by performance counter hardware: the profiler programs the performance counters to count certain events and to raise an interrupt after a certain number of them have been counted. The interrupt service routine takes a sample.
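The counter-overflow mechanism can be illustrated with a toy simulation. This is only a sketch, not Xenoprof or OProfile code: the class name SamplingCounter, the reset_count parameter, and the PC values are all invented for illustration.

```python
from collections import Counter


class SamplingCounter:
    """Toy model of a performance counter that 'interrupts' every reset_count events."""

    def __init__(self, reset_count):
        if reset_count <= 0:
            raise ValueError("reset_count must be positive")
        self.reset_count = reset_count
        self.remaining = reset_count
        self.samples = Counter()  # sampled PC -> number of samples

    def event(self, pc):
        """Count one event that occurred while executing at address pc."""
        self.remaining -= 1
        if self.remaining == 0:
            # Overflow: the simulated interrupt service routine records the PC.
            self.samples[pc] += 1
            self.remaining = self.reset_count


def run_demo():
    counter = SamplingCounter(reset_count=100)
    # Hypothetical workload: 900 events at PC 0x1000, then 100 events at PC 0x2000.
    for _ in range(900):
        counter.event(0x1000)
    for _ in range(100):
        counter.event(0x2000)
    return dict(counter.samples)
```

Because a sample is taken only once per reset_count events, code that causes more events collects proportionally more samples, which is what makes the profile statistical.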

Code working with (unvirtualized) performance counters could run in the hypervisor or in dom0. The former is more efficient, because it's closer to the hardware, and that's what Xenoprof does.

Code writing samples to disk can't run in the hypervisor. Only domains can do that.

To interpret profiles, you need to know what the sampled program counter (PC) values mean. OProfile can do that, but only for the domain it's running in. Xenoprof additionally makes dom0 capable of doing that for guest kernels, but not guest userland.
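Conceptually, interpreting a sample means mapping its PC to the symbol that contains it. The following is a simplified sketch of that idea, assuming a symbol table given as a sorted list of (start address, name) pairs. The function name resolve_pc and the example addresses and symbol names are hypothetical; the real tools read symbol information from the binaries themselves.

```python
import bisect


def resolve_pc(pc, symbols):
    """Return the name of the symbol containing pc, or None if pc precedes all symbols.

    symbols: list of (start_address, name) tuples sorted by start_address.
    """
    starts = [start for start, _ in symbols]
    # Index of the last symbol whose start address is <= pc.
    index = bisect.bisect_right(starts, pc) - 1
    if index < 0:
        return None
    return symbols[index][1]


# Hypothetical symbol table for illustration only.
EXAMPLE_SYMBOLS = [(0x1000, "do_work"), (0x1800, "do_idle")]
```

A domain can only do this lookup for binaries whose symbols it has access to, which is why dom0 can interpret samples from a guest kernel but not from guest userland.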

Knowing the above, the way Xenoprof works becomes pretty obvious:

 * The hypervisor takes samples.
 * Active domains run OProfile. They receive their own samples from the hypervisor, and they know what these samples mean. Only paravirtualized domains can be active.
 * Dom0 is always active, and additionally receives samples for the hypervisor and for passive domains. Passive domains don't run OProfile. Interpretation of passive domains' samples is very limited: dom0 can interpret samples in Linux kernels, but no more.
 * Domains that are neither active nor passive don't participate in the profile. The hypervisor discards their samples.
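The routing rules in the list above can be summarized in a few lines of code. This is a sketch of the rules as described here, not of the actual hypervisor implementation; the names Role, HYPERVISOR, and route_sample, as well as the domain names, are made up for illustration.

```python
from enum import Enum


class Role(Enum):
    ACTIVE = "active"
    PASSIVE = "passive"
    NONE = "none"  # neither active nor passive


HYPERVISOR = "xen"  # hypothetical label for samples taken in hypervisor code


def route_sample(source, roles):
    """Return the domain that receives a sample, or None if it is discarded.

    source: HYPERVISOR or a domain name such as "dom0" or "domU1".
    roles:  mapping of domU name -> Role; dom0 is always treated as active.
    """
    if source == HYPERVISOR or source == "dom0":
        return "dom0"  # dom0 gets its own and the hypervisor's samples
    role = roles.get(source, Role.NONE)
    if role is Role.ACTIVE:
        return source  # an active domain receives its own samples
    if role is Role.PASSIVE:
        return "dom0"  # dom0 collects on behalf of passive domains
    return None  # non-participating domain: the sample is discarded
```

For example, with roles = {"domU1": Role.ACTIVE, "domU2": Role.PASSIVE}, samples from domU1 stay in domU1, samples from domU2 go to dom0, and samples from any other domU are dropped.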

How to start and stop Xenoprof
Steps to start Xenoprof:

1. Run opcontrol --start-daemon in dom0 to set things up in the hypervisor and dom0.

2. Run opcontrol --start in every active domU (this requires the hypervisor to have been set up).

3. Run opcontrol --start in dom0 (this requires the active domUs to have been started); this starts sampling.
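The ordering of the start steps can be written down as a small helper that lists which command to run in which domain. This is merely an illustrative sketch: the function name start_plan and the domU names are hypothetical, and the helper only builds a list; it does not run anything. Only the opcontrol options named in the steps above are used.

```python
def start_plan(active_domus):
    """Return (domain, command) pairs in the order the start steps require."""
    # Step 1: set up the hypervisor and dom0.
    plan = [("dom0", "opcontrol --start-daemon")]
    # Step 2: start every active domU (needs step 1 to have completed).
    plan += [(domu, "opcontrol --start") for domu in active_domus]
    # Step 3: start dom0 last; this is what actually begins sampling.
    plan.append(("dom0", "opcontrol --start"))
    return plan


if __name__ == "__main__":
    for domain, command in start_plan(["domU1", "domU2"]):
        print(f"{domain}: {command}")
```

Note that dom0 appears twice, first and last, with the active domUs in between.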

Steps to stop Xenoprof:

1. Run opcontrol --stop in dom0; this stops sampling.

2. Run opcontrol --shutdown in all active domains (dom0 included, since it is always active).
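The stop sequence can be sketched the same way. The function name stop_plan and the domU names are hypothetical, and nothing is executed. The guide does not say in which order the active domains should be shut down, so the order used here (domUs first, then dom0) is an arbitrary assumption.

```python
def stop_plan(active_domus):
    """Return (domain, command) pairs in the order the stop steps require."""
    # Step 1: stopping in dom0 is what ends sampling.
    plan = [("dom0", "opcontrol --stop")]
    # Step 2: shut down in all active domains. dom0 is always active.
    # Assumption: the relative order of these shutdowns is not specified.
    plan += [(domu, "opcontrol --shutdown") for domu in active_domus]
    plan.append(("dom0", "opcontrol --shutdown"))
    return plan


if __name__ == "__main__":
    for domain, command in stop_plan(["domU1", "domU2"]):
        print(f"{domain}: {command}")
```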

Resources

 * OProfile home: http://oprofile.sourceforge.net
 * OProfile tutorials and more: http://people.redhat.com/wcohen/
 * Xenoprof home: http://xenoprof.sourceforge.net
 * Xenoprof tutorial: http://xen.xensource.com/files/summit_3/xenoprof_tutorial.pdf