KernelDebugStrategy

From FedoraProject

Revision as of 18:10, 12 November 2012 by Kparal (Talk | contribs)


Kernel Debugging Strategy

Fedora enables differing amounts of debugging in the kernel depending on where we are in the release cycle. During development, the rawhide/branched kernels typically have the debug options enabled. The only exception is that a non-debug build is done for every -rc rebase. This build is not guaranteed to make it into the repo, as it may be obsoleted quickly by a subsequent debug build. (You can still get the non-debug build from koji.)

There are a couple of ways to tell which is a debug build and which isn't before you install a kernel. The first is to look at the RPM changelog. If it says something like:

* Mon Sep 17 2012 Josh Boyer <jwboyer@redhat.com> - 3.6.0-0.rc6.git0.1
- Linux v3.6-rc6
- Disable debugging options.

then it is a non-debug kernel. If it says something like:

* Mon Sep 17 2012 Josh Boyer <jwboyer@redhat.com> - 3.6.0-0.rc6.git0.2
- Reenable debugging options.

then the build will be a debug kernel. The other way to tell is by looking in koji at the produced RPMs. If the build contains a kernel-debug-<version> package, the default kernel-<version> package is a non-debug kernel.
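
The changelog check can be scripted. The following is a minimal sketch, assuming the changelog uses the exact wording shown above ("Disable debugging options." / "Reenable debugging options."); the function name classify_changelog is hypothetical and not part of any Fedora tooling. Because RPM changelogs are listed newest first, the first matching entry reflects the state of the build.

```shell
# classify_changelog: read RPM changelog text on stdin and report whether the
# most recent debugging-related entry disables or re-enables debug options.
# Assumes the wording shown above; any other wording yields "unknown".
classify_changelog() {
    # The changelog is newest-first, so the first matching line decides.
    match=$(grep -m 1 -E '([Dd]isable|[Rr]eenable) debugging options' || true)
    case "$match" in
        *[Rr]eenable*) echo "debug" ;;
        *[Dd]isable*)  echo "non-debug" ;;
        *)             echo "unknown" ;;
    esac
}
```

For a downloaded package it could be used as `rpm -qp --changelog <kernel rpm file> | classify_changelog`.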

If you want to know whether your currently running kernel is a debug kernel, you can still use the RPM changelog method, or you can look at the matching config file in /boot. If it has CONFIG_DEBUG_SPINLOCK=y, it is a debug kernel. E.g.:

[jwboyer@vader kernel]$ grep DEBUG_SPINLOCK /boot/config-3.6.0-0.rc4.git2.1.fc18.x86_64 
CONFIG_DEBUG_SPINLOCK=y
[jwboyer@vader kernel]$ 

is debug, whereas:

[jwboyer@zod linux-2.6]$ grep DEBUG_SPINLOCK /boot/config-3.5.3-1.fc17.x86_64 
# CONFIG_DEBUG_SPINLOCK is not set
[jwboyer@zod linux-2.6]$

is not. The various DEBUG_ options can change depending on the kernel version, but DEBUG_SPINLOCK is a fairly good one to judge from. It will only be set on debug kernels.
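
The same check can be wrapped in a small helper for the running kernel. This is a sketch, assuming the config file lives at /boot/config-$(uname -r) as in the examples above; the function names are hypothetical.

```shell
# is_debug_config: succeed (exit 0) if the given kernel config file contains
# CONFIG_DEBUG_SPINLOCK=y, the marker used above to identify debug kernels.
is_debug_config() {
    grep -q '^CONFIG_DEBUG_SPINLOCK=y$' "$1"
}

# check_running_kernel: apply the check to the running kernel's config.
# Assumes the config is installed as /boot/config-<kernel release>.
check_running_kernel() {
    config="/boot/config-$(uname -r)"
    if [ ! -r "$config" ]; then
        echo "cannot read $config" >&2
        return 1
    fi
    if is_debug_config "$config"; then
        echo "debug kernel"
    else
        echo "non-debug kernel"
    fi
}
```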

Alpha ships with the default kernel being a debug kernel. At Beta we run 'make release', which creates the separate kernel-debug package in the build. That remains the case for that branch until the release reaches EOL.

If you want to use a non-debug kernel during the early Fedora Branched stages (pre-Alpha), you can try adding a special Rawhide non-debug repository; see RawhideKernelNodebug. It is a Rawhide repository, so it might not work perfectly with Fedora Branched, but there is a very high chance it will. After Fedora Branched switches to a non-debug kernel by default (pre-Beta), just turn off this repository.

What the various options do.

  • DEBUG_LIST (debug linked list insertion/deletions)
  • SPINLOCK_SLEEP (check if we're in code where we can sleep before using locks)
  • DEBUG_SHIRQ (cause an interrupt to be generated as soon as we register an IRQ)
  • DEBUG_RODATA (write protect read-only data, cause a pagefault if something tries to write to it)
  • SLUB_DEBUG (perform a number of checks on allocated objects, poison free'd objects)
  • DEBUG_HIGHMEM (Allow debugging of highmem issues on non-highmem boxes)
  • DEBUG_MUTEXES DEBUG_RT_MUTEXES DEBUG_LOCK_ALLOC PROVE_LOCKING DEBUG_SPINLOCK (lock dependency checker)
  • DEBUG_VM (Various runtime checks in the VM code)

There are also a number of other DEBUG options that just add extra printk's or extra information in /proc or /sys. These are mostly uninteresting and have little to no performance impact.

Occasionally the Fedora kernel team may enable CONFIG_DEBUG_PAGEALLOC (after an object is freed, it is unmapped from the address space, so attempts to access it cause an oops). This option causes extreme performance loss, so it is only enabled in rare cases when trying to track down bugs that have insufficient debugging data.

Release

'kernel' and 'kernel-PAE' are deemed 'performance' kernels, and hence have no debugging options enabled which impact performance. Several low impact options remain enabled, such as

  • DEBUG_LIST
  • SPINLOCK_SLEEP
  • DEBUG_SHIRQ
  • DEBUG_RODATA
  • DEBUG_VM

In addition, SLUB_DEBUG is enabled, but by default is inactive. You need to boot with slub_debug=1 to make it perform its usual checks.

'kernel-debug' and 'kernel-PAE-debug' also enable CONFIG_SLUB_DEBUG_ON, which means you don't need to boot with slub_debug=; instead it is always on. (It can be disabled with slub_debug=-.) The numerous lock dependency checker options are also enabled. Most of the cost of these options comes from the increased size of spinlock/mutex structures. If embedded into other structures, these can blow up considerably. For performance-critical structures like struct page (which normally fits in a cacheline), this can be expensive.
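
Since the boot parameter decides whether the SLUB checks are active on a release kernel (or disabled on a -debug kernel), it can be useful to see what the running system was booted with. The sketch below extracts any slub_debug parameter from a kernel command line string; the function name is hypothetical.

```shell
# slub_debug_param: print the first slub_debug parameter found in the given
# kernel command line string, or "absent" if there is none.
slub_debug_param() {
    printf '%s\n' "$1" | tr ' ' '\n' \
        | grep -m 1 -E '^slub_debug(=.*)?$' || echo "absent"
}
```

On a running system it can be called as `slub_debug_param "$(cat /proc/cmdline)"`.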

Finally, the -debug kernels enable a bunch of fault-injection test modules.

Rawhide

For the most part, Rawhide kernels are configured the same as 'kernel-debug'. The main differences are:

  • CONFIG_DEBUG_IGNORE_QUIET is enabled, which makes the 'quiet' boot parameter ineffective. This is done to ease debugging.
  • For the first alpha releases, DEBUG_PAGEALLOC is likely to be set, which is incredibly performance taxing. It's also known to cause problems on some virtual machines with buggy pagefault handlers.
  • Later alpha/beta releases disable PAGEALLOC in favour of relying on SLUB_DEBUG to catch similar bugs. If obscure hard-to-debug issues occur later in the development cycle, PAGEALLOC may be re-enabled temporarily.