How to use kdump to debug kernel crashes

From FedoraProject

Revision as of 15:37, 15 February 2012

Kernel and kdump

Kdump is a kernel crash dumping mechanism. It is very reliable because the crash dump is captured from the context of a freshly booted kernel and not from the context of the crashed kernel. Kdump uses kexec to boot into a second kernel whenever the system crashes. This second kernel, often called the crash kernel or capture kernel, boots with very little memory and captures the dump image.

The first kernel reserves a section of memory that the second kernel uses to boot. Kexec enables booting the capture kernel without going through the BIOS, so the contents of the first kernel's memory are preserved. That preserved memory is essentially the kernel crash dump.

How to Use Kdump

Step 1: Configuring Kdump

  1. First, install the kexec-tools, crash and kernel-debuginfo packages. Use the following command line to install the packages:
    yum install --enablerepo=fedora-debuginfo --enablerepo=updates-debuginfo kexec-tools crash kernel-debuginfo
    NOTE: The crash and kernel-debuginfo packages are only required if you plan to examine the resulting kernel vmcore yourself. This is usually the case; however, if you are setting up kdump on a machine simply to capture a vmcore that will be analyzed by someone else or on a different machine, you can skip those packages.
  2. Next, edit /boot/grub/grub.conf or /boot/grub2/grub.cfg and add the "crashkernel=128M" command line option. An example command line might look like this (for grub2, "kernel" is replaced by "linux"):
    kernel /vmlinuz-3.1.4-1.fc16.x86_64 ro root=/dev/VolGroup00/LogVol00 rhgb LANG=en_US.UTF-8 crashkernel=128M
  3. Next, consider editing the kdump configuration file /etc/kdump.conf. This will allow you to write the dump over the network or to some other location on the local system, rather than to the default location of /var/crash. For additional information, consult the mkdumprd man page and the comments in /etc/kdump.conf.
  4. Next, reboot your system.
  5. Finally, activate the kdump system service:
    systemctl start kdump.service
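
After the reboot it can be useful to confirm that the crashkernel option from step 2 actually reached the kernel. The following is a minimal sketch, not part of kdump itself: crashkernel_arg is a hypothetical helper name, and it assumes the option appears as a single space-separated word on the kernel command line.

```shell
# Print the value of the crashkernel= option from a kernel command line.
# With no argument, the running kernel's /proc/cmdline is read.
# crashkernel_arg is a hypothetical helper name used only for this sketch.
crashkernel_arg() {
    cmdline=${1:-$(cat /proc/cmdline)}
    # Put each option on its own line, then keep the first crashkernel= value.
    printf '%s\n' "$cmdline" | tr ' ' '\n' | sed -n 's/^crashkernel=//p' | head -n 1
}
```

With the example command line from step 2, running crashkernel_arg with no arguments should print 128M. Once the kdump service has started, /sys/kernel/kexec_crash_loaded should contain 1, which indicates that a capture kernel has been loaded.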

Considerations:

  1. The crashkernel=128M parameter shown above reserves 128 MB of physical memory. This reserved memory is used to preload and run the capture kernel.
  2. Init scripts take care of pre-loading the capture kernel at system boot time.
  3. It is recommended to either set up a serial console or switch to run level 3 (init 3) for testing purposes, because kdump does not reset the console if you are in X or framebuffer mode, and no message might be visible on the console after a system crash. You may also see screen corruption in graphics mode during capture.
  4. Capturing a crash dump can take a long time, especially if the system has a lot of memory. Be patient. The system will reboot after the dump is captured.
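
The reservation described in the first consideration is normally visible as a "Crash kernel" entry in /proc/iomem. The sketch below converts each such entry into a size in MiB. crash_region_mib is a hypothetical helper name, and the function assumes the usual "start-end : name" layout with hexadecimal addresses. Read the file as root, since unprivileged reads may show zeroed addresses.

```shell
# Print the size in MiB of every "Crash kernel" region in an iomem-style file.
# The optional argument defaults to /proc/iomem.
# crash_region_mib is a hypothetical helper name used only for this sketch.
crash_region_mib() {
    iomem=${1:-/proc/iomem}
    grep 'Crash kernel' "$iomem" | while read -r range _rest; do
        start=${range%-*}   # hexadecimal start address
        end=${range#*-}     # hexadecimal end address (inclusive)
        echo $(( (0x$end - 0x$start + 1) / 1048576 ))
    done
}
```

For a region listed as 03000000-0affffff the function prints 128, which matches a crashkernel=128M reservation.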

Step 2: Capturing the Dump

Normally a kernel panic() triggers booting into the capture kernel, but for testing purposes one can simulate the trigger in one of the following ways.

  1. Trigger through the /proc interface:
    echo c > /proc/sysrq-trigger
  2. Trigger by inserting a module which calls panic().

The system will boot into the capture kernel. A kernel dump will be automatically saved in /var/crash/<dumpdir> and the system will boot back into the regular kernel. The name of the dump directory depends on the date and time of the crash. For example, the dump may be saved as /var/crash/2006-02-17-17:02/vmcore.
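
Triggering a test crash while no capture kernel is loaded only hangs or reboots the machine without producing a dump, so it may be worth checking first. The following is a minimal sketch of such a guard. capture_kernel_loaded is a hypothetical helper name, and the optional path argument exists only so the check can be exercised against an ordinary file.

```shell
# Succeed only if a capture kernel is currently loaded.
# By default the kernel's sysfs flag /sys/kernel/kexec_crash_loaded is read;
# it contains 1 when a capture kernel is loaded and 0 otherwise.
# capture_kernel_loaded is a hypothetical helper name used only for this sketch.
capture_kernel_loaded() {
    flag_file=${1:-/sys/kernel/kexec_crash_loaded}
    [ -r "$flag_file" ] && [ "$(cat "$flag_file")" = "1" ]
}
```

As root, the /proc trigger can then be guarded by running capture_kernel_loaded first and writing c to /proc/sysrq-trigger only if it succeeds.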

Step 3: Dump Analysis

Once the system has rebooted after capturing the crash dump, you may wish to analyse the kernel dump file using the crash tool.

  1. First, locate the recent vmcore dump file:
    find /var/crash -type f -mtime -1
  2. Once you have located a vmcore dump file, call crash:
    crash /var/crash/2009-07-17-10\:36/vmcore /usr/lib/debug/lib/modules/`uname -r`/vmlinux
Note: Missing debuginfo?
Cannot find any files under /usr/lib/debug? Make sure you have the kernel-debuginfo package installed.
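
The two steps above can be combined so that the most recent dump is selected automatically. This is only a sketch: newest_vmcore is a hypothetical helper name, and it assumes dumps are stored as files named vmcore below the crash directory, as in the examples on this page.

```shell
# Print the path of the most recently modified vmcore file.
# The optional argument is the crash directory (default: /var/crash).
# newest_vmcore is a hypothetical helper name used only for this sketch.
newest_vmcore() {
    crash_dir=${1:-/var/crash}
    # ls -t sorts the files found by modification time, newest first.
    find "$crash_dir" -type f -name vmcore -exec ls -t {} + 2>/dev/null | head -n 1
}
```

The result can be passed to crash in place of the explicit path, for example as "$(newest_vmcore)" followed by the vmlinux path from step 2. This assumes that the running kernel reported by uname -r is the same version as the kernel that crashed. Otherwise, point crash at the vmlinux that matches the crashed kernel.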

For more information on using the crash tool, see the More Documentation section.

More Documentation