Features/CrashHandling

Revision as of 16:58, 1 December 2008 by Wwoods (Talk | contribs)


Handling program crashes in Fedora

Summary

As of about Fedora 6, packages no longer include the "debuginfo" data necessary for local crash handlers to get a useful stack trace. See: http://fedoraproject.org/wiki/Packaging/Debuginfo and http://fedoraproject.org/wiki/StackTraces

We want a system that delivers crash information to developers together with a complete stack trace.

The plan has two major parts: a client and a server.

Client

crash-handler

A program to catch crashing programs and write out a crash report / stack trace.

  • Catching the crash is trivial using the kernel's core pattern piping support, e.g.:
    • echo '|/usr/sbin/crash-handler --pid %p --rlimit %c' > /proc/sys/kernel/core_pattern
  • Write crashes to a (configurable) standard location, such as /var/crash
  • This crash handler should be able to produce Breakpad minidumps
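
As a rough illustration of the handler side of the core_pattern line above, the sketch below parses the --pid and --rlimit options and copies the core image that the kernel pipes to standard input into the crash directory. It is only a sketch under stated assumptions: the --crash-dir option, the core.<pid>.<timestamp> file name, and the choice to skip the dump when the limit is 0 are all hypothetical, and it stores the raw core rather than producing a Breakpad minidump.

```python
import argparse
import os
import sys
import time


def parse_args(argv):
    """Parse the options passed on the core_pattern command line."""
    parser = argparse.ArgumentParser(prog="crash-handler")
    parser.add_argument("--pid", type=int, required=True)     # %p: PID of the crashed process
    parser.add_argument("--rlimit", type=int, required=True)  # %c: core size soft limit
    parser.add_argument("--crash-dir", default="/var/crash")  # hypothetical option
    return parser.parse_args(argv)


def save_core(stream, crash_dir, pid, rlimit, chunk_size=1 << 20):
    """Copy the core image from stream into crash_dir.

    Returns the path written, or None when rlimit is 0 (an assumed policy:
    treat a zero core limit as "this user does not want dumps").
    """
    if rlimit == 0:
        return None
    os.makedirs(crash_dir, exist_ok=True)
    path = os.path.join(crash_dir, "core.%d.%d" % (pid, int(time.time())))
    with open(path, "wb") as out:
        while True:
            chunk = stream.read(chunk_size)
            if not chunk:
                break
            out.write(chunk)
    return path


def main(argv=None):
    """Entry point: the kernel supplies the core image on standard input."""
    args = parse_args(sys.argv[1:] if argv is None else argv)
    save_core(sys.stdin.buffer, args.crash_dir, args.pid, args.rlimit)
```

An installed handler would call main() when started by the kernel. Reading in fixed-size chunks keeps memory use bounded even for very large core images.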

crash-watcher

A small daemon to:

  • watch the crash location for new dumps
  • clean up old/unneeded dumps, based on user preferences (maximum age/disk space/etc.)
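
The cleanup step could look roughly like the sketch below, which first drops dumps older than a maximum age and then drops the oldest remaining dumps until the total size fits a disk budget. The function names, the (path, mtime, size) tuples, and the two limits are illustrative assumptions, not part of any existing tool.

```python
import os


def scan_crash_dir(crash_dir):
    """Return (path, mtime, size) for every regular file in crash_dir."""
    entries = []
    for entry in os.scandir(crash_dir):
        if entry.is_file():
            st = entry.stat()
            entries.append((entry.path, st.st_mtime, st.st_size))
    return entries


def select_dumps_to_remove(dumps, now, max_age_seconds, max_total_bytes):
    """Pick the dumps to delete, given (path, mtime, size) tuples.

    Dumps older than max_age_seconds are always selected. If the rest
    still exceed max_total_bytes, the oldest are selected until they fit.
    """
    remove = []
    kept = []
    for path, mtime, size in dumps:
        if now - mtime > max_age_seconds:
            remove.append(path)
        else:
            kept.append((path, mtime, size))

    kept.sort(key=lambda dump: dump[1])  # oldest first
    total = sum(size for _, _, size in kept)
    for path, _, size in kept:
        if total <= max_total_bytes:
            break
        remove.append(path)
        total -= size
    return remove
```

Keeping the selection separate from the directory scan means the policy can be checked without touching the disk; the daemon would pass the result of scan_crash_dir() and then call os.remove() on each returned path.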

When a new dump is found, send notifications to the user allowing them to:

  • Send a report (iff the binary was provided by Fedora)
    • Optional "Always send report automatically" checkbox
  • Ignore further crashes of that program
  • Ignore all further crashes
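
One way to read the choices above as a decision procedure is sketched below. The Preferences fields and the four return values are invented for illustration; how the watcher learns whether a binary was provided by Fedora is left out and passed in as a flag.

```python
from dataclasses import dataclass, field


@dataclass
class Preferences:
    """User choices remembered from earlier notifications (hypothetical)."""
    always_send: bool = False                            # "Always send report automatically"
    ignore_all: bool = False                             # "Ignore all further crashes"
    ignored_programs: set = field(default_factory=set)   # "Ignore further crashes of that program"


def handle_new_dump(program, provided_by_fedora, prefs):
    """Decide what the watcher does with a new dump.

    Returns one of:
      "ignore": stay silent
      "send":   submit without asking
      "ask":    notify and offer to send a report
      "notify": notify only, because the binary is not from Fedora
    """
    if prefs.ignore_all or program in prefs.ignored_programs:
        return "ignore"
    if provided_by_fedora and prefs.always_send:
        return "send"
    return "ask" if provided_by_fedora else "notify"
```

In this reading the ignore settings take priority over automatic sending, so a user who has silenced a program never has reports for it submitted.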

crash-submitter

Sends minidumps to the server to be retraced. bug-buddy might work for this.

  • Submit report to Socorro server (or similar)
    • Configured to use Fedora server by default, but allow user to set their own server
      • Future work: allow per-package overrides (so GNOME dumps go to GNOME, etc)
  • Save UUID for that report somewhere, as with kerneloops
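
The server selection and UUID bookkeeping described above might be sketched as follows. Everything named here is an assumption: the default URL is a placeholder and not a real Fedora endpoint, the override table is a plain dictionary, and the log is a simple "package id" text file. The upload itself is omitted because the server's submission interface is not specified on this page.

```python
import os

# Placeholder only; the real default would point at the Fedora server.
DEFAULT_SERVER = "https://crash-reports.example.org/submit"


def resolve_server(package, user_server=None, package_overrides=None):
    """Pick the submission URL for a package.

    Order of preference: per-package override (the "future work" item),
    then the user's own server, then the built-in default.
    """
    overrides = package_overrides or {}
    if package in overrides:
        return overrides[package]
    return user_server or DEFAULT_SERVER


def record_report_id(log_path, package, report_id):
    """Append the ID returned by the server to a local log file."""
    os.makedirs(os.path.dirname(log_path) or ".", exist_ok=True)
    with open(log_path, "a", encoding="utf-8") as log:
        log.write("%s %s\n" % (package, report_id))
```

Keeping the IDs in an append-only file lets the user, or a later tool, look up earlier reports on the server.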

Server

  • Get a Socorro server running in Fedora's infrastructure
  • Point the default Breakpad configuration to it (easy)

Extra

  • Run a separate kerneloops server?

Open questions

  • Do symbol resolution on the client or the server?
  • How to do symbol resolution? FUSE? littlebottom?
  • Tie it to smolt profiles?

Comments

  • Why not use breakpad?
    • We don't want LD_PRELOAD everywhere.

Owner

  • Name: [none currently]

Current status

  • Targeted release:
  • Last modified: 9 June 2008
  • Percent complete: 0%

Usage cases / rationale

By providing an automated mechanism for tracking application crashes, we will be able to:

  • see bugs earlier, and fix them earlier
  • see what bugs are hit most
  • get usage and crash data from people who are unable or unwilling to interact with bugzilla

Benefit to Fedora

Better crash data, which leads to more crash fixes, which leads to a higher-quality distribution.

Scope

Infrastructure:

  • Requires running a new server in the Fedora infrastructure.

Code:

  • Requires a new crash handling agent
  • Requires packaging the Socorro server

Testing

Cause a program to crash and get a report submitted to Socorro. Test that Socorro correctly retraces it and gets enough information for a developer to identify the problem.
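
For the first step, a test crash can be produced on demand. The sketch below is one assumed way to do it: it starts a child Python process that sends itself SIGSEGV and checks that the child really died from that signal. It covers only the "cause a crash" part; checking that a dump appears and is retraced would depend on the handler and server that end up being deployed.

```python
import signal
import subprocess
import sys


def crash_child():
    """Run a child process that kills itself with SIGSEGV; return its exit status."""
    code = "import os, signal; os.kill(os.getpid(), signal.SIGSEGV)"
    return subprocess.run([sys.executable, "-c", code]).returncode


def died_from_segv(returncode):
    """On POSIX, subprocess reports death by signal N as a return code of -N."""
    return returncode == -signal.SIGSEGV
```

Crashing a throwaway child rather than a real application keeps the test repeatable and avoids losing user data.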

Dependencies

  1. Need to package the Socorro server

Details

Optional

User Experience

A program crashes. We display a dialog or notification that the program has crashed and save a useful stack trace to a well-known location.

Contingency plan

  1. Don't enable the agent
  2. Don't ship the agent
  3. Reinvestigate other options such as Apport.

Documentation

Some simple documentation on how to enable and disable the crash reporting, and how to make it happen automatically.

Release Notes

We will want to explain to developers of Free programs how to find crash dumps.
