User:Kklic/Retrace Server

= Retrace Server =

Summary
The retrace server allows ABRT users to get better backtraces from their crashes by retracing coredumps remotely, on a server owned by the Fedora Project.

Owner

 * Name: Michal Toman
 * Email: mtoman@redhat.com

Co-workers

 * The ABRT team
 * E-mail: crash-catcher@lists.fedoraproject.org
 * Name: Karel Klic
 * Name: Jiri Moskovcak

Current Status

 * Targeted release: Fedora 15
 * Last updated: 2011-01-12
 * Percentage of completion: 80%

Detailed description
When ABRT generates a backtrace from a coredump, it needs debuginfo data to be available for the binary and all libraries involved in the crash. Debuginfo packages require a lot of storage space, and sometimes they are not available at all, because a package update causes the removal of the older update of the same package and its debuginfo.

Another problem is that GDB (which generates the backtrace from the coredump) needs data from the binary and libraries that were involved in the crash. If the user updated a relevant package between observing the crash and reporting it, they might be unable to generate a good-quality backtrace. This happens often, because Fedora is updated frequently.

A retrace server is one way to solve these issues. ABRT offers the user the option of uploading their coredump to a remote server, and the retracing step happens there. The server creates an environment identical to the one on the user's computer at the time of the crash by installing all the required packages and their debuginfo. The retrace server is able to do that because it keeps all the older packages from updates, and the relevant part of updates-testing, locally on the server.

Once the backtrace has been created, only the submitter is allowed to download or view it.

Benefit to Fedora
Users:
 * 1) Less disk space and processing time needed to use ABRT to report crashes.
 * 2) Possibility to report older crashes.
 * 3) Lower chance of failure, less time spent on crashes which cannot be retraced because debuginfo is no longer available.

Developers:
 * 1) Higher quality of ABRT reports.
 * 2) Possibility to quickly get a backtrace from any/random Fedora coredump.

Scope
ABRT is extended to support Retrace Server.

Retrace server is installed on https://retrace01.fedoraproject.org.

The server implementation consists of three parts:
 * HTTP Interface: Receives the archive with a coredump from the user, unpacks all files, and puts a new task into the queue.
 * Analyzer: Takes a task from the queue, creates a virtual root with all required packages and debuginfos installed, and runs GDB to create the backtrace.
 * Repository Synchronizer: Downloads packages to a local repository containing all versions of all packages (no removal of older updates).
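
The hand-off between the HTTP Interface and the Analyzer can be pictured with a small in-memory sketch. This is not the server's actual code: the `Task` class, the function names and the `retrace` callable (standing in for "build a virtual root and run GDB") are all hypothetical, and the Repository Synchronizer is left out entirely. Only the status values are taken from this page.

```python
import queue
from dataclasses import dataclass, field


@dataclass
class Task:
    """One retrace job (hypothetical structure, for illustration only)."""
    task_id: int
    crash_dir: str
    status: str = "PENDING"
    backtrace: str = ""
    log: list = field(default_factory=list)


# Shared queue between the two parts; the real server may use something else.
task_queue = queue.Queue()


def http_interface_accept(task_id, unpacked_dir, tasks=task_queue):
    """HTTP Interface side: register an already unpacked upload as a new task."""
    task = Task(task_id, unpacked_dir)
    tasks.put(task)
    return task


def analyzer_step(retrace, tasks=task_queue):
    """Analyzer side: take one task and run the supplied retrace callable.

    `retrace` is an assumed stand-in for creating the virtual root and
    running GDB; it receives the crash directory and returns a backtrace.
    """
    task = tasks.get()
    try:
        task.backtrace = retrace(task.crash_dir)
        task.status = "FINISHED_SUCCESS"
    except Exception as exc:
        # Keep the reason so it can be offered to the submitter as a log.
        task.log.append(str(exc))
        task.status = "FINISHED_FAILURE"
    finally:
        tasks.task_done()
    return task
```

Decoupling the two sides through a queue means uploads can be accepted quickly while the slow retracing work proceeds at its own pace.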

How To Test
Testing will be possible using a new version of ABRT with retrace server support, which will land in Rawhide when stabilised.

At the moment, testing is possible using the upload.py script from the retrace branch of the ABRT Git repository. The script is designed only for testing purposes and does not handle all possible errors.

Usage:

The script upload.py takes two arguments:

First (mandatory) - ABRT crash directory, by default found in the /var/spool/abrt/ directory.

Second (optional) - Retrace server address (not important at the moment, there is only one testing machine running retrace server - retrace01.fedoraproject.org).

Running upload.py with any other number of arguments displays a short help message.
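
The argument handling described above could look roughly like the following sketch. It is not taken from upload.py: the function names, the usage text and the choice of retrace01.fedoraproject.org as the default server are assumptions made for illustration, and the upload itself is not performed.

```python
# Assumed default: the only testing machine mentioned on this page.
DEFAULT_SERVER = "retrace01.fedoraproject.org"
# Hypothetical help text; the real script's message may differ.
USAGE = "Usage: upload.py CRASH_DIR [RETRACE_SERVER]"


def parse_args(argv):
    """Return (crash_dir, server), or None if the argument count is wrong."""
    args = argv[1:]  # drop the script name
    if len(args) not in (1, 2):
        return None
    crash_dir = args[0]
    server = args[1] if len(args) == 2 else DEFAULT_SERVER
    return crash_dir, server


def main(argv):
    """Print the help text on bad usage; otherwise report what would be sent."""
    parsed = parse_args(argv)
    if parsed is None:
        print(USAGE)
        return 1
    crash_dir, server = parsed
    print("would upload %s to %s" % (crash_dir, server))
    return 0
```

Calling `main(sys.argv)` from a script entry point would reproduce the behaviour described: one or two arguments are accepted, anything else prints the help text.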

The retrace01.fedoraproject.org machine should handle Fedora 13 x86_64, Fedora 14 i686 and Fedora 14 x86_64 crashes.

The script's output includes the raw HTTP response containing the X-Task-Id and X-Task-Password headers. You may ask the retrace server about three things using these headers:
 * Status - The HTTP response contains an X-Task-Status header with one of three values: 'PENDING', 'FINISHED_SUCCESS', 'FINISHED_FAILURE'. wget may be used to show the output (the HTTP response body is the same as the X-Task-Status header's value):
   wget -S --no-check-certificate -O /dev/null --header="X-Task-Password: " "https://retrace01.fedoraproject.org//"
 * Log - After the retrace is finished (FINISHED_SUCCESS or FINISHED_FAILURE), the log is available. wget may be used to download the log:
   wget -S --no-check-certificate --header="X-Task-Password: " "https://retrace01.fedoraproject.org//log"
 * Backtrace - After a successful retrace (FINISHED_SUCCESS), the backtrace is available. wget may be used to download the backtrace:
   wget -S --no-check-certificate --header="X-Task-Password: " "https://retrace01.fedoraproject.org//backtrace"

Each Task-Password is for single use. After every status, log or backtrace request, the response contains an X-Task-Password header with a new Task-Password, which is the one to send with the next request.
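
Because the password changes after every request, a client has to carry the new value forward each time. The sketch below shows one way to do that in Python. The `RetraceTask` class and its method names are hypothetical, and the URL layout (task id as the first path component, followed by an empty path, `log` or `backtrace`) is an assumption read off the wget examples, not a documented API. Unlike those examples, the sketch does not disable certificate checking.

```python
import urllib.request

BASE_URL = "https://retrace01.fedoraproject.org"
STATUSES = ("PENDING", "FINISHED_SUCCESS", "FINISHED_FAILURE")


class RetraceTask:
    """Remembers the task id and the current single-use password."""

    def __init__(self, task_id, password, base_url=BASE_URL):
        self.task_id = str(task_id)
        self.password = password
        self.base_url = base_url.rstrip("/")

    def url(self, resource=""):
        # resource: "" for status, "log" or "backtrace" (assumed layout)
        return "%s/%s/%s" % (self.base_url, self.task_id, resource)

    def absorb_headers(self, headers):
        """Store the fresh password from a response; return the status, if any."""
        new_password = headers.get("X-Task-Password")
        if new_password is None:
            raise ValueError("response carried no new X-Task-Password")
        self.password = new_password
        status = headers.get("X-Task-Status")
        if status is not None and status not in STATUSES:
            raise ValueError("unexpected status: %r" % status)
        return status

    def request(self, resource=""):
        """Send one request (needs network access) and rotate the password."""
        req = urllib.request.Request(
            self.url(resource),
            headers={"X-Task-Password": self.password},
        )
        with urllib.request.urlopen(req) as resp:
            body = resp.read()
            status = self.absorb_headers(resp.headers)
        return status, body
```

A caller would create `RetraceTask` from the X-Task-Id and X-Task-Password values printed by upload.py, call `request()` until the status is no longer PENDING, and then call `request("log")` or `request("backtrace")`.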

User Experience
The ABRT reporting wizard offers the option of using the retrace server instead of local retracing to generate a backtrace. Local retracing is still the default action.

Dependencies

 * new version of ABRT, supporting the retrace server option
 * a server with the retrace server application up and running

Server code dependencies:
 * mock
 * xz
 * mod_wsgi
 * python-webob

Contingency Plan
Hide the possibility of using the retrace server from users in ABRT's graphical user interface.

Documentation

 * Design document
 * Retrace server wiki page on ABRT wiki

Release Notes
ABRT, a crash reporting tool in Fedora, now allows a part of crash processing to be performed remotely, on a server owned by the Fedora Project. Remote coredump retracing leads to better-quality reports. The retrace server can generate good backtraces with a much higher success rate than local retracing.

Comments and Discussion

 * See Talk:Features/RetraceServer