Easier Python Debugging
Summary
Owner
- Name: David Malcolm
- Email: <dmalcolm@redhat.com>
Current status
- Targeted release: Fedora 41
- Last updated: (DATE)
- Percentage of completion: XX%
Detailed Description
We ship Python wrappers for numerous libraries implemented in C and C++. Bugs in those libraries, or in the way they are used, can lead to complicated backtraces from gdb, and it can be hard to figure out what's going on at the Python level.
For example, see this complex backtrace (relating to bug 536786).
Walking through the stack frames, going up from the bottom (textually), or down from the top (numerically):
- frames 26 and below show a pygtk application starting up.
- An event comes in frame 24/25, and is dispatched into pulsecore (frames 23->18; pstream_packet_callback, pa_context_simple_ack_callback) which:
- calls a Python callback (down to frame 15),
- ...which invokes Python code down to frame 3
- ...where it calls back into native code; whereupon the segfault happens, calling Py_DecRef on some object pointer.
Current state-of-the-art for debugging CPython backtraces
Python already has a gdbinit file with plenty of domain-specific hooks for debugging CPython, and we ship it in our python-devel subpackage. If you copy this to ~/.gdbinit you can then use "pyframe" and other commands to debug things, and figure out where we are in Python code from gdb. I used it when deciphering the example backtrace referred to above.
Unfortunately:
- this script isn't very robust; if the data in the "inferior" process is corrupt, attempting to print it can lead to a SIGSEGV within that process
- you have to go into gdb manually and run these commands by hand
- the script is written in the gdb language and is thus hard to work with and extend
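The robustness problem in the first point can be illustrated with a small model. This is only a sketch: FakeInferior, UnreadableMemory and describe_string are hypothetical names invented for the illustration, not part of gdb or CPython. The idea is that a debugger-side script reads the inferior's memory itself rather than running code inside the inferior, so a corrupt pointer becomes an exception in the script (gdb's Python API reports unreadable memory this way, as far as I know) instead of a SIGSEGV in the process being debugged.

```python
class UnreadableMemory(Exception):
    """Stand-in for the error a debugger raises on a bad address."""


class FakeInferior:
    """Hypothetical model of a debugged process: a dict of address -> bytes."""

    def __init__(self, memory):
        self._memory = memory

    def read_c_string(self, address):
        # The read happens on the debugger side; the inferior runs no code.
        if address not in self._memory:
            raise UnreadableMemory(hex(address))
        return self._memory[address].decode("utf-8", errors="replace")


def describe_string(inferior, address):
    """Return a printable description, never letting a bad pointer escape."""
    try:
        return repr(inferior.read_c_string(address))
    except UnreadableMemory:
        return "<unreadable memory at 0x%x>" % address


inferior = FakeInferior({0x1000: b"gtk.main"})
print(describe_string(inferior, 0x1000))      # valid pointer
print(describe_string(inferior, 0xDEADBEEF))  # corrupt pointer
```

The second call prints a placeholder rather than raising, which is the behaviour a backtrace generator needs when it runs unattended.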
Proposal
gdb should provide rich information on what's going on at the Python level automatically. I plan to hook this in using gdb-archer, and make it automatic:
- Biggest win: automatically display Python frame information for PyEval_EvalFrameEx frames in gdb backtraces, including those generated by ABRT:
- Python source file, line number, and function names
- values of locals, if available
- name of function for wrapped C functions
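To make the list above concrete, here is a minimal sketch of the per-frame information involved. It uses in-process introspection purely for illustration; the gdb hook would instead have to read the equivalent fields out of the frame and code objects in the inferior's memory. The helper names describe_frame and example are made up for this sketch.

```python
import sys


def describe_frame(frame):
    """Summarise one Python frame the way a backtrace line might."""
    code = frame.f_code
    return {
        "file": code.co_filename,         # Python source file
        "line": frame.f_lineno,           # current line number
        "function": code.co_name,         # function name
        "locals": dict(frame.f_locals),   # values of locals
    }


def example(widget_name):
    count = 3
    # Describe the frame of this very function call.
    return describe_frame(sys._getframe())


info = example("button")
print("%(function)s at %(file)s:%(line)d" % info)
print(info["locals"])
```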
See Alex's work: http://blogs.gnome.org/alexl/2008/11/18/gdb-is-dead-long-live-gdb/
and more recently: http://blogs.gnome.org/alexl/2009/09/21/archer-gdb-macros-for-glib/
I'd want to have the Python backtrace work integrated with the glib backtrace work: pygtk regularly shows me backtraces containing a mixture of both.
Alex's work is in glib git: http://git.gnome.org/browse/glib/commit/?id=efe9169234e226f594b4254618f35a139338c35f which does a:
gdb.backtrace.push_frame_filter (GFrameFilter)
See http://tromey.com/blog/?p=522 for info on this.
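The frame filter idea can be modelled in plain Python. The sketch below does not use the gdb API at all: NativeFrame, python_frame_filter and the sample stack (including the on_ack/applet.py details) are invented for illustration. It only shows the shape of the approach, in which a filter walks the native frames and adds Python-level detail to those that are executing bytecode.

```python
from collections import namedtuple

# Hypothetical, simplified stand-in for a native frame as a debugger sees it.
NativeFrame = namedtuple("NativeFrame", "level function python_info")


def python_frame_filter(frames):
    """Yield display lines, annotating interpreter frames with Python detail."""
    for frame in frames:
        yield "#%d %s" % (frame.level, frame.function)
        if frame.function == "PyEval_EvalFrameEx" and frame.python_info:
            yield "    (python) %s" % frame.python_info


# Made-up sample stack mixing native, interpreter, and glib frames.
stack = [
    NativeFrame(0, "Py_DecRef", None),
    NativeFrame(1, "PyEval_EvalFrameEx", "on_ack() at applet.py:42"),
    NativeFrame(2, "g_main_loop_run", None),
]
for line in python_frame_filter(stack):
    print(line)
```

Because the filter is a generator over frames, a second filter (for example one for glib) could wrap the same sequence, which is how the two pieces of work might be combined.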
This needs a more recent version of gdb than the one in F-12; I'll need to build a local copy of the "archer-tromey-python" branch of gdb to work on this.
Archer upstream: http://sourceware.org/gdb/wiki/ProjectArcher
Currently I'm stuck on this issue: http://sourceware.org/ml/archer/2009-q4/msg00129.html
Benefit to Fedora
Backtraces from gdb (such as those from ABRT) that involve Python code will show what's going on at the Python level, as well as at the C level. This will make it much easier for developers to read backtraces when a library wrapped by Python (e.g. PyGTK) encounters a bug.
For Python developers, it should be possible to attach to a running Python process using gdb, then run "thread apply all backtrace" to get an overview of all C and Python code running in all threads within that process. I believe this ability would be unique to Fedora, and would be valuable for Python developers seeking additional visibility into their CPython processes.
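The attach-and-backtrace workflow can also be scripted. The following is a sketch, assuming gdb is on the PATH and the caller has permission to attach to the target; the function names and the example PID are made up. It uses gdb's -p, -batch and -ex options to run the command non-interactively.

```python
import subprocess


def backtrace_command(pid):
    """Build the gdb command line that dumps every thread's backtrace."""
    return ["gdb", "-p", str(pid), "-batch", "-ex", "thread apply all backtrace"]


def dump_all_threads(pid):
    """Run gdb against a live process and return its output as text."""
    result = subprocess.run(backtrace_command(pid), capture_output=True, text=True)
    return result.stdout


# Only print the command here; 12345 is a placeholder PID.
print(" ".join(backtrace_command(12345)))
```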
Scope
How To Test
Ideas for test cases/coverage:
- try attaching to a running (multithreaded) Python process and ensure that "thread apply all backtrace" generates meaningful results
- ensure it plays well with Alex's GLib/GTK work; debug a multithreaded pygtk app
- ensure it fails gracefully if python-debuginfo isn't installed
- ensure that it fails gracefully if the inferior process has corrupted data (e.g. overwrites on the heap)
- ensure that it fails gracefully if the inferior process has a corrupted stack
- ensure that it works well under ABRT