From Fedora Project Wiki

Revision as of 22:04, 25 March 2010 by Dmalcolm (talk | contribs) (→‎Current status: add link to bug 556975 to status)

Easier Python Debugging

Summary

The gdb debugger has been extended so that it can report detailed information on the internals of the Python 2 and Python 3 runtimes. Backtraces involving Python will now by default show mixed C and Python-level information on what such processes are doing, without requiring expertise in the use of gdb.

We believe this ability is unique to Fedora, and will be valuable for Python developers seeking additional visibility into their CPython processes.

Owner

  • Email: <dmalcolm@redhat.com>

Current status

  • Targeted release: Fedora 13
  • Last updated: 2010-03-25
  • Percentage of completion: 95%

DONE:

TODO:

This adds:

  • Auto-loading of the hooks, embedded in the relevant -debuginfo subpackages
  • New-style classes/python 3 classes
  • set/frozenset

Unfortunately, getting at frame information is being affected by this bug, which greatly diminishes the effectiveness of the hooks.

Notes

I was stuck on this issue when getting at "PyFrameObject *f" from the current frame, but in my current implementation I've sidestepped this by simply writing a pretty-printer for PyFrameObject* which gdb successfully invokes during a backtrace.

(I was also held up for a while by this now-fixed GCC issue.)

See also this bug: https://bugzilla.redhat.com/show_bug.cgi?id=552654

The first attempt at auto-loading for python 2 was built in rawhide as python-2.6.4-14.fc13, but the core python rpm has gained a dep on "/builddir/build/BUILDROOT/python-2.6.4-14.fc13.i386/usr/lib/python2.6" (see http://koji.fedoraproject.org/koji/buildinfo?buildID=154675 and http://koji.fedoraproject.org/koji/rpminfo?rpmID=1803012 ).


Detailed Description

We ship Python wrappers for numerous libraries implemented in C and C++. Bugs (either in the libraries themselves, or in the usage of those libraries) can lead to complicated backtraces from gdb, and it can be hard to figure out what's going on at the python level.

For example, this complex backtrace (relating to bug 536786) shows a segmentation fault somewhere inside a complicated call stack involving Python and other libraries, and it's not at all clear what's going on.

Walking through the stack frames, going up from the bottom (textually), or down from the top (numerically):

  • frames 26 and below show a pygtk application starting up.
  • An event comes in frame 24/25, and is dispatched into pulsecore (frames 23->18; pstream_packet_callback, pa_context_simple_ack_callback) which:
  • calls a Python callback (down to frame 15),
  • ...which invokes python code down to frame 3.
  • ...where it calls back into native code; whereupon the segfault happens, calling Py_DecRef on some object pointer.

Note that as it stands, all we see from the backtrace is that python code was run: we have no way as-is of telling what that python code was.

In the above example, it happens that there is a bug in the application's Python code, which is sufficiently serious to cause a SIGSEGV error. This example uses the ctypes module, which is designed to expose machine-level details. It's fairly easy to write a one-liner of python code using this module which causes the python process to immediately fail with either a SIGSEGV or SIGABRT.

When using "native" C/C++ libraries, it's sadly common for bugs in the library to lead to SIGSEGV errors that immediately cause the whole python process to terminate. Beyond that, poorly-designed error-handling in such libraries uses assert() or abort() at the C level, which immediately terminates the entire process. It's useful to be able to determine what was "really" going on when this happens.

A trickier problem is when a threading assertion fails: many libraries make assumptions about threads and locks, and allow the programmer to register callbacks, but impose conditions upon the kind of code run in those callbacks. When the threads and callback-registration hooks are wrapped at the python level, these conditions continue to be required at the Python level, but mistakes here often lead to low-level error-handling that's difficult to debug.

For example, the GTK widget library requires that all communication with the X server happen within a GDK lock, to avoid garbling the single "conversation" between the process and the X server. The common way to implement this in a multi-threaded application is to restrict all calls to GTK to a single "primary" thread. See attachment 379251 to bug 543278 for an example of where a secondary thread in an application violates this, which leads to a low-level gdk_x_error() failure in the main thread: frames 16 to 28 of this backtrace are running Python code, but it's not at all clear from the backtrace _what_ said code is actually doing.

Current state-of-the-art for debugging CPython backtraces

Python already has a gdbinit file with plenty of domain-specific hooks for debugging CPython, and we ship it in our python-devel subpackage. If you copy this to ~/.gdbinit you can then use "pyframe" and other commands to debug things, and figure out where we are in Python code from gdb. I used it when deciphering the example backtraces referred to above.

Unfortunately:

  • this script isn't very robust - it effectively injects "print to stderr" calls into the process being debugged. If the data in the "inferior" process is corrupt, attempting to print it can lead to a SIGSEGV within that process.
  • you have to go into gdb manually and run these commands by hand, and it's hard to do this correctly; any mistakes when doing this will typically cause a SIGSEGV in the inferior process; see e.g. bug 532552
  • the script is written in the gdb language and is thus hard to work with and extend

Proposal

gdb should provide rich information on what's going on at the Python level automatically. I plan to hook this in using gdb-archer, and make it automatic:

  • Biggest win: automatically display python frame information in PyEval_EvalFrameEx in gdb backtraces, including in ABRT:
    • python source file, line number, and function names
    • values of locals, if available
  • name of function for wrapped C functions
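As a rough illustration of how such information can appear automatically, gdb's Python API lets a script register "pretty-printers" that gdb consults whenever it prints a value, including during a backtrace. The following is only a minimal sketch, not the hooks themselves: the names describe_remote, PyObjectPtrSketchPrinter and lookup_printer are made up for this example, and it assumes the inferior's PyObject has the usual ob_type and tp_name fields. The point is that the printer only reads memory from the inferior process and never runs code inside it, so corrupt data leads to a fallback string rather than a crash.

```python
try:
    import gdb  # only importable inside gdb's embedded Python interpreter
except ImportError:
    gdb = None  # lets the formatting helper be used outside gdb


def describe_remote(type_name, address):
    """Format an object that lives in the inferior process's memory."""
    return '<%s at remote 0x%x>' % (type_name or 'unknown', address)


class PyObjectPtrSketchPrinter(object):
    """Hypothetical printer for a PyObject* value; reads memory, runs no code."""

    def __init__(self, gdbval):
        self.gdbval = gdbval

    def to_string(self):
        try:
            ob_type = self.gdbval.dereference()['ob_type']
            type_name = ob_type.dereference()['tp_name'].string()
        except Exception:
            # NULL or corrupt pointer: fall back rather than abort the backtrace.
            type_name = None
        return describe_remote(type_name, int(self.gdbval))


def lookup_printer(gdbval):
    """Return a printer for PyObject* values, or None to let gdb decide."""
    gdb_type = gdbval.type.unqualified()
    if gdb_type.code != gdb.TYPE_CODE_PTR:
        return None
    if str(gdb_type.target().unqualified()) == 'PyObject':
        return PyObjectPtrSketchPrinter(gdbval)
    return None


if gdb is not None:
    # Register globally; an auto-loaded script could instead attach the
    # lookup function to the object file returned by gdb.current_objfile().
    gdb.pretty_printers.append(lookup_printer)
```

Outside gdb the module still imports, so the formatting helper can be tried on its own; inside gdb, the lookup function is called for every value gdb is about to print.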


See Alex's work: http://blogs.gnome.org/alexl/2008/11/18/gdb-is-dead-long-live-gdb/ and more recently: http://blogs.gnome.org/alexl/2009/09/21/archer-gdb-macros-for-glib/

I'd want to have the python backtrace work integrated with the glib backtrace work: pygtk regularly shows me backtraces with a mixture of both.

Alex's work is in glib git: http://git.gnome.org/browse/glib/commit/?id=efe9169234e226f594b4254618f35a139338c35f which does a:

 gdb.backtrace.push_frame_filter (GFrameFilter)

See http://tromey.com/blog/?p=522 for info on this.

This needs a more recent version of gdb than in F-12; I'll need to build a local copy of "archer-tromey-python" branch of gdb to work on this.

Archer upstream: http://sourceware.org/gdb/wiki/ProjectArcher

Benefit to Fedora

Backtraces from gdb (such as those from ABRT) that involve python code will show what's going on at the Python level, as well as at the C level. This will make it much easier for developers to read backtraces when a library wrapped by python (e.g. PyGTK) encounters a bug.

For python developers, it should be possible to attach to a running python process using gdb, then run "thread apply all backtrace" to get an overview of all C and Python code running in all threads within that process - I believe this ability would be unique to Fedora, and be valuable for Python developers seeking additional visibility into their CPython processes.
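The attach-and-backtrace step can also be scripted rather than typed by hand. This is a small sketch assuming gdb is on the PATH and that you have permission to attach to the target process; the function names are made up for this example. It relies only on gdb's standard -p, -batch and -ex command-line options.

```python
import subprocess


def gdb_backtrace_command(pid):
    """Build a gdb command line that attaches to `pid` and dumps all threads."""
    return ['gdb', '-p', str(pid), '-batch', '-ex', 'thread apply all backtrace']


def backtrace_all_threads(pid):
    """Run gdb against a live process and return its combined output as text."""
    return subprocess.check_output(gdb_backtrace_command(pid),
                                   stderr=subprocess.STDOUT,
                                   universal_newlines=True)
```

Calling backtrace_all_threads() with the pid of a running python process should return the same text you would see interactively; in batch mode gdb exits once the command has run.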

Scope

This will require extensions to the python srpm, and analogous changes to the python3 srpm.

It may well require co-ordination with the gdb srpm (such as API changes), and with the glib2 changes written by Alex referred to above.

How To Test

Ideas for test cases/coverage:

  • try attaching to a running (multithreaded) python process and ensure that "thread apply all backtrace" generates meaningful results
  • ensure it plays well with Alex's GLib/GTK work; debug a multithreaded pygtk app
  • ensure it fails gracefully if python-debuginfo isn't installed
  • ensure that it fails gracefully if the inferior process has corrupted data (e.g. overwrites on the heap)
  • ensure that it fails gracefully if the inferior process has a corrupted stack
  • ensure that it works well under ABRT. It's easy to write one-liner python scripts that abuse the ctypes module in such a way as to cause /usr/bin/python to segfault/abort:
[david@brick ~]$ python -c "import ctypes; ctypes.string_at(0xffffffff)"
Segmentation fault (core dumped)
[david@brick ~]$ python -c "import ctypes; ctypes.string_at(0x0)"
python: Objects/stringobject.c:115: PyString_FromString: Assertion `str != ((void *)0)' failed.
Aborted (core dumped)
  • repeat all of the above for python3 and python3-debuginfo

In each case, gdb should give you meaningful information at the Python level, as well as at the C level.
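For the crash-related test cases it can help to trigger the failures from a small harness, so that the crashing interpreter is a child process and the harness itself survives. This is a sketch only, with made-up function names, and it assumes a POSIX system, where a negative exit status means the child was killed by a signal.

```python
import subprocess
import sys


def run_snippet(code):
    """Run `code` in a fresh interpreter and return the child's exit status."""
    return subprocess.call([sys.executable, '-c', code])


def describe_exit(status):
    """Summarise an exit status; on POSIX a negative value is a signal number."""
    if status == 0:
        return 'exited normally'
    if status < 0:
        return 'killed by signal %d' % -status
    return 'exited with status %d' % status
```

For example, describe_exit(run_snippet("import ctypes; ctypes.string_at(0x0)")) reports how the child interpreter died, and any core dump left behind can then be loaded into gdb to check the Python-level output.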

User Experience

Here's an example session from running python within gdb:

[david@brick ~]$ gdb --args python
GNU gdb (GDB) 7.0
Copyright (C) 2009 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "i686-pc-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /usr/bin/python...Reading symbols from /usr/lib/debug/usr/bin/python2.6.debug...done.
done.
(gdb) run
Starting program: /usr/bin/python 
[Thread debugging using libthread_db enabled]
Python 2.6.2 (r262:71600, Jan 25 2010, 13:22:47) 
[GCC 4.4.2 20100121 (Red Hat 4.4.2-28)] on linux2
Type "help", "copyright", "credits" or "license" for more information.

At this point, Python is running inside gdb. Let's create a class with a serious bug in it:

>>> class Foo:
...     def bar(self):
...         from ctypes import string_at
...         string_at(0xDEADBEEF) # this code will cause Python to segfault
... 
>>> f = Foo()
>>>
>>> # Let's assign some data of various kinds to the instance:
>>> f.someattr = 42
>>> f.someotherattr = {'one':1, 'two':2L, 'three':[(), (None,), (None, None)]}
>>>
>>> # Now let's trigger the segfault
>>> f.bar()

At this point we've generated a segmentation fault inside Python.

Let's see the old behavior of a backtrace, using the "bt" command:

Program received signal SIGSEGV, Segmentation fault.
__strlen_sse2 () at ../sysdeps/i386/i686/multiarch/strlen.S:87
87		pcmpeqb	(%esi), %xmm0
Current language:  auto
The current source language is "auto; currently asm".
(gdb) bt
#0  __strlen_sse2 () at ../sysdeps/i386/i686/multiarch/strlen.S:87
#1  0x07113d30 in PyString_FromString (str=0xdeadbeef <Address 0xdeadbeef out of bounds>) at Objects/stringobject.c:116
#2  0x00167e18 in string_at (ptr=0xdeadbeef <Address 0xdeadbeef out of bounds>, size=-1) at /usr/src/debug/Python-2.6.2/Modules/_ctypes/_ctypes.c:5348
#3  0x0018247f in ffi_call_SYSV () at src/x86/sysv.S:61
#4  0x001822b0 in ffi_call (cif=<value optimized out>, fn=<value optimized out>, rvalue=<value optimized out>, avalue=<value optimized out>) at src/x86/ffi.c:213
#5  0x00171315 in _call_function_pointer (pProc=0x167de0 <string_at>, argtuple=0xb7f3d02c, flags=4357, argtypes=0xb7f45a4c, restype=0x80f3dc4, checker=0x0) at /usr/src/debug/Python-2.6.2/Modules/_ctypes/callproc.c:815
#6  _CallProc (pProc=0x167de0 <string_at>, argtuple=0xb7f3d02c, flags=4357, argtypes=0xb7f45a4c, restype=0x80f3dc4, checker=0x0) at /usr/src/debug/Python-2.6.2/Modules/_ctypes/callproc.c:1162
#7  0x0016a6f2 in CFuncPtr_call (self=0xb7f9d5dc, inargs=0xb7f3d02c, kwds=0x0) at /usr/src/debug/Python-2.6.2/Modules/_ctypes/_ctypes.c:3857
#8  0x070c478c in PyObject_Call (func=0xb7f9d5dc, arg=0xb7f3d02c, kw=0x0) at Objects/abstract.c:2492
#9  0x0716069c in do_call (f=0x80f37bc, throwflag=0) at Python/ceval.c:3917
#10 call_function (f=0x80f37bc, throwflag=0) at Python/ceval.c:3729
#11 PyEval_EvalFrameEx (f=0x80f37bc, throwflag=0) at Python/ceval.c:2389
#12 0x07162642 in PyEval_EvalCodeEx (co=0xb7f3bda0, globals=0xb7f3768c, locals=0x0, args=0x80ec788, argcount=1, kws=0x80ec78c, kwcount=0, defs=0xb7f45d78, defcount=1, closure=0x0) at Python/ceval.c:2968
#13 0x07160983 in fast_function (f=0x80ec644, throwflag=0) at Python/ceval.c:3802
#14 call_function (f=0x80ec644, throwflag=0) at Python/ceval.c:3727
#15 PyEval_EvalFrameEx (f=0x80ec644, throwflag=0) at Python/ceval.c:2389
#16 0x07161b79 in fast_function (f=0x80eb1cc, throwflag=0) at Python/ceval.c:3792
#17 call_function (f=0x80eb1cc, throwflag=0) at Python/ceval.c:3727
#18 PyEval_EvalFrameEx (f=0x80eb1cc, throwflag=0) at Python/ceval.c:2389
#19 0x07162642 in PyEval_EvalCodeEx (co=0xb7f2e578, globals=0xb7fc70b4, locals=0xb7fc70b4, args=0x0, argcount=0, kws=0x0, kwcount=0, defs=0x0, defcount=0, closure=0x0) at Python/ceval.c:2968
#20 0x071627a3 in PyEval_EvalCode (co=0xb7f2e578, globals=0xb7fc70b4, locals=0xb7fc70b4) at Python/ceval.c:522
#21 0x0717d94b in run_mod (mod=<value optimized out>, filename=<value optimized out>, globals=0xb7fc70b4, locals=0xb7fc70b4, flags=0xbffff2fc, arena=0x80e8628) at Python/pythonrun.c:1335
#22 0x0717f4a6 in PyRun_InteractiveOneFlags (fp=0x5b5420, filename=0x71c3e7d "<stdin>", flags=0xbffff2fc) at Python/pythonrun.c:840
#23 0x0717f6ab in PyRun_InteractiveLoopFlags (fp=0x5b5420, filename=0x71c3e7d "<stdin>", flags=<value optimized out>) at Python/pythonrun.c:760
#24 0x0717f7eb in PyRun_AnyFileExFlags (fp=0x5b5420, filename=<value optimized out>, closeit=0, flags=0xbffff2fc) at Python/pythonrun.c:729
#25 0x0718c212 in Py_Main (argc=1, argv=0xbffff3f4) at Modules/main.c:599
#26 0x080485c7 in main (argc=1, argv=0xbffff3f4) at Modules/python.c:23

Note that although we can see that there's a problem inside ctypes, it's hard to see what triggered the problem within the Python code.

Fedora 13's python-debuginfo and python3-debuginfo packages now provide the gdb visualization hooks, installed in a location where gdb will automatically load them. When I recorded the session, I had to import them by hand in gdb like this:

(gdb) python
>import sys
>sys.path.append('/home/david/coding/python-gdb')
>import libpython
>reload(libpython)
>
>end

With the gdb visualizations imported, let's rerun the "bt" command. Notice how this time we get rich information on the Python code:

(gdb) bt
#0  __strlen_sse2 () at ../sysdeps/i386/i686/multiarch/strlen.S:87
#1  0x07113d30 in PyString_FromString (str=0xdeadbeef <Address 0xdeadbeef out of bounds>) at Objects/stringobject.c:116
#2  0x00167e18 in string_at (ptr=0xdeadbeef <Address 0xdeadbeef out of bounds>, size=-1) at /usr/src/debug/Python-2.6.2/Modules/_ctypes/_ctypes.c:5348
#3  0x0018247f in ffi_call_SYSV () at src/x86/sysv.S:61
#4  0x001822b0 in ffi_call (cif=<value optimized out>, fn=<value optimized out>, rvalue=<value optimized out>, avalue=<value optimized out>) at src/x86/ffi.c:213
#5  0x00171315 in _call_function_pointer (pProc=0x167de0 <string_at>, argtuple=(3735928559L, -1), flags=4357, argtypes=(<builtin_function_or_method at remote 0xb7f45d2c>, <builtin_function_or_method at remote 0xb7f45d4c>), restype=
    <_ctypes.SimpleType at remote 0x80f3dc4>, checker=<unknown at remote 0x0>) at /usr/src/debug/Python-2.6.2/Modules/_ctypes/callproc.c:815

Note that in frame 5 gdb previously reported only argtuple=0xb7f3d02c. It is now able to tell us that we have a (long, int) 2-tuple: (3735928559L, -1). This is actually (0xDEADBEEF, -1), but gdb has no way to know what base you want the number in.
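If you want to check the decimal-to-hex correspondence yourself, a couple of lines of Python are enough. The helper name to_hex is made up for this example, and it is only meant for non-negative, pointer-like values.

```python
def to_hex(value):
    """Render a non-negative integer the way an address is usually written."""
    return '0x%X' % value


address = 0xDEADBEEF
print(address)          # the decimal form that appears in the tuple
print(to_hex(address))  # the hexadecimal form used in the Python source
```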

Similarly, gdb is now telling us the types of the various objects. For example, in the baseline backtrace in frame 5 gdb merely reported restype=0x80f3dc4, but with this visualizer it is now able to tell us we have restype=<_ctypes.SimpleType at remote 0x80f3dc4>.

#6  _CallProc (pProc=0x167de0 <string_at>, argtuple=(3735928559L, -1), flags=4357, argtypes=(<builtin_function_or_method at remote 0xb7f45d2c>, <builtin_function_or_method at remote 0xb7f45d4c>), restype=
    <_ctypes.SimpleType at remote 0x80f3dc4>, checker=<unknown at remote 0x0>) at /usr/src/debug/Python-2.6.2/Modules/_ctypes/callproc.c:1162
#7  0x0016a6f2 in CFuncPtr_call (self=0xb7f9d5dc, inargs=(3735928559L, -1), kwds=<unknown at remote 0x0>) at /usr/src/debug/Python-2.6.2/Modules/_ctypes/_ctypes.c:3857
#8  0x070c478c in PyObject_Call (func=<CFunctionType at remote 0xb7f9d5dc>, arg=(3735928559L, -1), kw=<unknown at remote 0x0>) at Objects/abstract.c:2492
#9  0x0716069c in do_call (f=File /usr/lib/python2.6/ctypes/__init__.py, line 492, in string_at (ptr=3735928559L, size=-1), throwflag=0) at Python/ceval.c:3917

In frame 9 and below, we have a Python frame "f". In the old backtrace it was merely reported as f=0x80f37bc, but gdb is now able to tell us the Python file, line number, function, and locals:

 File /usr/lib/python2.6/ctypes/__init__.py, line 492, in string_at (ptr=3735928559L, size=-1)

We can use this to trace the Python-level stacktrace for this thread within gdb.
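To make the pieces of that line concrete, here is a sketch of how such a description could be assembled once the file name, function name and locals have been read out of the frame. The function names are made up for this example. This page doesn't say how the hooks derive the line number; one possibility, assumed here, is to decode the code object's co_lnotab table (pairs of unsigned byte increments in CPython 2.x) against the frame's f_lasti bytecode offset.

```python
def addr2line(co_firstlineno, co_lnotab, f_lasti):
    """Map a bytecode offset to a source line using a CPython 2.x co_lnotab."""
    lineno = co_firstlineno
    addr = 0
    table = bytearray(co_lnotab)  # pairs of (bytecode increment, line increment)
    for addr_incr, line_incr in zip(table[0::2], table[1::2]):
        addr += addr_incr
        if addr > f_lasti:
            break
        lineno += line_incr
    return lineno


def format_frame(filename, lineno, funcname, local_vars):
    """Render a frame in the 'File ..., line ..., in ...' style shown above."""
    shown = ', '.join('%s=%s' % pair for pair in local_vars)
    return 'File %s, line %d, in %s (%s)' % (filename, lineno, funcname, shown)
```

Here local_vars is a list of (name, already-formatted value) tuples, so the values can come from whatever per-type formatting is in use.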

#10 call_function (f=File /usr/lib/python2.6/ctypes/__init__.py, line 492, in string_at (ptr=3735928559L, size=-1), throwflag=0) at Python/ceval.c:3729
#11 PyEval_EvalFrameEx (f=File /usr/lib/python2.6/ctypes/__init__.py, line 492, in string_at (ptr=3735928559L, size=-1), throwflag=0) at Python/ceval.c:2389
#12 0x07162642 in PyEval_EvalCodeEx (co=0xb7f3bda0, globals=
    {'Union': <_ctypes.UnionType at remote 0x17c120>, 'c_wchar': <_ctypes.SimpleType at remote 0x80fb32c>, 'c_bool': <_ctypes.SimpleType at remote 0x80fab54>, 'c_double': <_ctypes.SimpleType at remote 0x80f7a0c>, 'CFUNCTYPE': <function at remote 0xb7f3264c>, '__path__': ['/usr/lib/python2.6/ctypes'], 'byref': <builtin_function_or_method at remote 0xb7f40eec>, 'pointer': <builtin_function_or_method at remote 0xb7f40d6c>, 'alignment': <builtin_function_or_method at remote 0xb7f40eac>, '_memmove_addr': 4962832, 'c_longlong': <_ctypes.SimpleType at remote 0x80f8544>, 'c_short': <_ctypes.SimpleType at remote 0x80f407c>, 'get_errno': <builtin_function_or_method at remote 0xb7f39f4c>, '__file__': '/usr/lib/python2.6/ctypes/__init__.pyc', '_calcsize': <builtin_function_or_method at remote 0xb7f96bec>, 'c_ulong': <_ctypes.SimpleType at remote 0x80f5974>, 'c_int': <_ctypes.SimpleType at remote 0x80f5124>, 'c_int32': <_ctypes.SimpleType at remote 0x80f5124>, 'memmove': <CFunctionType at remote 0xb7f9d4a4>, '_sys': <module at remote 0xb7fa308c>, '_cast': <CFunctionType at remote 0xb7f9d574>, 'addressof': <builtin_function_or_method at remote 0xb7f40f0c>, 'ArgumentError': <type at remote 0x80f2fdc>, 'c_buffer': <function at remote 0xb7f32614>, 'c_longdouble': <_ctypes.SimpleType at remote 0x80f821c>, 'cdll': <LibraryLoader at remote 0xb7f459ac>, 'memset': <CFunctionType at remote 0xb7f9d50c>, 'string_at': <function at remote 0xb7f32a04>, 'sizeof': <builtin_function_or_method at remote 0xb7f40ecc>, '_FUNCFLAG_PYTHONAPI': 4, 'create_string_buffer': <function at remote 0xb7f325dc>, 'set_errno': <builtin_function_or_method at remote 0xb7f40d2c>, '_pointer_type_cache': {<_ctypes.SimpleType at remote 0x80f9ec4>: <_ctypes.PointerType at remote 0x80fb864>, <_ctypes.SimpleType at remote 0x80fb32c>: <_ctypes.PointerType at remote 0x80fb504>, <NoneType at remote 0x72061e0>: <_ctypes.SimpleType at remote 0x80fa6a4>}, '_Pointer': <_ctypes.PointerType at remote 0x17bd00>, 
'create_unicode_buffer': <function at remote 0xb7f326bc>, 'c_long': <_ctypes.SimpleType at remote 0x80f5124>, 'c_char_p': <_ctypes.SimpleType at remote 0x80fa37c>, '__builtins__': {'bytearray': <type at remote 0x71fb540>, 'IndexError': <type at remote 0x71ff0e0>, 'all': <builtin_function_or_method at remote 0xb7fafccc>, 'help': <_Helper at remote 0xb7f814ec>, 'vars': <builtin_function_or_method at remote 0xb7fb280c>, 'SyntaxError': <type at remote 0x71fed60>, 'unicode': <type at remote 0x720e2c0>, 'sorted': <builtin_function_or_method at remote 0xb7fb274c>, 'isinstance': <builtin_function_or_method at remote 0xb7fb22cc>, 'copyright': <_Printer at remote 0xb7f81d6c>, 'NameError': <type at remote 0x71feac0>, 'BytesWarning': <type at remote 0x72006c0>, 'dict': <type at remote 0x7205960>, 'input': <builtin_function_or_method at remote 0xb7fb224c>, 'oct': <builtin_function_or_method at remote 0xb7fb246c>, 'bin': <builtin_function_or_method at remote 0xb7fafd8c>, 'SystemExit': <type at remote 0x71fe2e0>, 'StandardError': <type at remote 0x71fdf60>, 'format': <builtin_function_or_method at remote 0xb7fb20ac>, 'repr': <builtin_function_or_method at remote 0xb7fb268c>, 'UnicodeDecodeError': <type at remote 0x71ff540>, 'False': <bool at remote 0x71f9624>, 'RuntimeWarning': <type at remote 0x7200340>, 'bytes': <type at remote 0x7209c80>, 'iter': <builtin_function_or_method at remote 0xb7fb230c>, 'reload': <builtin_function_or_method at remote 0xb7fb264c>, 'Warning': <type at remote 0x71ffee0>, 'round': <builtin_function_or_method at remote 0xb7fb26cc>, 'dir': <builtin_function_or_method at remote 0xb7faff4c>, 'cmp': <builtin_function_or_method at remote 0xb7fafe4c>, 'set': <type at remote 0x7207000>, 'list': <type at remote 0x7204420>, 'reduce': <builtin_function_or_method at remote 0xb7fb260c>, 'intern': <builtin_function_or_method at remote 0xb7fb228c>, 'issubclass': <builtin_function_or_method at remote 0xb7fb22ec>, 'apply': <builtin_function_or_method at remote 
0xb7fafd4c>, 'EOFError': <type at remote 0x71fe820>, 'locals': <builtin_function_or_method at remote 0xb7fb238c>, 'BufferError': <type at remote 0x71ffe00>, 'slice': <type at remote 0x7207900>, 'FloatingPointError': <type at remote 0x71ff8c0>, 'sum': <builtin_function_or_method at remote 0xb7fb278c>, 'buffer': <type at remote 0x71f9840>, 'getattr': <builtin_function_or_method at remote 0xb7fb20cc>, 'abs': <builtin_function_or_method at remote 0xb7fafc8c>, 'exit': <Quitter at remote 0xb7fd1d2c>, 'print': <builtin_function_or_method at remote 0xb7fb256c>, 'IndentationError': <type at remote 0x71fee40>, 'True': <bool at remote 0x71f9630>, 'FutureWarning': <type at remote 0x7200420>, 'ImportWarning': <type at remote 0x7200500>, 'None': <NoneType at remote 0x72061e0>, 'hash': <builtin_function_or_method at remote 0xb7fb218c>, 'len': <builtin_function_or_method at remote 0xb7fb234c>, 'credits': <_Printer at remote 0xb7f8156c>, 'frozenset': <type at remote 0x72070e0>, '__name__': '__builtin__', 'ord': <builtin_function_or_method at remote 0xb7fb24ec>, 'super': <type at remote 0x720ade0>, 'TypeError': <type at remote 0x71fe040>, 'license': <_Printer at remote 0xb7f8170c>, 'KeyboardInterrupt': <type at remote 0x71fe3c0>, 'UserWarning': <type at remote 0x71fffc0>, 'filter': <builtin_function_or_method at remote 0xb7fb206c>, 'range': <builtin_function_or_method at remote 0xb7fb25ac>, 'staticmethod': <type at remote 0x72035e0>, 'SystemError': <type at remote 0x71ffb60>, 'BaseException': <type at remote 0x7200a60>, 'pow': <builtin_function_or_method at remote 0xb7fb252c>, 'RuntimeError': <type at remote 0x71fe900>, 'float': <type at remote 0x72028a0>, 'GeneratorExit': <type at remote 0x71fe200>, 'StopIteration': <type at remote 0x71fe120>, 'globals': <builtin_function_or_method at remote 0xb7fb210c>, 'divmod': <builtin_function_or_method at remote 0xb7faff8c>, 'enumerate': <type at remote 0x71fdaa0>, 'Ellipsis': <ellipsis at remote 0x7207840>, 'LookupError': <type at remote 
0x71ff000>, 'open': <builtin_function_or_method at remote 0xb7fb24ac>, 'quit': <Quitter at remote 0xb7fd120c>, 'basestring': <type at remote 0x7209ba0>, 'UnicodeError': <type at remote 0x71ff380>, 'zip': <builtin_function_or_method at remote 0xb7fb284c>, 'hex': <builtin_function_or_method at remote 0xb7fb21cc>, 'long': <type at remote 0x7204f60>, 'next': <builtin_function_or_method at remote 0xb7fb244c>, 'int': <type at remote 0x7203a40>, 'chr': <builtin_function_or_method at remote 0xb7fafe0c>, '__import__': <builtin_function_or_method at remote 0xb7fafc6c>, 'type': <type at remote 0x720ac20>, '__doc__': "Built-in functions, exceptions, and other objects.\n\nNoteworthy: None is the `nil' object; Ellipsis represents `...' in slices.", 'Exception': <type at remote 0x71fde80>, 'tuple': <type at remote 0x720a6a0>, 'UnicodeTranslateError': <type at remote 0x71ff620>, 'reversed': <type at remote 0x71fdb80>, 'UnicodeEncodeError': <type at remote 0x71ff460>, 'IOError': <type at remote 0x71fe660>, 'hasattr': <builtin_function_or_method at remote 0xb7fb214c>, 'delattr': <builtin_function_or_method at remote 0xb7faff0c>, 'setattr': <builtin_function_or_method at remote 0xb7fb270c>, 'raw_input': <builtin_function_or_method at remote 0xb7fb25ec>, 'PendingDeprecationWarning': <type at remote 0x7200180>, 'compile': <builtin_function_or_method at remote 0xb7fafecc>, 'ArithmeticError': <type at remote 0x71ff7e0>, 'str': <type at remote 0x7209c80>, 'property': <type at remote 0x71fcec0>, 'MemoryError': <type at remote 0x71ffd20>, 'ImportError': <type at remote 0x71fe4a0>, 'xrange': <type at remote 0x7206640>, 'KeyError': <type at remote 0x71ff1c0>, 'coerce': <builtin_function_or_method at remote 0xb7fafe8c>, 'SyntaxWarning': <type at remote 0x7200260>, 'file': <type at remote 0x7201d80>, 'EnvironmentError': <type at remote 0x71fe580>, 'unichr': <builtin_function_or_method at remote 0xb7fb27cc>, 'id': <builtin_function_or_method at remote 0xb7fb220c>, 'OSError': <type at remote 
0x71fe740>, 'DeprecationWarning': <type at remote 0x72000a0>, 'min': <builtin_function_or_method at remote 0xb7fb242c>, 'UnicodeWarning': <type at remote 0x72005e0>, 'execfile': <builtin_function_or_method at remote 0xb7fb202c>, '__package__': <NoneType at remote 0x72061e0>, 'complex': <type at remote 0x71fc840>, 'bool': <type at remote 0x71f9560>, 'ValueError': <type at remote 0x71ff2a0>, 'NotImplemented': <NotImplementedType at remote 0x72061e8>, 'map': <builtin_function_or_method at remote 0xb7fb23cc>, 'any': <builtin_function_or_method at remote 0xb7fafd0c>, 'max': <builtin_function_or_method at remote 0xb7fb240c>, 'object': <type at remote 0x720ad00>, 'TabError': <type at remote 0x71fef20>, 'callable': <builtin_function_or_method at remote 0xb7fafdcc>, 'ZeroDivisionError': <type at remote 0x71ffa80>, 'eval': <builtin_function_or_method at remote 0xb7faffcc>, '__debug__': <bool at remote 0x71f9630>, 'ReferenceError': <type at remote 0x71ffc40>, 'AssertionError': <type at remote 0x71ff700>, 'classmethod': <type at remote 0x7203500>, 'UnboundLocalError': <type at remote 0x71feba0>, 'NotImplementedError': <type at remote 0x71fe9e0>, 'AttributeError': <type at remote 0x71fec80>, 'OverflowError': <type at remote 0x71ff9a0>}, '_FUNCFLAG_USE_ERRNO': 8, '_memset_addr': 4962944, '_dlopen': <builtin_function_or_method at remote 0xb7f40e0c>, '__name__': 'ctypes', 'RTLD_LOCAL': 0, 'c_int16': <_ctypes.SimpleType at remote 0x80f407c>, '_SimpleCData': <_ctypes.SimpleType at remote 0x17bde0>, 'wstring_at': <function at remote 0xb7f32a3c>, 'c_void_p': <_ctypes.SimpleType at remote 0x80fa6a4>, 'set_conversion_mode': <builtin_function_or_method at remote 0xb7f40dec>, 'PyDLL': <type at remote 0x80fc3dc>, 'DEFAULT_MODE': 0, 'LittleEndianStructure': <_ctypes.StructType at remote 0x17c040>, 'c_uint64': <_ctypes.SimpleType at remote 0x80f8d54>, 'c_ulonglong': <_ctypes.SimpleType at remote 0x80f8d54>, '_FUNCFLAG_USE_LASTERROR': 16, '_cast_addr': 1490912, 'ARRAY': <function at remote 
0xb7f3279c>, 'c_ushort': <_ctypes.SimpleType at remote 0x80f48ac>, '__doc__': 'create and manipulate C data types in Python', '_check_size': <function at remote 0xb7f32684>, 'CDLL': <type at remote 0x80fbeac>, '_wstring_at': <CFunctionType at remote 0xb7f9d644>, 'c_ubyte': <_ctypes.SimpleType at remote 0x80f9564>, 'RTLD_GLOBAL': 256, 'c_char': <_ctypes.SimpleType at remote 0x80f9ec4>, 'c_uint32': <_ctypes.SimpleType at remote 0x80f5974>, 'c_float': <_ctypes.SimpleType at remote 0x80f71fc>, 'SetPointerType': <function at remote 0xb7f32764>, 'resize': <builtin_function_or_method at remote 0xb7f40dcc>, '_c_functype_cache': {(<_ctypes.SimpleType at remote 0x80f5124>, (), 1): <_ctypes.CFuncPtrType at remote 0x8100b14>, (<_ctypes.SimpleType at remote 0x80fa6a4>, (<_ctypes.SimpleType at remote 0x80fa6a4>, <_ctypes.SimpleType at remote 0x80f5124>, <_ctypes.SimpleType at remote 0x80f5974>), 1): <_ctypes.CFuncPtrType at remote 0x80fd84c>}, '---Type <return> to continue, or q <return> to quit---
_os': <module at remote 0xb7fa314c>, '_wstring_at_addr': 1494192, 'cast': <function at remote 0xb7f329cc>, 'c_int8': <_ctypes.SimpleType at remote 0x80f9a14>, 'c_byte': <_ctypes.SimpleType at remote 0x80f9a14>, 'c_int64': <_ctypes.SimpleType at remote 0x80f8544>, 'c_voidp': <_ctypes.SimpleType at remote 0x80fa6a4>, '_string_at_addr': 1474016, '_FUNCFLAG_CDECL': 1, 'pythonapi': <PyDLL at remote 0xb7f45a0c>, 'PYFUNCTYPE': <function at remote 0xb7f327d4>, '_CFuncPtr': <_ctypes.CFuncPtrType at remote 0x17bb40>, '_endian': <module at remote 0xb7fa3944>, '__package__': 'ctypes', 'c_uint16': <_ctypes.SimpleType at remote 0x80f48ac>, 'BigEndianStructure': <_swapped_meta at remote 0x81006ec>, 'pydll': <LibraryLoader at remote 0xb7f459ec>, '__version__': '1.1.0', 'Structure': <_ctypes.StructType at remote 0x17c040>, 'c_uint': <_ctypes.SimpleType at remote 0x80f5974>, 'py_object': <_ctypes.SimpleType at remote 0x80f3dc4>, 'c_wchar_p': <_ctypes.SimpleType at remote 0x80fae7c>, '_string_at': <CFunctionType at remote 0xb7f9d5dc>, 'c_size_t': <_ctypes.SimpleType at remote 0x80f5974>, 'c_uint8': <_ctypes.SimpleType at remote 0x80f9564>, 'LibraryLoader': <type at remote 0x80fc704>, 'Array': <_ctypes.ArrayType at remote 0x17bc20>, 'POINTER': <builtin_function_or_method at remote 0xb7f40d4c>}, locals=<unknown at remote 0x0>, args=0x80ec788, argcount=1, kws=0x80ec78c, kwcount=0, defs=0xb7f45d78, defcount=1, closure=
    <unknown at remote 0x0>) at Python/ceval.c:2968

The above is probably overkill: gdb is now able to tell us the value of "globals", giving us lots of insight into the namespace.

#13 0x07160983 in fast_function (f=
    File <stdin>, line 2, in bar (self=<Foo({'someattr': 42, 'someotherattr': {'three': [(), (<NoneType at remote 0x72061e0>,), (<NoneType at remote 0x72061e0>, <NoneType at remote 0x72061e0>)], 'two': 2L}}) at remote 0xb7f3946c>, string_at=<function at remote 0xb7f32a04>), throwflag=0) at Python/ceval.c:3802
#14 call_function (f=
    File <stdin>, line 2, in bar (self=<Foo({'someattr': 42, 'someotherattr': {'three': [(), (<NoneType at remote 0x72061e0>,), (<NoneType at remote 0x72061e0>, <NoneType at remote 0x72061e0>)], 'two': 2L}}) at remote 0xb7f3946c>, string_at=<function at remote 0xb7f32a04>), throwflag=0) at Python/ceval.c:3727
#15 PyEval_EvalFrameEx (f=
    File <stdin>, line 2, in bar (self=<Foo({'someattr': 42, 'someotherattr': {'three': [(), (<NoneType at remote 0x72061e0>,), (<NoneType at remote 0x72061e0>, <NoneType at remote 0x72061e0>)], 'two': 2L}}) at remote 0xb7f3946c>, string_at=<function at remote 0xb7f32a04>), throwflag=0) at Python/ceval.c:2389

In the above frames, notice how gdb is now able to tell us that this instance of an old-style class is of type "Foo", and what the current values of its attributes are (I deliberately picked a mixture above in order to show support for dictionaries, lists, tuples, ints, longs etc).

#16 0x07161b79 in fast_function (f=File <stdin>, line 1, in <module> (), throwflag=0) at Python/ceval.c:3792
#17 call_function (f=File <stdin>, line 1, in <module> (), throwflag=0) at Python/ceval.c:3727
#18 PyEval_EvalFrameEx (f=File <stdin>, line 1, in <module> (), throwflag=0) at Python/ceval.c:2389
#19 0x07162642 in PyEval_EvalCodeEx (co=0xb7f2e578, globals=
    {'f': <Foo({'someattr': 42, 'someotherattr': {'three': [(), (<NoneType at remote 0x72061e0>,), (<NoneType at remote 0x72061e0>, <NoneType at remote 0x72061e0>)], 'two': 2L}}) at remote 0xb7f3946c>, '__builtins__': <module at remote 0xb7fa3074>, '__package__': <NoneType at remote 0x72061e0>, '__name__': '__main__', 'Foo': <classobj at remote 0xb7f3817c>, '__doc__': <NoneType at remote 0x72061e0>}, locals=
    {'f': <Foo({'someattr': 42, 'someotherattr': {'three': [(), (<NoneType at remote 0x72061e0>,), (<NoneType at remote 0x72061e0>, <NoneType at remote 0x72061e0>)], 'two': 2L}}) at remote 0xb7f3946c>, '__builtins__': <module at remote 0xb7fa3074>, '__package__': <NoneType at remote 0x72061e0>, '__name__': '__main__', 'Foo': <classobj at remote 0xb7f3817c>, '__doc__': <NoneType at remote 0x72061e0>}, args=0x0, argcount=0, kws=0x0, kwcount=0, defs=0x0, defcount=0, 
    closure=<unknown at remote 0x0>) at Python/ceval.c:2968
#20 0x071627a3 in PyEval_EvalCode (co=0xb7f2e578, globals=
    {'f': <Foo({'someattr': 42, 'someotherattr': {'three': [(), (<NoneType at remote 0x72061e0>,), (<NoneType at remote 0x72061e0>, <NoneType at remote 0x72061e0>)], 'two': 2L}}) at remote 0xb7f3946c>, '__builtins__': <module at remote 0xb7fa3074>, '__package__': <NoneType at remote 0x72061e0>, '__name__': '__main__', 'Foo': <classobj at remote 0xb7f3817c>, '__doc__': <NoneType at remote 0x72061e0>}, locals=
    {'f': <Foo({'someattr': 42, 'someotherattr': {'three': [(), (<NoneType at remote 0x72061e0>,), (<NoneType at remote 0x72061e0>, <NoneType at remote 0x72061e0>)], 'two': 2L}}) at remote 0xb7f3946c>, '__builtins__': <module at remote 0xb7fa3074>, '__package__': <NoneType at remote 0x72061e0>, '__name__': '__main__', 'Foo': <classobj at remote 0xb7f3817c>, '__doc__': <NoneType at remote 0x72061e0>}) at Python/ceval.c:522
#21 0x0717d94b in run_mod (mod=<value optimized out>, filename=<value optimized out>, globals=
    {'f': <Foo({'someattr': 42, 'someotherattr': {'three': [(), (<NoneType at remote 0x72061e0>,), (<NoneType at remote 0x72061e0>, <NoneType at remote 0x72061e0>)], 'two': 2L}}) at remote 0xb7f3946c>, '__builtins__': <module at remote 0xb7fa3074>, '__package__': <NoneType at remote 0x72061e0>, '__name__': '__main__', 'Foo': <classobj at remote 0xb7f3817c>, '__doc__': <NoneType at remote 0x72061e0>}, locals=
    {'f': <Foo({'someattr': 42, 'someotherattr': {'three': [(), (<NoneType at remote 0x72061e0>,), (<NoneType at remote 0x72061e0>, <NoneType at remote 0x72061e0>)], 'two': 2L}}) at remote 0xb7f3946c>, '__builtins__': <module at remote 0xb7fa3074>, '__package__': <NoneType at remote 0x72061e0>, '__name__': '__main__', 'Foo': <classobj at remote 0xb7f3817c>, '__doc__': <NoneType at remote 0x72061e0>}, flags=0xbffff2fc, arena=0x80e8628) at Python/pythonrun.c:1335
#22 0x0717f4a6 in PyRun_InteractiveOneFlags (fp=0x5b5420, filename=0x71c3e7d "<stdin>", flags=0xbffff2fc) at Python/pythonrun.c:840
#23 0x0717f6ab in PyRun_InteractiveLoopFlags (fp=0x5b5420, filename=0x71c3e7d "<stdin>", flags=<value optimized out>) at Python/pythonrun.c:760
#24 0x0717f7eb in PyRun_AnyFileExFlags (fp=0x5b5420, filename=<value optimized out>, closeit=0, flags=0xbffff2fc) at Python/pythonrun.c:729
#25 0x0718c212 in Py_Main (argc=1, argv=0xbffff3f4) at Modules/main.c:599
#26 0x080485c7 in main (argc=1, argv=0xbffff3f4) at Modules/python.c:23

We are installing pretty-printing hooks into gdb for the types (PyObject*) and (PyFrameObject*).

If you need to override this behavior to see the underlying data, simply dereference the pointer as normal (we're pretty-printing the pointer types, not the types themselves).

For example, the pretty-printer is invoked for this value:

(gdb) p (PyObject*)0x8405df4
$3 = <function at remote 0x8405df4>

But you can see the underlying value thus:

(gdb) p *(PyObject*)0x8405df4
$4 = {ob_refcnt = 23, ob_type = 0x7203420}

Similarly, this PyObject* value is pretty-printed:

(gdb) p ((PyFunctionObject*)0x8405df4)->func_code
$8 = <code at remote 0x82787b8>

but dereferencing it gives the raw representation:

(gdb) p *((PyFunctionObject*)0x8405df4)->func_code
$9 = {ob_refcnt = 1, ob_type = 0x71fc4e0}

and we can mix and match this to dive into the data:

(gdb) p *(PyCodeObject*)(((PyFunctionObject*)0x8405df4)->func_code)
$10 = {ob_refcnt = 1, ob_type = 0x71fc4e0, co_argcount = 0, co_nlocals = 0, co_stacksize = 1, co_flags = 67, co_code = 'd', co_consts = (<NoneType at remote 0x72061e0>,), co_names = (), co_varnames = (), co_freevars = (), co_cellvars = (), co_filename = '/usr/lib/python2.6/site-packages/blueman/Functions.py', co_name = 'enable_rgba_colormap', co_firstlineno = 127, co_lnotab = '', co_zombieframe = 0x0}
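
The pointer-versus-value distinction shown above can be sketched as a gdb lookup function that returns a printer only when the value's type is a pointer to one of the handled types. This is a minimal illustration, not the hooks actually shipped: the names PointerSummaryPrinter, lookup_pointer_printer and HANDLED_TARGETS are hypothetical, and the printer only reports the C type name and address, whereas the real hooks inspect the object to report its Python type and contents. The gdb module can only be imported inside a gdb session, so the import is guarded.

```python
# Sketch: a gdb pretty-printer lookup that fires for pointer types only.
# All names defined here are hypothetical and chosen for illustration.
try:
    import gdb  # available only when running inside gdb
except ImportError:
    gdb = None

# Outside gdb we use a stand-in value so the logic can still be exercised.
TYPE_CODE_PTR = gdb.TYPE_CODE_PTR if gdb is not None else 1

# Pointed-to types we want to pretty-print (assumed list, as in the text).
HANDLED_TARGETS = ("PyObject", "PyFrameObject")


class PointerSummaryPrinter(object):
    """Prints a short summary of a pointer instead of its raw value."""

    def __init__(self, gdbval, target_name):
        self.gdbval = gdbval
        self.target_name = target_name

    def to_string(self):
        # Converting a pointer value to int yields the address it holds.
        return "<%s* at remote 0x%x>" % (self.target_name, int(self.gdbval))


def lookup_pointer_printer(gdbval):
    """Return a printer for (PyObject*)-style values, else None."""
    type_ = gdbval.type.unqualified()
    if type_.code != TYPE_CODE_PTR:
        # A dereferenced value such as *(PyObject*)ptr lands here, so gdb
        # falls back to its default struct output.
        return None
    target_name = str(type_.target().unqualified())
    if target_name in HANDLED_TARGETS:
        return PointerSummaryPrinter(gdbval, target_name)
    return None


if gdb is not None:
    # gdb consults each callable in this list for every value it prints.
    gdb.pretty_printers.append(lookup_pointer_printer)
```

Because the lookup rejects anything that is not a pointer, "p *(PyObject*)0x8405df4" keeps showing the raw ob_refcnt and ob_type fields, while "p (PyObject*)0x8405df4" is routed through the printer.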

Dependencies

This feature will require coordination with, and possibly changes in, the gdb and glib2 packages.

Contingency Plan

The contingency plan is to remove the additional .py files (the gdb hooks), which deactivates the feature.

Documentation

See the "Detailed Description" section above, which covers the feature in depth and includes example gdb sessions.

Release Notes

  • Python: the gdb debugger has been extended so that it can report detailed information on the internals of the Python 2 and Python 3 runtimes. Backtraces involving Python will now by default show mixed C and Python-level information on what such processes are doing, without requiring expertise in the use of gdb.

Comments and Discussion