Memory Debugging Tools
The gdb debugger has been extended with new commands that make it easier to track down and fix excessive memory usage within programs and libraries.
This functionality was created by Fedora contributor David Malcolm, and we believe it is unique to Fedora.
- Name: Dave Malcolm
- Email: <email@example.com>
- Targeted release: Fedora 14
- Last updated: 2010-09-16
- Percentage of completion: 100%
TODO: This is "feature-complete", but some issues remain:
- I need to blog about this and write better docs
- Fix the bugs
- Preparing upstream project for initial launch: https://fedorahosted.org/gdb-heap/
- I've disabled C++ support for now, as the current implementation slows down other operations.
Initial version of code uploaded
The new "gdb-heap" package adds a new "heap" command to /usr/bin/gdb.
The command allows you to get a breakdown of how that process is using dynamic memory.
It allows for unplanned memory usage debugging: if a process unexpectedly starts using large amounts of memory you can attach to it with gdb, and use the heap command to figure out where the memory is going. You should also be able to use it on core dumps.
We believe this approach is entirely new, and is unique to Fedora 14.
Benefit to Fedora
This feature could be of great use to developers and system administrators: it provides a new way of analyzing how a process uses memory, without requiring advance planning.
It is unique to Fedora (it makes heavy use of the gdb/python integration we have in Fedora), and was developed by a Fedora contributor (who is a Red Hat engineer).
Code is isolated, as an extension to gdb, written in Python.
- I'm tracking development of the code in the upstream tracker here:
- Package the code in RPM form, add it to Fedora
- Ensure that it's available without the user needing excessive configuration; ideally, if the rpm is installed, then you get the command automagically
- Add it to comps so that it's suggested for installation by default if gdb is installed.
How To Test
No special hardware is needed.
You will need to install the gdb-heap package (not yet packaged)
- Pick a process on your system (either as root, or one of your own processes)
- Use "gdb attach PID" to connect to it
- Within gdb, run "python import heap" to register the "heap" command
- Use the "heap" command and its various subcommands (as described on the upstream website)
- Ensure that all results look correct, and that there are no Python tracebacks within gdb.
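The attach-and-register steps above can also be scripted. This is a rough sketch and not part of gdb-heap: the helper names build_gdb_argv and run_heap_report are hypothetical, and it assumes gdb's standard -p (attach to a PID), -batch and -ex (run a command) options together with the "python import heap" step listed above.

```python
import subprocess


def build_gdb_argv(pid, heap_subcommand=""):
    """Build a gdb command line that attaches to `pid` and runs "heap".

    Hypothetical helper for illustration; it only assembles arguments.
    """
    heap_cmd = ("heap " + heap_subcommand).strip()
    return [
        "gdb",
        "-batch",                     # exit once the commands have run
        "-p", str(int(pid)),          # attach to the running process
        "-ex", "python import heap",  # register the "heap" command
        "-ex", heap_cmd,              # run the report
    ]


def run_heap_report(pid, heap_subcommand=""):
    """Run gdb and return its textual output (needs gdb and gdb-heap installed)."""
    result = subprocess.run(
        build_gdb_argv(pid, heap_subcommand),
        capture_output=True, text=True, check=False,
    )
    return result.stdout
```

Calling run_heap_report(1976, "sizes") would then be equivalent to attaching by hand and typing "heap sizes"; scanning the returned text for "Traceback" is one way to automate the last check in the list.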
Ideally the amount of "uncategorized" data should not be a substantial proportion of the overall size of the dynamically-allocated memory; if it is, then that may be a bug.
Ideally the command should not take too long to run. The more blocks of memory that are "live" within a process, the longer it will take to analyze the usage. Crude timings suggest it can analyze about 5000 allocations per second, so if you have a process with 300,000 allocations, it could take a minute to analyze them.
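The estimate above is simple division. As a back-of-the-envelope sketch, assuming the analysis time grows linearly with the number of live blocks and taking the crude 5000-per-second figure at face value (the function name is made up for illustration):

```python
ALLOCATIONS_PER_SECOND = 5000  # crude figure quoted above, not a guarantee


def estimated_seconds(live_allocations, rate=ALLOCATIONS_PER_SECOND):
    """Rough analysis time, assuming cost is linear in the number of live blocks."""
    return live_allocations / rate


print(estimated_seconds(300_000))  # 60.0, i.e. about a minute
```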
Having attached to a process with gdb
[david@fedora-14] $ gdb attach $(pidof -x name-of-program)
you should be able to use the "heap" command to get a breakdown of how that process is using memory.
You can also do this with core dumps:
[david@fedora-14] $ gdb -c core.1976
In this example, I've attached gdb to a python process:
(gdb) heap
Domain         Kind                        Detail               Count  Allocated size
-------------  --------------------------  ------------------  ------  --------------
python         str                                              6,689         477,840
cpython        PyDictEntry table                                  167         456,944
cpython        PyDictEntry table           interned                 1         200,704
python         str                         bytecode               648          92,024
uncategorized                              32 bytes             2,866          91,712
python         code                                               648          82,944
uncategorized                              4128 bytes              19          78,432
python         function                                           609          73,080
python         wrapper_descriptor                                 905          72,400
python         dict                                               247          71,200
uncategorized                              72 bytes               852          61,344
(snipped)
As you can see, gdb-heap will attempt to categorize the chunks of dynamically-allocated memory that it finds. It shows you how many blocks of memory of each category it found, with the categories sorted by the number of bytes of RAM that they're using.
The categorization is divided into three parts:
- domain: high-level grouping e.g. "python", "C++", etc
- kind: type information, appropriate to the domain e.g. a class/type
- detail: additional detail (e.g. the size of a buffer, or a note that this python string is actually bytecode)
|Domain||Meaning of 'kind'|
|python||the python class|
|cpython||C structure/type (implementation detail within Python)|
|pyarena||Python's optimized memory allocator|
|uncategorized||(none; gdb-heap wasn't able to identify what this is used for)|
|C++||the C++ class (disabled for now in Fedora 14's gdb-heap; the heuristic needs to be optimized)|
You can see in the above example that much of the memory is taken up by python strings (the "str" type), but a considerable amount is also occupied by implementation details of python dictionaries (the "PyDictEntry tables").
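To make the report's structure concrete, here is a minimal sketch of the grouping and sorting it describes: each block is tagged with a (domain, kind, detail) category, then the categories are counted and ordered by total bytes. This is not gdb-heap's actual code. The Category tuple, the summarize function and the block sizes are all made up for illustration.

```python
from collections import defaultdict, namedtuple

# Illustrative three-part category, mirroring the domain/kind/detail split.
Category = namedtuple("Category", ["domain", "kind", "detail"])


def summarize(blocks):
    """Group (Category, size_in_bytes) pairs and sort by total bytes.

    Returns a list of (category, count, total_bytes), largest first.
    """
    totals = defaultdict(lambda: [0, 0])  # category -> [count, bytes]
    for category, size in blocks:
        totals[category][0] += 1
        totals[category][1] += size
    rows = [(cat, count, size) for cat, (count, size) in totals.items()]
    rows.sort(key=lambda row: row[2], reverse=True)
    return rows


# Made-up blocks, purely to exercise the function.
blocks = [
    (Category("python", "str", ""), 48),
    (Category("python", "str", ""), 72),
    (Category("uncategorized", "", "32 bytes"), 32),
    (Category("python", "dict", ""), 280),
]
for cat, count, size in summarize(blocks):
    print(cat.domain, cat.kind, cat.detail, count, size)
```

With this toy input the dict category comes first (280 bytes in one block), then the two strings (120 bytes), then the uncategorized block.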
There are numerous subcommands. heap is integrated into gdb's tab-completion, so that you can see the available commands with the TAB key:
(gdb) heap [TAB pressed]
all    diff   label  log    sizes  used
Here's a tour of what's available. Refer to the upstream documentation for more information.
Showing all dynamic memory
"heap all" shows a detailed, low-level report on all dynamically-allocated chunks of memory. This is a simple loop through memory, typically showing you the large allocations first (implemented via "mmap"), then the smaller ones (implemented within the "sbrk" region).
It reports the start/end of each region, along with book-keeping information about the block.
This is likely to only be of use for debugging low-level problems.
(gdb) heap all
All chunks of memory on heap (both used and free)
-------------------------------------------------
0: 0x00007ffff08cd000 -> 0x00007ffff090dfff inuse: 266240 bytes (<MChunkPtr chunk=0x7ffff08cd000 mem=0x7ffff08cd010 prev_size=0 IS_MMAPPED chunksize=266240 memsize=266224>)
1: 0x00007ffff7ea7000 -> 0x00007ffff7ee7fff inuse: 266240 bytes (<MChunkPtr chunk=0x7ffff7ea7000 mem=0x7ffff7ea7010 prev_size=0 IS_MMAPPED chunksize=266240 memsize=266224>)
(copious output snipped)
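The book-keeping fields in that output are related by simple arithmetic. The sketch below is inferred from the sample values rather than taken from gdb-heap's source. It assumes a 64-bit system where each chunk starts with two size_t-sized header fields, and it matches the mmapped chunks shown here; other kinds of chunk may account for their usable size differently. The function name is hypothetical.

```python
SIZE_SZ = 8  # assumed sizeof(size_t) on the 64-bit system in the sample


def describe_chunk(chunk_addr, chunksize):
    """Derive the fields shown by "heap all" from a chunk's address and size."""
    header = 2 * SIZE_SZ  # prev_size and size fields precede the user memory
    return {
        "start": chunk_addr,
        "end": chunk_addr + chunksize - 1,  # last byte of the chunk
        "mem": chunk_addr + header,         # pointer handed to the program
        "memsize": chunksize - header,      # bytes usable by the program
    }


info = describe_chunk(0x7ffff08cd000, 266240)
print(hex(info["end"]), hex(info["mem"]), info["memsize"])
# 0x7ffff090dfff 0x7ffff08cd010 266224
```

These values agree with chunk 0 in the listing: the region ends at 0x00007ffff090dfff, mem is 16 bytes past the chunk start, and memsize is 16 bytes less than chunksize.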
Finding blocks of RAM (query language)
There's a baseline of functionality that I'm developing on top of Fedora 13's gdb.
The gdb-heap code peeks around inside the internals of the glibc heap implementation, violating encapsulation (rather by definition for a debugger), so if that changes, corresponding changes will need to be made to gdb-heap.
Some features require additional work in gdb, which I've filed RFE bugs for. Naturally this will require coordination with gdb to ensure that they land in Fedora 14:
- RHBZ #610241: RFE: please expose "info symbol ADDRESS" in the python API
- RHBZ #610249: RFE: notification about changes in the inferior process
None necessary; simply remove the package.
- See above, and at the project's website.
- The gdb debugger has been extended with new commands that make it easier to track down and fix excessive memory usage within programs and libraries. This functionality was created by Fedora contributor David Malcolm, and we believe it is unique to Fedora 14.