Summer Coding 2010 ideas - Universal Build-ID

From FedoraProject

Revision as of 10:39, 17 May 2010


The main page for this idea is Summer Coding 2010 ideas - Universal Build-ID.


Status: "Idea"

Summary of idea: Extend the Build-ID support to make it more universally usable.

Contacts: Mark Wielaard, Roland McGrath

Mentor(s): Mark Wielaard, Roland McGrath

Notes: This is not a completely worked out idea yet. A proposal should pick one or more scenarios and create a concrete implementation plan.


More information

The main page for Summer Coding 2010 ideas is Category:Summer Coding 2010 ideas.


Summary

Build-IDs are currently put into binaries, shared libraries, core files and related debuginfo files to uniquely identify the build a user or developer is working with. A couple of conventions are in place that use this information to identify "currently running" or "distro installed" builds. This helps tools identify what was being run and match it to the corresponding package, sources and debuginfo, so they can show the user what is going on (at the moment mostly when things break). We would like to extend this to a more universal approach that helps people identify historical, local, non- or cross-distro, or organisational builds, so that Build-IDs become useful outside the current "static" setup and retain information over time and across upgrades.

Build-ID background

Build-ids have been supported since Fedora 8. https://fedoraproject.org/wiki/Releases/FeatureBuildId

Build-IDs are unique identifiers of "builds". A build is an executable, a shared library, the kernel, a module, etc. You can also find the build-id in a running process, a core file or a separate debuginfo file.

Getting Build-IDs

A simple way to get the build-id(s) is through eu-unstrip (part of elfutils).

  • build-id from an executable, shared library or separate debuginfo file:
$ eu-unstrip -n -e <exec|.sharedlib|.debug>
  • build-ids of an executable and all shared libraries from a core file:
$ eu-unstrip -n --core <corefile>
  • build-ids of an executable and all shared libraries of a running process:
$ eu-unstrip -n --pid <pid>
  • build-id of the running kernel and all loaded modules:
$ eu-unstrip -n -k
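
The `-n` listings are line-oriented, so a tool can pick the build-ids out with a few lines of code. The following is a minimal sketch, assuming each output line has the whitespace-separated form `START+SIZE BUILDID@ADDR FILE DEBUGFILE MODULENAME`, with `-` standing for a missing field. The sample line and the names `Module` and `parse_unstrip_line` are illustrative and not taken from a real run.

```python
from typing import NamedTuple, Optional


class Module(NamedTuple):
    build_id: Optional[str]
    file: Optional[str]
    debug_file: Optional[str]
    name: Optional[str]


def _field(fields, index):
    """Return fields[index], or None if it is absent or the '-' placeholder."""
    if index >= len(fields) or fields[index] == "-":
        return None
    return fields[index]


def parse_unstrip_line(line: str) -> Optional[Module]:
    """Parse one line of `eu-unstrip -n` output (assumed layout, see above)."""
    fields = line.strip().split(None, 4)
    if len(fields) < 2:
        return None
    build_id = _field(fields, 1)
    if build_id is not None:
        # Drop the "@ADDR" suffix that follows the hex build-id, if present.
        build_id = build_id.split("@", 1)[0]
    return Module(build_id, _field(fields, 2), _field(fields, 3), _field(fields, 4))


# Illustrative input only; addresses and sizes are made up.
sample = "0x400000+0xd8000 c7a002ba1eb1dbc7c609d2e5fb9a57f10861dbdd@0x400284 /bin/bash - bash"
print(parse_unstrip_line(sample))
```

In practice the lines would come from running one of the commands above and reading its standard output.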

Current conventions and usage

The convention currently used by Fedora (and adopted by, for example, GDB to find files) is to ship symbolic links in the debuginfo package that point to the ELF file and to the debuginfo file. The links live under /usr/lib/debug/.build-id/XX/YYYY (where XX are the first two hex digits of the build-id and YYYY are all the others):

/usr/lib/debug/.build-id/c7/a002ba1eb1dbc7c609d2e5fb9a57f10861dbdd
 -> ../../../../../bin/bash
/usr/lib/debug/.build-id/c7/a002ba1eb1dbc7c609d2e5fb9a57f10861dbdd.debug
 -> ../../bin/bash.debug
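
The XX/YYYY split described above can be sketched in a few lines. This is a minimal illustration, assuming the default root `/usr/lib/debug`; the function name `build_id_links` is hypothetical and not part of any existing tool.

```python
import posixpath

HEX_DIGITS = set("0123456789abcdef")


def build_id_links(build_id: str, debug_root: str = "/usr/lib/debug"):
    """Return the (binary link, debuginfo link) paths for a hex build-id."""
    build_id = build_id.strip().lower()
    if len(build_id) < 3 or not set(build_id) <= HEX_DIGITS:
        raise ValueError("not a usable hex build-id: %r" % build_id)
    # XX = first two hex digits, YYYY = all the remaining digits.
    base = posixpath.join(debug_root, ".build-id", build_id[:2], build_id[2:])
    return base, base + ".debug"


binary_link, debug_link = build_id_links("c7a002ba1eb1dbc7c609d2e5fb9a57f10861dbdd")
print(binary_link)
print(debug_link)
```

Whether those paths exist depends on the corresponding debuginfo package being installed.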

This makes it extremely easy to find the executable or shared library and the corresponding debuginfo given just the build-id, provided they are installed on your system.

Since these links are files included in the rpm package, they also make it easy to find the package that provided the executable or library corresponding to a build-id (gdb and systemtap will suggest the right debuginfo package to install based on the build-id they found for the program you wanted to introspect). You can ask yum to install it, or use repoquery to figure out the details of the package and binary involved.

But this only works for what is in the current, up-to-date installed repositories. There is no support for historical information, local builds, cross-distro lookups, etc. Extending the usefulness of build-ids is what this idea is about.

How do we scale this up/down? The actual Universal Build-IDs idea

The goal is that when you get a build-id for something (anything really: an old executable, a core file made once but never fully investigated, a currently running process that needs to be introspected but whose libraries have already been upgraded on disk), you can map it to the original developer, "creator", package, distributor, executable, sources, debuginfo files, etc.

  • Up in Fedora: what about getting "historical" mappings?
  • Up towards other distributions (packagekit?).
  • Up towards a general build-id mapping universe (build-id.org).
    • Generic registration, querying and mapping of build-ids.
  • Down towards a local database for a lone developer.
  • Or a local shop that builds upon an existing distro but also has (internal) apps in its organization.
  • Down to totally disorganized "installs" where people move executables around all the time (inotify/updatedb).
  • How do we "proxy" this information between the different layers, so that tools have one query mechanism that works for any build-id they happen to come across?
  • Tie-in to packagekit, abrt, debuginfo-fs?
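
One possible shape for the "one query mechanism" question is a chain of lookup layers that is tried from the most local to the most global. The sketch below is only an illustration of that idea, not a worked-out design: the names `dict_resolver` and `resolve`, the record fields and all table contents are hypothetical, and in-memory dictionaries stand in for whatever a real local database, distro repository or public registry would provide.

```python
from typing import Callable, Dict, Iterable, Optional

# A layer takes a build-id and returns a record describing it, or None.
Record = Dict[str, str]
Resolver = Callable[[str], Optional[Record]]


def dict_resolver(table: Dict[str, Record]) -> Resolver:
    """Wrap an in-memory table as a lookup layer (stand-in for a real backend)."""
    def lookup(build_id: str) -> Optional[Record]:
        return table.get(build_id)
    return lookup


def resolve(build_id: str, layers: Iterable[Resolver]) -> Optional[Record]:
    """Ask each layer in order and return the first answer found."""
    for layer in layers:
        record = layer(build_id)
        if record is not None:
            return record
    return None


# Hypothetical layers, ordered from most local to most global.
local_builds = dict_resolver({"aa11": {"origin": "local", "file": "/home/dev/app"}})
distro = dict_resolver({
    "c7a002ba1eb1dbc7c609d2e5fb9a57f10861dbdd": {"origin": "distro", "file": "/bin/bash"},
})
layers = [local_builds, distro]

print(resolve("c7a002ba1eb1dbc7c609d2e5fb9a57f10861dbdd", layers))
print(resolve("ffff", layers))
```

Ordering the layers this way lets a local or organisational build shadow a wider registry, while a tool still calls a single function for any build-id it encounters.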