From Fedora Project Wiki

Revision as of 19:21, 26 November 2009

Understanding the (Proposed) Change to DSO Linking

Basics

By default, ld does not require you to link explicitly against objects that are listed as dependencies of another linked object. This is dangerous: if the other object is ever changed so that it no longer depends on the object your program relies on, your program will break without any change to your code.

A concrete example:

libxml2.so has:

 NEEDED            Shared library: [libdl.so.2]
 NEEDED            Shared library: [libz.so.1]

Under the old system, a program that links with libxml2 and uses dlopen might not link with libdl explicitly, and a program that links with libxml2 and uses gzopen might not link with libz explicitly. While these programs will work, they will break if libxml2 is ever changed to omit the dependency on libdl/libz.

What's the difference?

For example (courtesy Roland McGrath):

 ==> foo1.c <==
 #include <stdio.h>
 extern int foo ();
 int
 main ()
 {
   printf ("%d\n", foo ());
 }
 ==> foo2.c <==
 extern int foo ();
 int bar () { return foo (); }
 ==> foo3.c <==
 int foo () { return 0; }


gcc -g -fPIC -c foo1.c foo2.c foo3.c

gcc -shared -o foo3.so foo3.o

gcc -shared -o foo2.so foo2.o foo3.so


Current

(This Succeeds)

gcc -o foo1 foo1.o foo2.so -Wl,--rpath-link=.


(This Fails)

gcc -Wl,--no-add-needed -o foo1 foo1.o foo2.so -Wl,--rpath-link=.

/usr/bin/ld: ./foo3.so: invalid DSO for symbol `foo' definition

./foo3.so: could not read symbols: Bad value

collect2: ld returned 1 exit status

[Exit 1]


Proposed

Both commands fail with the same result: an error message prompting you to add foo3.so to the link command.

(This Fails)

gcc -o foo1 foo1.o foo2.so -Wl,--rpath-link=.

/usr/bin/ld: foo1.o: undefined reference to symbol 'foo'

/usr/bin/ld: note: 'foo' is defined in DSO ./foo3.so so try adding it to the linker command line

(This Fails)

gcc -Wl,--no-add-needed -o foo1 foo1.o foo2.so -Wl,--rpath-link=.

/usr/bin/ld: foo1.o: undefined reference to symbol 'foo'

/usr/bin/ld: note: 'foo' is defined in DSO ./foo3.so so try adding it to the linker command line


So, the difference is whether you can refer to a symbol that's in a DSO that you didn't list explicitly in your link line, but that is a DT_NEEDED dependency of one of those (or recursively of those, I think).

The big difference is that with the proposed change in place, ld will no longer skip linking needed libraries by default. The current default behaviour lets ld skip linking with a library if it is listed as needed by another library that the program uses. In abstract terms, if libA is needed by libB and your program requires both libA and libB, your program may link only to libB. If a later version of libB no longer lists libA as a needed library, a recompilation will mysteriously break.


What do I do?

The error message will prompt you to explicitly link to the DSO that you need. From the foo example, adding foo3.so will get rid of the error:

gcc -o foo1 foo1.o foo2.so foo3.so -Wl,--rpath-link=.