Archive:Bundled Library Packaging Draft

The Review Guidelines state:
MUST: Packages must NOT bundle copies of system libraries.[11]

Duplication of system libraries
A package should not include or build against a local copy of a library that exists on a system. The package should be patched to use the system libraries. This prevents old bugs and security holes from living on after the core system libraries have been fixed. Some packages may be granted an exception to this. Please see the Packaging:No Bundled Libraries page for rationale, the process for being granted an exception, and the requirements if your package is bundling.

Why no Bundled Libraries
Although you can request an exception from FESCo there are many reasons not to grant one. These are the reasons that it's painful for us to have bundled libraries in the distribution. An exception should only be granted if the value of bundling exceeds these costs.

Security

 * When a security flaw is discovered in a library and bundling is not allowed, the library can be fixed in a single package, that package rebuilt, and when users download it, all the applications that use it are immediately protected. When bundling is allowed, the distribution has to find every package that contains the library by auditing source code or running a special tool over all ELF files in all packages. Then all of those packages have to be fixed and rebuilt, and users have to download and update each of the ones installed on their system before they are protected. Bundling makes every step of a security response more work.


 * With security issues, people want to remove as much lag as they can between the announcement of a problem and a fix being available to users. When libraries are unbundled, mailing lists like vendor-sec can be used to alert distributions to problems that need patching in their packages before the announcement is made, so they can ship fixes with zero days of vulnerability. If libraries are bundled, the problem becomes how to get fixes out to all affected packages. If the distribution patches those packages, it must be careful not to leak the existence of the vulnerability before disclosure is allowed (which means being careful about whom it shares the information with and what it shares). On the other hand, if it does not patch the packages bundling the library, those packages are not protected on day zero, only afterwards.


 * When a security flaw appears, the program has to either update to a non-affected version of the library or backport a fix. This can be problematic when the library has undergone many API and code changes since the bundled version and the security patch touches many parts of the code. Backporting the fix can produce many conflicts that take time to resolve, but porting the application to the new API version can also take a lot of time.


 * We cannot implicitly trust an upstream application to stay on top of security issues in the libraries it bundles. What happens if you are not following Boost development and don't know that a security release has been made? What happens if the developer responsible for watching Boost development goes on vacation or quits your project? What happens if your application ceases active development? What happens if Boost stops active development and security fixes start originating with distro patches?

Forking
Forking is occurring. Once an application starts bundling libraries, it's easy for the project to include local patches to the library to add features that upstream doesn't have or fix bugs that upstream hasn't addressed. This has several negative effects.


 * When a security issue appears, it becomes harder to fix the application bundling the library. If you attempt to upgrade to a newer version, you have to make sure your important local modifications get ported to the new version. If you attempt to backport, you have to merge the upstream fix to your own code-base which may have conflicts with the local modifications.


 * When working with the library that comes from upstream, there is a community of people who are interested in that library to fall back on for help. When working on your own private copy that community may not be interested in helping you work on your modified sources since they don't have control or knowledge of what your modified sources do.


 * Forking dilutes one of the strengths of open-source development. Instead of a project getting stronger with more people supplying patches to help drive the project and build a bigger community, the community of people interested in it splinters, developing more and more divergent code-bases and solving the same problem over and over in different ways in different private copies of the library. Instead of everyone benefiting, everyone has to pay.

Bugfixes
Bugfixes are usually of lesser importance than security issues but share the same problem: bugs that have been fixed in the main package linger on in the bundled copies.

Old Code

 * Old versions of code linger on. If the application can bundle its own version of a library, the incentive to port to newer versions of the library is reduced. This exacerbates the security and bugfix problems. Instead of being done progressively as time goes on, porting to a newer version becomes a chore that has to be performed at the same time as addressing a security flaw. This puts time pressure on the project when the work could have been spread out over a longer period if the porting had been done all along.

Licensing
Although licensing issues can crop up in any project, projects which bundle code from different sources together are a special source of concern. They make auditing for license issues a larger project.

When a Bundled Library is Discovered Post-Review
Bundling of libraries is a serious problem. If a package that is in the distribution is discovered to have bundled libraries, we need to fix it. First, open a bug report against the package. Then add the bug to the Duplicate libraries tracker. Once that's done, if help is needed to fix the bug, ask on the mailing list.

Exceptions
Exceptions are granted on a case-by-case basis by FESCo with input from FPC. You can look in the following section for help on making a case for why an exception should be granted.

Some reasons you might be granted an exception
This section lists some reasons that might convince FESCo to grant you an exception. Exceptions are granted on a case-by-case basis, and satisfying the rationale here is not a guarantee of an exception, but it's a place to start building your case for why the package you work on is exceptional.

Kernel
If you're packaging the kernel and need to bundle a library you are likely to be granted an exception. The kernel is allowed to bundle libraries as it cannot use user space libraries.

Copylibs
The definition of a copylib is somewhat amorphous. At its most basic level, the upstream for the library intends for you to copy the source code of the library into your program, modify it to suit your needs, and then release your software with continuous, forked modifications to that source. Just because you think you're dealing with a copylib does not guarantee that you will be granted an exception. In particular, the programming practice that is common in some Java, Mono, and scripting language circles of copying external libraries that otherwise come from a separate upstream into the program's source and distributing them together is not allowed. Bundling a library whose upstream is dead and making bugfixes to the bundled copy is not allowed either. As much as possible we want to have a single copy of a library in the distribution which everyone links to.

Some of the criteria that FESCo uses to evaluate the copylib case are:


 * Does the upstream library make actual releases? If they do, then it is likely not a copylib.
 * Does upstream define what they put together as a library or as reusable code snippets that are to be modified and incorporated as source in individual packages? If the latter, it's more likely that the library is a copylib under this definition.

Modified beyond a certain extent
Modification of a library should not be the only reason given to justify a bundled copy, because two questions come up: why can't these changes go back to the upstream for the library? Why isn't this library forked and released in such a way that others can benefit from the changes as well? However, it can be one of the factors considered. To provide a solid foundation for a bundling exception you should be able to answer those two questions, for instance with an explanation of why the changes are only useful for the application that's bundling them.


 * Example: recoll bundles unac, but recoll changed the API of unac and those changes were judged to only be of use to recoll, so the bundling was allowed.
 * Counter example: rsync bundles zlib. However, the modified zlib is useful to others as the modified zlib is necessary in order to implement the rsync protocol.  In particular, the program zsync needs to have a similarly modified zlib in order to be of use.

Standard questions
You should have answers to these standard questions before seeking an exception.


 * Has the library behaviour been modified? If the library has been modified in ways that change the API or behaviour then there may be a case for copying.  Note that fixing bugs is not grounds to copy.  If the library has not been modified (i.e., it can be used verbatim in the distro) there's little chance of an exception.
 * Why haven't the changes been pushed to the upstream library? If no attempt has been made to push the changes upstream, we shouldn't be supporting people forking out of laziness.
 * Have the changes been proposed to the Fedora package maintainer for the library? In some cases it may make sense for our package to take the changes despite upstream not taking them (for instance, if upstream for the library is dead).
 * Could we make the forked version the canonical version within Fedora? For instance, if upstream for the library is dead, is the package we're working on that bundles willing to make their fork a library that others can link against?
 * Are the changes useful to consumers other than the bundling application? If so why aren't we proposing that the library be released as a fork of the upstream library?

Requirement if you bundle
 * You must note that the library has been granted an exception in a spec file comment with a link to the FESCo ticket where the exception was granted.
 * If you bundle a library, you are required to add a virtual provide to your spec file to note that you are bundling. This allows us to search for packages that may be affected by bugs or security issues in older versions of the library.  The notation should look like this:

Provides: bundled(zlib) = 1.1.14

bundled() denotes that this is a bundled library virtual provide rather than something that other packages would want to depend on. Inside the parentheses, the binary package that provides the library is listed (zlib in the example above). The version notes which version of the library was bundled. If there's been a lot of incomplete backporting of changes from newer versions of the library, it can be hard to establish what version to use here. A very general rule of thumb is to use the oldest version that seems reasonable, as the reason we're doing this is to tell when a library contains issues that have been fixed in newer upstream versions.
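
As a rough illustration of why the virtual provide is useful, the sketch below shows how a list of package provides could be searched for bundled copies of a library that predate a fixed version. It is only a sketch under stated assumptions: the package names (app-a, app-b, app-c), the sample data, the fixed version 1.2.0, and the helper functions are all hypothetical, and the version comparison is a naive numeric one rather than RPM's real comparison algorithm.

```python
import re

# Matches a virtual provide such as "bundled(zlib) = 1.1.14".
BUNDLED_RE = re.compile(r"^bundled\(([^)]+)\)\s*=\s*(\S+)$")


def parse_bundled(provide):
    """Return (library, version) for a bundled() provide, else None."""
    match = BUNDLED_RE.match(provide.strip())
    return (match.group(1), match.group(2)) if match else None


def version_key(version):
    """Naive numeric sort key. Assumption: this is NOT RPM's version comparison."""
    return tuple(int(part) for part in version.split(".") if part.isdigit())


def affected_packages(provides_by_package, library, fixed_version):
    """List packages that bundle `library` at a version older than `fixed_version`."""
    hits = []
    for package, provides in sorted(provides_by_package.items()):
        for provide in provides:
            parsed = parse_bundled(provide)
            if (
                parsed
                and parsed[0] == library
                and version_key(parsed[1]) < version_key(fixed_version)
            ):
                hits.append(package)
                break
    return hits


# Hypothetical data: package name -> list of Provides strings.
SAMPLE = {
    "app-a": ["app-a = 2.0", "bundled(zlib) = 1.1.14"],
    "app-b": ["app-b = 0.9", "bundled(zlib) = 1.2.3"],
    "app-c": ["app-c = 1.0"],
}

print(affected_packages(SAMPLE, "zlib", "1.2.0"))  # ['app-a']
```

Recording the oldest reasonable version, as suggested above, makes a search like this err on the side of flagging a package for review rather than missing it.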

Packages granted exceptions

 * The kernel bundling of zlib. Since the kernel cannot use user space libraries, this has been accepted.
 * Packages containing the following libraries are granted an exception due to these libraries being copylibs:
   * libiberty
   * egglib
   * gnulib
   * binc
 * recoll has been granted an exception for unac due to having changes that are not applicable to other applications.

Other distributions
As this is a place where we have to convince upstream that there's a problem, it's good to be able to point out that this is a problem for all distributions, not just Fedora. Here are links to other distributions' policies:


 * Debian -- http://www.debian.org/doc/debian-policy/ch-source.html#s-embeddedfiles
 * Talk given at pycon with a large section on not bundling libraries http://pycon.blip.tv/file/2072580/