User:Tibbs/AutoProvidesAndRequiresFiltering

Summary
Depending on the release, RPM provides various mechanisms for filtering auto-generated requires and provides; this guideline describes how Fedora uses those mechanisms.

These guidelines are relevant:


 * MUST: Packages must not provide RPM dependency information when that information is not global in nature, or is otherwise handled (e.g. through a virtual provides system). For example, a plugin package containing a binary shared library must not "provide" that library unless it is accessible through the system library paths.
 * MUST: When filtering automatically generated RPM dependency information, the filtering system implemented by Fedora must be used, except where there is a compelling reason to deviate from it.

However, the mechanism has limitations that can cause problems worse than the excess dependencies, so Fedora 14 and EPEL6 have an additional guideline, which overrides the two above:
 * MUST NOT: For the Fedora 14 and EPEL6 method below, if the package does not fall into one of the categories specified in the Usage Section then it must not filter the provides at this time.

Rationale
The auto requires and provides system contained in RPM is quite useful; however, it often picks up "private" package capabilities that shouldn't be advertised as global, things that are "just wrong", or things prohibited by policy (e.g. deps from inside %{_docdir}).

For example:


 * Various "plugin" packages (e.g. Pidgin, Perl, Apache, KDE) are marked as "providing" private shared libraries outside the system path.
 * Files in %{_docdir} are routinely scanned, and can trigger provides and requires even though this is explicitly forbidden by policy.

As it stands, filtering these auto-generated requires and provides is difficult and messy at best, and horribly deep magic in many cases, with little guidance on how to do it. This feature aims to make the following tasks easy:


 * preventing files/directories from being scanned for requires (pre-scan filtering)
 * preventing files/directories from being scanned for provides (pre-scan filtering)
 * removing items from the requires stream (post-scan filtering)
 * removing items from the provides stream (post-scan filtering)

There are three different filtering mechanisms available:
 * The recommended mechanism, available beginning with rpm 4.9.0 (Fedora 15 and newer).
 * A version with some drawbacks, needed for Fedora 14 and EPEL 6.
 * An old method needed for EPEL 4 and 5, which is not covered by this document. See EPEL:Packaging.

Usage
These filtering macros can be used with any package. The only requirement is that the distribution have RPM 4.9.0 or later; currently this limits this method to Fedora 15 and later. Note that unlike the Fedora 14 and earlier method, you do not need to make use of the %filter_setup macro, but you must use %define to set the macros to the desired values.

Location of macro invocation
It's strongly recommended that these filtering macros be defined before %description, but after any other definitions. This will keep them in a consistent place across packages, and help prevent them from being mixed up with other sections.

Preventing files/directories from being scanned for provides (pre-scan filtering)
The %__provides_exclude_from macro is used to specify a regular expression matching files or directories that should not be scanned for any "provides" information. It can only be defined once, so to match multiple locations you need to construct a compound pattern. Note that the buildroot (i.e. %{buildroot} or $RPM_BUILD_ROOT) is stripped from the pathname before comparison.


 * XXX What regexp language is in use?
 * XXX Is there any means of or need to escape characters from macros before using them in these patterns?
 * XXX Is "macro" the proper term for these things?
 * XXX Include example of a compound pattern. Depends on the regexp language in use.
 * XXX Can the perl_default_filter bit be converted?
 * XXX %define or %global to set these?
 * XXX Is it possible to cleanly add a pattern to one of these macros, so, for example, perl_default_filter could tack on an additional pattern and not simply override whatever the user had set.

We can filter by regex: %define __provides_exclude_from %{perl_vendorarch}/.*\.so$

Or by anything matching, say, a directory: %define __provides_exclude_from %{_docdir}
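
As a rough illustration of the behaviour described above, the following Python sketch mimics pre-scan filtering: it strips the buildroot from each path and then tests the remainder against a single compound pattern. This is only a model, not RPM's implementation. Python's re module stands in for whatever regular expression engine RPM uses, the helper name is_excluded and all paths are made up for the example, and the macros are shown already expanded.

```python
import re

def is_excluded(path, buildroot, exclude_from):
    """Return True if `path` would be skipped by the pre-scan filter (model only)."""
    # The buildroot prefix is stripped before the comparison.
    if path.startswith(buildroot):
        path = path[len(buildroot):]
    # Assumption: the pattern may match anywhere in the path (unanchored search).
    return re.search(exclude_from, path) is not None

# Hypothetical buildroot and expanded macro values.
BUILDROOT = "/builddir/build/BUILDROOT/perl-Foo-1.0-1.x86_64"
# One compound pattern covering two locations, since the macro holds a single regex.
EXCLUDE_FROM = r"^(/usr/lib64/perl5/vendor_perl/.*\.so|/usr/share/doc/.*)$"

paths = [
    BUILDROOT + "/usr/lib64/perl5/vendor_perl/auto/Foo/Foo.so",
    BUILDROOT + "/usr/share/doc/perl-Foo-1.0/examples/demo.pl",
    BUILDROOT + "/usr/lib64/libfoo.so.1",
]
# Only paths that are NOT excluded would be scanned for provides.
scanned = [p for p in paths if not is_excluded(p, BUILDROOT, EXCLUDE_FROM)]
print(scanned)
```

In this model only the library under /usr/lib64 is left to be scanned; the private Perl extension and the file under the documentation directory are both skipped by the one compound pattern.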

Preventing files/directories from being scanned for requires (pre-scan filtering)
The %__requires_exclude_from macro is used to specify a regular expression matching files or directories that should not be scanned for any "requires" information; it does for requires what the %__provides_exclude_from macro does for provides and is defined in the same fashion.

Removing items from the provides stream (post-scan filtering)
Post-scan provides filtering is specified with the %__provides_exclude macro. Simply set this macro to a regular expression and all matching "provides" will be removed.

For example, if the auto-prov system generates an incorrect provide, we can filter it: %define __provides_exclude bad-provide

Since this macro can only be defined a single time, if multiple provides must be filtered, you must construct a pattern which matches them all: %define __provides_exclude (bad-provide|another-erroneous-provide)
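
The effect of a compound pattern can be sketched in Python. This is a model under stated assumptions rather than RPM's own code: it assumes the pattern is applied as an unanchored, extended-style regular expression to each generated provide, and the helper filter_provides and the sample provides are invented for the example. Under that assumption, characters such as parentheses would need escaping to be matched literally.

```python
import re

def filter_provides(provides, exclude):
    """Drop every provide matched by the `exclude` pattern (model only)."""
    # Assumption: an unanchored search, so a pattern can match a substring.
    return [p for p in provides if re.search(exclude, p) is None]

# Invented sample of auto-generated provides.
provides = [
    "bad-provide",
    "another-erroneous-provide",
    "libfoo.so.1()(64bit)",
    "perl(DB)",
]

# A compound pattern removes both unwanted entries in one definition.
print(filter_provides(provides, r"(bad-provide|another-erroneous-provide)"))

# Unescaped parentheses form a group, so this pattern means "perlDB" and removes nothing.
print(filter_provides(provides, r"perl(DB)"))

# Escaping the parentheses matches the literal text "perl(DB)".
print(filter_provides(provides, r"perl\(DB\)"))
```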

Removing items from the requires stream (post-scan filtering)
The %__requires_exclude macro is used to filter "requires"; it does for requires what the %__provides_exclude macro does for provides and is invoked in the same fashion.

Usage
The Fedora 14 and EPEL6 filtering macros MUST only be used with packages which meet one of the following criteria:
 * Noarch packages
 * Architecture-specific packages with no binaries in $PATH (e.g. /bin, /usr/bin, /sbin, /usr/sbin) or libexecdir and no system libs in libdir. This applies to all of the subpackages generated from the spec file.

They are not permitted in any other cases, because the macros interfere with the "coloring" of elf32/64 executables done internally by RPM to support multilib installs.
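
The criteria above can be read as a simple check over the file lists of all subpackages. The Python sketch below is one possible reading, not an official tool: the function name may_use_filter_macros, the default directory values, and the rule that a "system lib" is any shared object sitting directly in libdir are all assumptions made for the example.

```python
import posixpath

# Typical $PATH directories (an assumption for this sketch).
PATH_DIRS = ("/bin", "/usr/bin", "/sbin", "/usr/sbin")

def may_use_filter_macros(is_noarch, files,
                          libdir="/usr/lib64", libexecdir="/usr/libexec"):
    """Rough check of the usage criteria over the files of ALL subpackages."""
    # Noarch packages always qualify.
    if is_noarch:
        return True
    for path in files:
        parent = posixpath.dirname(path)
        # Binaries in $PATH or anywhere under libexecdir rule the package out.
        if parent in PATH_DIRS or path.startswith(libexecdir + "/"):
            return False
        # Assumption: a "system lib" is a shared object directly in libdir.
        if parent == libdir and ".so" in posixpath.basename(path):
            return False
    return True

# A plugin that only ships a private library in a subdirectory of libdir qualifies.
print(may_use_filter_macros(False, ["/usr/lib64/purple-2/libnotify.so"]))
# A package that ships a program in /usr/bin does not.
print(may_use_filter_macros(False, ["/usr/bin/foo"]))
```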

Location of macro invocation
It's strongly recommended that these filtering macros be invoked before %description, but after any other definitions. This will keep them in a consistent place across packages, and help prevent them from being mixed up with other sections.

Preventing files/directories from being scanned for provides (pre-scan filtering)
The %filter_provides_in macro is used to define the files or directories that should not be scanned for any "provides" information. This macro may be safely invoked multiple times, and can handle regular expressions. The -P flag can be passed to specify that a PCRE is being used.

We can filter by regex:

%filter_provides_in %{perl_vendorarch}/.*\.so$
%filter_provides_in -P %{perl_archlib}/(?!CORE/libperl).*\.so$

Or by anything matching, say, a directory: %filter_provides_in %{_docdir}
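
The -P pattern above relies on a negative lookahead, which is easy to misread. The Python sketch below shows the intent, using the fact that Python's re module supports the same (?!...) syntax. It is an illustration rather than the macro's implementation, and /usr/lib64/perl5 is only an assumed expansion of %{perl_archlib}.

```python
import re

# Assumed expansion of %{perl_archlib}; the real value depends on the system.
PERL_ARCHLIB = "/usr/lib64/perl5"

# Matches every .so under perl_archlib except those starting with CORE/libperl.
pattern = re.compile(re.escape(PERL_ARCHLIB) + r"/(?!CORE/libperl).*\.so$")

def skipped_for_provides(path):
    """True if the path matches the pattern and so would not be scanned."""
    return pattern.search(path) is not None

# A private extension module matches, so it would be skipped.
print(skipped_for_provides("/usr/lib64/perl5/auto/POSIX/POSIX.so"))
# libperl.so is protected by the lookahead, so it would still be scanned.
print(skipped_for_provides("/usr/lib64/perl5/CORE/libperl.so"))
```

In other words, the lookahead carves an exception out of an otherwise broad pattern: everything ending in .so under the directory is excluded from scanning except the libperl library itself.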

Preventing files/directories from being scanned for requires (pre-scan filtering)
The %filter_requires_in macro is used to define the files or directories that should not be scanned for any "requires" information; it does for requires what the %filter_provides_in macro does for provides and is invoked in the same fashion.

Removing items from the provides stream (post-scan filtering)
Post-scan provides filtering is invoked through the %filter_from_provides macro. This macro can be fed a sed expression that is applied to the stream of auto-found provides.

For example, if the auto-prov system generates an incorrect provide, we can filter it:

%filter_from_provides /bad-provide/d

Note that the provide to remove should always be specified as a regular expression, as in /bad-provide/d above.

Removing items from the requires stream (post-scan filtering)
The %filter_from_requires macro is used to filter "requires"; it does for requires what the %filter_from_provides macro does for provides and is invoked in the same fashion.

General filter setup
The %filter_setup macro must be invoked after defining any specific overrides; this macro does all the heavy lifting of implementing the filtering desired:

# ... filtering defines here
%filter_setup

These macros were not defined in EPEL5. Packagers who want to share one spec file between Fedora and EPEL need to make their use of the macros conditional. That can be done like this:

%{?filter_setup:
%filter_provides_in %{python_sitearch}.*\.so$
%filter_setup
}

Simplified macros for common cases
In some cases, the filtering of extraneous provides is fairly generic to all packages which provide similar things. There are simple macros that set up the filters correctly for those cases, so that you can do the filtering with one line. If you need to filter a bit more than the simple macro covers, you still have the option to use the macros listed above.

Perl
Perl extension modules can be filtered using this macro:

%{?perl_default_filter}

This is equivalent to:

%filter_provides_in %{perl_vendorarch}/.*\\.so$
%filter_provides_in -P %{perl_archlib}/(?!CORE/libperl).*\\.so$
%filter_from_provides /perl(UNIVERSAL)/d; /perl(DB)/d
%filter_provides_in %{_docdir}
%filter_requires_in %{_docdir}
%filter_setup

Pidgin plugin package
On an x86_64 machine, the pidgin-libnotify package provides pidgin-libnotify.so(64bit), which it shouldn't, as this library is not inside the paths searched by the system for libraries; that is, it's a private, not global, "provides" and as such must not be exposed globally by RPM.

To filter this out, we could use:

%{?filter_setup:
%filter_provides_in %{_libdir}/purple-2/.*\.so$
%filter_setup
}

Arch-specific extensions to scripting languages
For example, to ensure that an arch-specific perl-* package won't provide or require things that it shouldn't, we could use an invocation such as:

# we don't want to provide private Perl extension libs
%{?perl_default_filter}

A recipe for python:

# we don't want to provide private python extension libs
%{?filter_setup:
%filter_provides_in %{python_sitearch}/.*\.so$
%filter_setup
}

%_docdir filtering
By policy, nothing under %_docdir is allowed to either "provide" or "require" anything. We can prevent this from happening by preventing anything under %_docdir from being scanned:

# we don't want to either provide or require anything from _docdir, per policy
%{?filter_setup:
%filter_provides_in %{_docdir}
%filter_requires_in %{_docdir}
%filter_setup
}