From Fedora Project Wiki

Revision as of 17:29, 27 March 2013

Summary

Note: EPEL Differences
As of rpm-4.9 (Fedora 15), rpm has a standard method to enable filtering. This page documents that. EPEL5 and 6 do not have a recent enough version of rpm to follow these guidelines. See [EPEL:Packaging_Autoprovides_and_Requires_Filtering] if your package is to be built there as well.

The auto requires and provides system contained in RPM is quite useful; however, it sometimes picks up "private" package capabilities that shouldn't be advertised as global, things that are "just wrong", or things prohibited by policy (e.g. deps from inside %{_docdir}).

For example:

  • Various "plugin" packages (e.g. Pidgin, Perl, Apache, KDE) are marked as "providing" private shared libraries outside the system path.
  • Files in %{_docdir} are routinely scanned, and can trigger prov/req when this is explicitly forbidden by policy.


This Guideline describes how to filter provides and requires on Fedora.

  • MUST: Packages must not provide RPM dependency information when that information is not global in nature or is otherwise handled (e.g. through a virtual provides system). For example, a plugin package containing a binary shared library must not "provide" that library unless it is accessible through the system library paths.
  • MUST: When filtering automatically generated RPM dependency information, the filtering system implemented by Fedora must be used, except where there is a compelling reason to deviate from it.

Usage

Warning: These macros are not cumulative
With the macros defined here, the last definition is the one that is used. They replace whatever was defined before. This is a change from the old macros which added to the filter instead. Be careful not to lose parts of your macro definition when porting from the old style to the new ones.
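Since a later definition replaces an earlier one, several patterns have to be merged into one alternation inside a single definition. The following Python sketch only illustrates that behaviour: the dict is a stand-in for rpm's macro table, `define_macro` and `combine` are hypothetical helper names, the library names are made up, and Python's `re` module is used in place of the POSIX regex engine (the two agree for patterns this simple).

```python
import re

macros = {}  # hypothetical stand-in for rpm's macro table

def define_macro(name, value):
    """Model '%global name value': a later definition replaces the earlier one."""
    macros[name] = value

def combine(patterns):
    """Join several patterns into one anchored alternation for a single definition."""
    return "^(" + "|".join(patterns) + ")$"

# Two separate definitions: only the second one survives.
define_macro("__provides_exclude", r"^libfoo-plugin\.so.*$")
define_macro("__provides_exclude", r"^libbar-plugin\.so.*$")
lost = re.search(macros["__provides_exclude"], "libfoo-plugin.so()(64bit)") is None

# One combined definition keeps both filters.
define_macro(
    "__provides_exclude",
    combine([r"libfoo-plugin\.so.*", r"libbar-plugin\.so.*"]),
)
kept = all(
    re.search(macros["__provides_exclude"], dep) is not None
    for dep in ["libfoo-plugin.so()(64bit)", "libbar-plugin.so()(64bit)"]
)
print(lost, kept)
```

When porting from the old macros, the same idea applies: collect every pattern first, then write one definition.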

Location of macro invocation

It's strongly recommended that these filtering macros be invoked before %description, but after any other definitions. This will keep them in a consistent place across packages, and help prevent them from being mixed up with other sections.

Regular Expression Variant

These filters use regular expressions. The variant used follows the POSIX.2 regular expression standard (see regex(7)). In this variant, the literal characters ^.[$()|*+?{ need to be backslash escaped. Because rpm interprets backslashes as part of its parsing of spec files, you will need to use a double backslash for any escapes. A literal backslash ("\") is represented by four backslashes.

The regex engine is only passed the final string, after rpm macro expansion, so any data supplied through rpm macros must already be escaped. For instance, if you generate a list of files to match in a macro and that list contains libfoo.so, you'll have to write libfoo\\.so to escape the ".". Example:

%global to_exclude libfoo\\.so
%global __requires_exclude_from ^%{_datadir}/%{to_exclude}$
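To see why the escape matters, the sketch below compares the escaped and unescaped forms. It rests on two assumptions: `after_spec_parsing` is a hypothetical, deliberately simplified model in which each doubled backslash in the spec file reaches the regex engine as a single backslash, and Python's `re` stands in for the POSIX engine.

```python
import re

def after_spec_parsing(text):
    """Simplified model (an assumption): each doubled backslash becomes one."""
    return text.replace("\\\\", "\\")

def matches(pattern, name):
    """Anchored match, mirroring the ^...$ style used in the examples."""
    return re.search("^" + pattern + "$", name) is not None

escaped = after_spec_parsing(r"libfoo\\.so")  # as written in the spec file
unescaped = after_spec_parsing("libfoo.so")   # dot left unescaped

print(matches(escaped, "libfoo.so"))     # the intended file name
print(matches(escaped, "libfooXso"))     # rejected: the dot is literal
print(matches(unescaped, "libfooXso"))   # accepted: the dot matches any character
```

An unescaped dot rarely breaks a build, but it can silently filter names that were never meant to be excluded.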

Preventing files/directories from being scanned for deps (pre-scan filtering)

The macros %__requires_exclude_from and %__provides_exclude_from can be defined in a spec file to keep the dependency generator from scanning specific files or directories for deps. These macros should be defined with a regular expression that matches all of the directories or files. For instance:

# Do not check any files in docdir for requires
%global __requires_exclude_from ^%{_docdir}/.*$

# Do not check .so files in the python_sitelib directory
# or any files in the application's directory for provides
%global __provides_exclude_from ^(%{python_sitelib}/.*\\.so|%{_datadir}/myapp/.*)$

Note that %__provides_exclude_from replaces the %filter_provides_in macro from the old filtering guidelines, but it does not do the same thing. In particular:

  • The old macro could be invoked multiple times. This one will only use the regex defined last.
  • The old guidelines advised against anchoring the beginning of the regex (using "^"). Anchoring is recommended with the new macro, as it doesn't suffer from the compatibility problems of the old one.
  • With the old macro it was common to specify a directory name to match everything in a directory recursively. With the new macro you may need to specify .* because you should be anchoring your regular expressions.
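The effect of anchoring can be checked outside of rpm. In this sketch `DOCDIR` is an assumed expansion of %{_docdir}, the file path is made up, and `excluded_from_scan` is a hypothetical helper that uses an unanchored search the way regexec(3) does.

```python
import re

# Assumed expansion, for illustration only; the real value comes from rpm.
DOCDIR = "/usr/share/doc"

def excluded_from_scan(pattern, path):
    """True if the file would be skipped by the dependency generator."""
    return re.search(pattern, path) is not None

path = "/usr/share/doc/foo-1.0/examples/demo.py"

dir_only_anchored = "^" + DOCDIR + "$"   # matches only the directory name itself
with_wildcard = "^" + DOCDIR + "/.*$"    # matches every file below the directory

print(excluded_from_scan(dir_only_anchored, path))
print(excluded_from_scan(with_wildcard, path))
```

A bare directory name that worked with the old macro stops matching once it is anchored at both ends, which is why the trailing /.* is needed.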

Filtering provides and requires after scanning

In addition to preventing rpm from scanning files and directories for automatic dependency generation, you can also tell rpm to discard a discovered dependency before it is recorded in the rpm metadata. Use %__requires_exclude and %__provides_exclude for this. These macros should be defined as regular expressions. If an entry created by rpm's automatic dependency generator matches the regular expression, it is filtered out of the requires or provides. For example:

# This might be useful if plugins are being picked up by the dependency generator
%global __provides_exclude ^libfoo-plugin\\.so.*$

# Something like this could be used to prevent excess deps from an
# example python script in %doc
%global __requires_exclude ^/usr/bin/python$
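The two patterns above can be tried against sample dependency lists. The lists here are invented for illustration, `filter_deps` is a hypothetical helper, and Python's `re` again stands in for the POSIX engine; the patterns are written with single backslashes because they are not passing through spec-file parsing.

```python
import re

def filter_deps(deps, exclude_pattern):
    """Drop every generated dependency that the exclude regex matches."""
    return [d for d in deps if re.search(exclude_pattern, d) is None]

# Hypothetical generator output for an imaginary package.
provides = ["libfoo-plugin.so()(64bit)", "foo = 1.0-1.fc19", "foo(x86-64) = 1.0-1.fc19"]
requires = ["/usr/bin/python", "/bin/sh", "libc.so.6()(64bit)"]

kept_provides = filter_deps(provides, r"^libfoo-plugin\.so.*$")
kept_requires = filter_deps(requires, r"^/usr/bin/python$")

print(kept_provides)
print(kept_requires)
```

Because the requires pattern ends in $, a dependency such as /usr/bin/python3 would not be filtered by it.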

These macros serve a similar purpose to the old %filter_from_provides and %filter_from_requires macros but have a different implementation. In particular, the old macros took sed expressions, whereas the new ones need a regular expression.

Simplified macros for common cases

In some cases, the filtering of extraneous Provides: is fairly generic to all packages which provide similar things. There are simple macros that set up filters correctly for those cases so that you can do the filtering with one line. If you need to filter a bit more than the simple macro provides, you still have the option to use the macros listed above.

Perl

Perl extension modules can be filtered using this macro:

%{?perl_default_filter}

This is equivalent to:

%global __provides_exclude_from %{perl_vendorarch}/auto/.*\\.so$|%{perl_archlib}/.*\\.so$|%{_docdir}
%global __requires_exclude_from %{_docdir}
%global __provides_exclude perl\\(VMS|perl\\(Win32|perl\\(DB\\)|perl\\(UNIVERSAL\\)
%global __requires_exclude perl\\(VMS|perl\\(Win32

If you want to use both %perl_default_filter and customized %__provides_exclude* or %__requires_exclude* macros, be sure to use %perl_default_filter first and then customize it (%perl_default_filter overwrites what was previously set in the %__provides_exclude* and %__requires_exclude* macros). Also be sure your customizations include the original regex set up by %perl_default_filter. For example:

%{?perl_default_filter}
%global __requires_exclude perl\\(VMS|perl\\(Win32|my_additional_pattern

Note: Copy and paste strategy is a recognized tradeoff
This copy and paste is a tradeoff. If the perl macro changes in simple ways (adding an additional pattern to the list of exclusions), you would need to update your spec file if you want to pick those up. But the copy and paste protects you from more drastic changes to the perl macros that may not work with how you attempt to add a new pattern. It also reduces the complexity of trying to anticipate the errors that could be introduced if the perl macros change (see https://fedorahosted.org/fpc/ticket/76#comment:5). Reducing the complexity reduces errors due to misunderstanding or mistyping the means of handling those potential errors (see http://lists.rpm.org/pipermail/rpm-list/2013-January/001359.html).
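A quick way to confirm that a customized pattern still covers the defaults is to test it against a few dependency names. In this sketch the sample names are invented, `is_filtered` is a hypothetical helper, `my_additional_pattern` is the placeholder from the example above, and the patterns use single backslashes because they bypass spec-file parsing.

```python
import re

DEFAULT = r"perl\(VMS|perl\(Win32"            # requires pattern quoted above
CUSTOM = DEFAULT + r"|my_additional_pattern"  # defaults plus the new pattern
ONLY_NEW = r"my_additional_pattern"           # what happens if the defaults are dropped

def is_filtered(pattern, dep):
    """True if the dependency would be removed by the exclude regex."""
    return re.search(pattern, dep) is not None

samples = ["perl(VMS::Filespec)", "perl(Win32::API)", "my_additional_pattern", "perl(strict)"]
results = {dep: is_filtered(CUSTOM, dep) for dep in samples}
forgotten = is_filtered(ONLY_NEW, "perl(VMS::Filespec)")

print(results)
print(forgotten)
```

Defining only the new pattern would quietly re-enable the dependencies that the default filter removes.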

Examples

Pidgin plugin package

On an x86_64 machine, the pidgin-libnotify package provides pidgin-libnotify.so()(64bit), which it shouldn't, as this library is not inside the paths searched by the system for libraries. It's a private, not global, "provides" and as such must not be exposed globally by RPM.

To filter this out, we could use:

%global __provides_exclude_from ^%{_libdir}/purple-2/.*\\.so$

Private Libraries

At this time, filtering of private libraries is non-trivial. This is because the symbols you want to filter from the private libraries are usually required by the public applications that the package ships. In order to filter, you need to find out what symbols rpm is extracting for the private library and then remove those in both %__provides_exclude and %__requires_exclude.

As an example, pretend you are packaging an application foo that creates %{_libdir}/foo/libprivate.so that you want to filter and %{_bindir}/foobar that requires that private library. You could:

  1. First build the rpm: $ rpmbuild -ba foo.spec
  2. Then determine what provides rpm generated for the package:
    $ rpm -qp --provides foo-1.0-1.fc19.x86_64.rpm
    libprivate.so()(64bit)
    foo = 1.0-1.fc19
    foo(x86-64) = 1.0-1.fc19
    
  3. See that "libprivate.so()(64bit)" appears to be the only provide that rpm extracted from the private library. Note that on 32 bit, the provides will be libprivate.so, so your regex needs to capture both.
  4. Add the excludes to the spec file for both requires and provides:
    [...]
    %global _privatelibs libprivate[.]so.*
    %global __provides_exclude ^(%{_privatelibs})$
    %global __requires_exclude ^(%{_privatelibs})$
    [...]
    

You can take a look at a more complex example on the mailing list (http://lists.fedoraproject.org/pipermail/devel/2012-June/169190.html). This can be a pain to maintain if upstream changes the names of its private libraries, but it is the only way to deal with this at present. There may be a better means in the future (see http://lists.rpm.org/pipermail/rpm-maint/2013-January/003349.html), but there are no solid plans yet for when it might be implemented.
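The regular expression from step 4 can be checked against the provides listed in step 2, plus the 32-bit form mentioned in step 3. This is only a sketch: Python's `re` stands in for the POSIX engine, and the variable names are chosen for illustration.

```python
import re

PRIVATELIBS = r"libprivate[.]so.*"   # value of %_privatelibs in step 4
EXCLUDE = "^(" + PRIVATELIBS + ")$"  # as expanded into the exclude macros

deps = [
    "libprivate.so()(64bit)",    # 64-bit provide from step 2
    "libprivate.so",             # 32-bit form mentioned in step 3
    "foo = 1.0-1.fc19",
    "foo(x86-64) = 1.0-1.fc19",
]

filtered = [d for d in deps if re.search(EXCLUDE, d)]
kept = [d for d in deps if not re.search(EXCLUDE, d)]

print(filtered)
print(kept)
```

Both forms of the private library are filtered, while the package's own name-based provides are left alone.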

Arch-specific extensions to scripting languages

For example, to ensure an arch-specific perl-* package won't provide or require things that it shouldn't, we could use an invocation such as:

# we don't want to provide private Perl extension libs
%{?perl_default_filter}

A recipe for python:

# we don't want to provide private python extension libs in either the python2 or python3 dirs
%global __provides_exclude_from ^(%{python_sitearch}|%{python3_sitearch})/.*\\.so$

%_docdir filtering

By policy, nothing under %_docdir is allowed to either "provide" or "require" anything. We can prevent this from happening by preventing anything under %_docdir from being scanned:

# we don't want to either provide or require anything from _docdir, per policy
%global __provides_exclude_from ^%{_docdir}/.*$
%global __requires_exclude_from ^%{_docdir}/.*$

Additional Information

Additional information about rpm-4.9's dependency generator can be found here: http://rpm.org/wiki/PackagerDocs/DependencyGenerator