
Enable spec file preprocessing

Summary

This change would enable an opt-in spec file preprocessor in Fedora infrastructure for the benefit of packagers. The preprocessor makes possible some techniques that were not available before, for example generating the changelog and release automatically from git metadata, or packing the entire dist-git repository into an rpm source tarball (effectively allowing unpacked repos to live in DistGit).

Owner

Current status

  • Targeted release: Fedora 34
  • Last updated: 2021-01-13
  • FESCo issue: #2532
  • Tracker bug: <will be assigned by the Wrangler>
  • Release notes tracker: <will be assigned by the Wrangler>

Detailed Description

Mock recently gained a new feature: the rpkg preprocessor which, if enabled, introduces an intermediate step just before srpm building. This step runs the spec file through a text preprocessing engine that comes with an existing library of macros designed specifically for generating rpm spec files from git metadata. This library is called rpkg-macros. Its macros allow packagers to have their %changelog, Release, Version, VCS tag, or even Source fields generated automatically from dist-git repository data and metadata.

The library can easily be extended in the future to support more packager use cases. A completely new library could even be developed that doesn't look at git metadata at all and instead, for example, analyses the content of an already present tarball to render the spec file based on upstream information. This doesn't mean it will happen, but the framework is generic enough to support it. There is also support for user-defined macros that are loaded on demand from a file placed alongside the package sources and maintained by the packager. This feature wouldn't be enabled by this change from the start, but it is an example of the freedom that the preprocessing framework provides.

Enabling this change should be very easy; it basically means adding:

config_opts['plugin_conf']['rpkg_preprocessor_enable'] = True

to the mock configuration of the Koji builders and using at least mock 2.7. A very minor change may also be needed in Koji regarding the spec file lookup.

Even if the change is enabled at the infrastructure level like this, the packager will still need to opt in to use the preprocessor. Opting in is done by placing an rpkg.conf file into the package top-level directory with the following content:

[rpkg]
preprocess_spec = True

Only once a packager has done this is the preprocessor actually enabled for the given package.
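
As a rough illustration of the opt-in check, the following Python sketch reads rpkg.conf content and reports whether preprocess_spec is set. The function name preprocessing_enabled is hypothetical and is not part of rpkg or mock; the real tooling may parse the file differently. The sketch only mirrors the file format shown above.

```python
import configparser


def preprocessing_enabled(conf_text: str) -> bool:
    """Return True if the given rpkg.conf text opts in to spec preprocessing.

    Hypothetical helper for illustration; not part of rpkg or mock.
    """
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    # A missing [rpkg] section or a missing option counts as "not opted in".
    return parser.getboolean("rpkg", "preprocess_spec", fallback=False)


if __name__ == "__main__":
    sample = "[rpkg]\npreprocess_spec = True\n"
    print(preprocessing_enabled(sample))
```

Treating a missing file, section, or option as "not opted in" matches the opt-in nature of the change: packages without an rpkg.conf keep their current behaviour.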

In parallel, there is ongoing work to add preprocessor support to the rpkg python library so that a packager can easily work with spec files containing the preprocessor (rpkg) macros: https://pagure.io/rpkg/pull-request/530

Also, the following pull requests were opened to add preprocessing support to the python version of spectool and to the dnf builddep command:

The macros are supported from epel6 up to the latest Fedora (the preproc, rpkg-macros, and rpm-git-tag-sort packages are needed). The spec preprocessing step in mock happens in the target chroot just before the srpm build.

The problem of automatic changelog and release generation (one of the problems that this change intends to solve) has a long history. Here are some mailing list threads that discuss the topic:

Feedback

From the associated mailing list thread:

Why not have preprocessing support directly in rpm?

It could probably be implemented directly at the rpm level, but that would make it a non-optional change at the syntax level and therefore a major rpm change. Implementing it in an upper layer allows the feature to be enabled optionally on a per-package basis by using the rpkg.conf file, which avoids breaking spec files that accidentally contain {{{ }}} tags. In the future, the rpkg.conf file can be used for purposes other than enabling preprocessing, for example to configure sources for fedpkg tag message content. Its introduction is therefore not a burden; on the contrary, it offers extra flexibility for packagers.

Does it make the work of proven packagers more difficult when doing mass spec changes?

Packagers should definitely know about the existence of spec file templates, which are recognized by the .spec.rpkg extension. That means packagers should sed and grep all files matching the *.spec* glob instead of just *.spec. If proven packagers also want to rpm-parse all Fedora spec files for more detailed analysis, we can extend the options of the existing preproc-rpmspec tool to provide any information that is normally provided by the rpmspec tool.
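
For scripts that walk a checkout, the wider glob could be applied along the following lines. This is a minimal sketch: the helper name find_spec_files is hypothetical, and narrowing the *.spec* matches down to the .spec and .spec.rpkg extensions is a choice made here to skip unrelated files such as backups, not something the tooling requires.

```python
from pathlib import Path
from typing import List


def find_spec_files(root: str) -> List[Path]:
    """Return plain spec files and .spec.rpkg templates under root, sorted.

    Hypothetical helper for illustration only.
    """
    matches = []
    for path in Path(root).rglob("*.spec*"):
        # The glob alone would also match names such as "foo.spec.orig",
        # so keep only the two extensions discussed here.
        if path.is_file() and path.name.endswith((".spec", ".spec.rpkg")):
            matches.append(path)
    return sorted(matches)


if __name__ == "__main__":
    for spec in find_spec_files("."):
        print(spec)
```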

In the future, a preproc macro could actually save work in cases where the change needs to be made in a part of the spec file that is generated automatically by the macro. In that case, we just need to change the macro in one place instead of changing all the spec files that would otherwise need the change.

Does it make the work of proven packagers and relengs more difficult when doing mass rebuilds?

If scripts are used and these scripts have already been updated to account for the presence of spec templates, then there is no additional work. If a manual update and rebuild of a package is needed and the package uses a spec template, then the work associated with bumping the release and adding a changelog entry (in case these would also be manual actions) is actually easier because one can just fedpkg tag the package, and the release and the changelog entry will be generated automatically.

However, the packager/releng should still check that the spec file template uses the preproc macros for bumping the release and for generating the changelog, because that is not automatically implied by using the preprocessor. Using a script (e.g. rpmdev-bumpspec) is therefore advisable even in this case because the script will always be able to pick the correct action.

So if scripts are used, there is no added difficulty. When making changes manually, working with spec templates is easier, provided one also quickly checks the spec file for rpkg macro usage.

What about packagers doing fedpkg local build?

This PR makes sure fedpkg local will work for packages using spec templates: https://pagure.io/rpkg/pull-request/530

Does it make things more difficult for our dist-git downstreams?

It depends on what exactly they do with spec files. If they e.g. build srpms directly from dist-git repos by using rpmbuild in a container then, yes, they will need to adopt the new tooling (preproc-rpmspec) or use fedpkg or rpkg, which effectively becomes the recommended command-line interface for working with spec files within a git repository context. If they use mock to build an srpm, or already use a higher-level tool based on the rpkg python library, then they have no issue.

Where exactly is the spec file preprocessed? Is it in the buildSRPMFromSCM mock, or on the Koji host?

When using mock (which Koji does), it is preprocessed in the target chroot, i.e. in the same environment where rpmbuild -bs is called afterwards to build an srpm.

Will it be possible to have the non-processed spec and any required supporting files included in the SRPM?

Yes, this will be possible. It should be enough to simply specify them as additional static rpm sources.

Does preprocessing relate in any way to the magic done by rdopkg dist-git <-> source-git translation?

There is some overlap in the goal (allow people to work with upstream sources when convenient) but the method to achieve it is slightly different. The approach suggested here is more declarative while the rdopkg's approach is more imperative.

Basically, with rdopkg, you do the upstream<->downstream conversion on the client and then just push the results to the server. With the preprocessing engine, you define the conversion in a spec file and then let the infrastructure (i.e. builders) do the work just before srpm is built.

NOTE: This question concerns the possibility of hosting and using unpacked repos in Fedora DistGit if spec preprocessing is enabled (i.e. it is not so much about the automatic changelog and release part).

How does this solution relate to other proposed solutions for automatic changelog and release generation (i.e. rpmautospec or the %automacros)?

There are differences in design. Preprocessing has the disadvantage that it introduces a new syntax, but it also has some nice advantages:

  • it is conceptually very simple (everyone understands the idea of preprocessing)
  • once you preprocess and generate an srpm, you get just a generic srpm of the kind everyone is used to (it might include the original spec file template as a source but that shouldn't make a big difference)
  • it allows for very easy integration into Fedora BuildSystem - everything stays the same, there is just one additional step (the preprocessing) added at a single place (Koji builder). It doesn't add any additional communication links (in other words "API calls") between any of the Fedora Infrastructure components, which makes the whole solution much more resilient against errors.
  • apart from solving the automatic changelog and release problem, it has further applications like enabling unpacked repos and we might find more in future.
  • it is also already supported by Copr through the SCM rpkg build method (at least the previous generation of macros is, but updating it shouldn't be very difficult)
  • there is nothing hidden from packager's eyes. The spec file that he/she sees in dist-git will be the spec file that goes into the srpm after the macro expansion. But this expansion is well-defined and there are placeholders (the rpkg macros) for it present in the spec file. I.e. packagers don't need to have any implicit knowledge that something will e.g. attach something to (or strip something from) spec file in certain cases which makes the whole solution more open and transparent.

Benefit to Fedora

This change offers a solution to some long-standing issues in Fedora around packaging (i.e. automatic release and changelog generation) while also offering some interesting future options (for example unpacked dist-git repos). The big advantage of this approach is that it is explicit. The spec file stays the source of truth, and by looking inside one you will be able to determine how the text will expand for a certain git repository state.

Scope

  • Proposal owners:

For the very basic support, a small patch in Koji is probably needed so that it can look up not only .spec files but also .spec.rpkg files (the .spec.rpkg extension explicitly states that the spec file is a template). The rpmdevtools/rpmdev-bumpspec script should also be tweaked to be compatible with spec files using the macros. There might also be some additional work needed on the pull requests currently open (rpkg python library, dnf-plugins-core, rpmdevtools).

If the pull request for the rpkg python library is merged, there will be some additional work to enable reading of the rpkg.conf file from the git repository where a spec file is present. The configuration read from the git repository will be merged with e.g. the fedpkg system-wide configuration to produce the final resulting fedpkg configuration. The git-repository rpkg.conf file is the way for a packager to enable the preprocessing.
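
The merge described above might look roughly like the following Python sketch. It rests on two assumptions that this page does not confirm: that values from the repository rpkg.conf take precedence over the system-wide ones, and that both files use the same INI layout. The function name merged_config and the option example_option are hypothetical.

```python
import configparser


def merged_config(system_text: str, repo_text: str) -> configparser.ConfigParser:
    """Merge system-wide and repository configuration text.

    Hypothetical helper. ASSUMPTION: repository values win on conflict.
    """
    parser = configparser.ConfigParser()
    parser.read_string(system_text)  # system-wide values are loaded first
    parser.read_string(repo_text)    # repository values override them (assumption)
    return parser


if __name__ == "__main__":
    # example_option is a made-up key used only to show that untouched
    # system-wide values survive the merge.
    system = "[rpkg]\npreprocess_spec = False\nexample_option = x\n"
    repo = "[rpkg]\npreprocess_spec = True\n"
    config = merged_config(system, repo)
    print(config.getboolean("rpkg", "preprocess_spec"))
    print(config.get("rpkg", "example_option"))
```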

  • Other developers:

Some optional help with rpmdevtools/rpmdev-bumpspec changes would be welcome.

  • Release engineering: #9910 (a check of an impact with Release Engineering is needed)

Enabling the rpkg_preprocessor plugin in mock config for Koji builders.

  • Policies and guidelines:

The new macro support should be mentioned or even described in the packaging guidelines. We should decide if the full power of the rpkg-macros library should be allowed from start (i.e. even unpacked repos).

  • Trademark approval: N/A (not needed for this Change)
  • Alignment with Objectives: N/A

Upgrade/compatibility impact

Because the change is opt-in on the packager side, there should be no compatibility issues.

How To Test

Once the feature is enabled, one can test it by providing the rpkg.conf file with the required content in a package repository and using some rpkg macro in the spec file, e.g.

Name: {{{ git_dir_name }}}

to generate the name of the package from the repository name (this should actually produce the original text as package names should be the same as the repository basenames).
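
To make the expansion idea concrete, here is a toy Python sketch of what happens to the Name line above. It is not the preproc engine and not the rpkg-macros implementation: it only handles bare macro names without arguments, and its stand-in for git_dir_name simply takes the basename of a directory path, which is an assumption about how the real macro derives the name. The names expand and make_macros are hypothetical.

```python
import re
from pathlib import Path

# Matches a triple-brace tag holding a single bare macro name,
# e.g. "{{{ git_dir_name }}}". Macros with arguments are not handled.
TAG = re.compile(r"\{\{\{\s*(\w+)\s*\}\}\}")


def expand(template: str, macros: dict) -> str:
    """Replace each {{{ name }}} tag with the output of macros[name]()."""
    def substitute(match):
        name = match.group(1)
        if name not in macros:
            raise KeyError(f"unknown macro: {name}")
        return macros[name]()
    return TAG.sub(substitute, template)


def make_macros(repo_path: str) -> dict:
    """Build a toy macro table for the given repository path."""
    # Stand-in for git_dir_name: the basename of the repository directory.
    return {"git_dir_name": lambda: Path(repo_path).name}


if __name__ == "__main__":
    macros = make_macros("/home/user/fedora/hello")
    print(expand("Name: {{{ git_dir_name }}}", macros))  # prints "Name: hello"
```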

To try it out first without committing to dist-git, one can use the rpkg command-line tool from https://copr.fedorainfracloud.org/coprs/clime/rpkg-util/ or even fedpkg's koji scratch build once the work in the pyrpkg library is finished.

One can also currently use Copr's SCM "rpkg" build method, where the macros are enabled, but the rpkg-macros there are at version 2 whereas this change is about introducing the version 3 rpkg-macros. While there are some differences between v2 and v3, the idea and most of the behaviour are the same.

User Experience

This change is intended for packagers. It should make some of their work easier and offer them some interesting new options.

There is one negative effect: Fedora spec files will no longer be directly parseable by rpmspec or rpmbuild, or at least not those that use the preproc macros. Such a spec file is effectively a template and needs to be translated into a valid spec file first, e.g. by using the preproc-rpmspec tool:

$ preproc-rpmspec input.spec.rpkg -o output.spec

The input.spec.rpkg file is a spec file template. output.spec is already a valid rpm spec file.
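
Scripts that currently call rpmspec directly could wrap this translation step. The Python sketch below uses only the invocation quoted above (a template path and the -o option); the helper names build_command and render_spec are hypothetical, and any other requirements of preproc-rpmspec, such as being run inside the package's git repository, are not covered here.

```python
import shutil
import subprocess
from typing import List


def build_command(template: str, output: str) -> List[str]:
    """Build the argument list for the preproc-rpmspec call quoted above."""
    return ["preproc-rpmspec", template, "-o", output]


def render_spec(template: str, output: str) -> None:
    """Render a spec template into a plain spec file.

    Hypothetical helper; raises if the tool is missing or exits non-zero.
    """
    if shutil.which("preproc-rpmspec") is None:
        raise FileNotFoundError("preproc-rpmspec is not installed or not on PATH")
    subprocess.run(build_command(template, output), check=True)


if __name__ == "__main__":
    print(" ".join(build_command("input.spec.rpkg", "output.spec")))
```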

However, tools like spectool or dnf builddep should continue to work as they do today if the PRs that add preprocessing support to them are accepted.

The simplest way for a Fedora packager to work with the spec file template is by using fedpkg (i.e. fedpkg spec, fedpkg srpm, fedpkg mockbuild) if the rpkg patch is accepted.

Dependencies

N/A

Contingency Plan

Packagers can opt out on an individual basis by removing the rpkg.conf file or by setting the preprocess_spec property to False. At the infrastructure level, the rpkg_preprocessor plugin could be disabled again.

Documentation

- Mock's rpkg_preprocessor plugin

- rpkg-macros reference (the library of macros ready to be used in spec files)

Release Notes

Currently N/A