From Fedora Project Wiki
m (Fix arm name)
Revision as of 11:05, 26 August 2020


Python Upstream Architecture Names

Summary

Use CPython upstream architecture naming in Fedora's Python ecosystem (mostly in filenames) instead of the previously patched Fedora names. For example, have /usr/lib64/python3.9/lib-dynload/array.cpython-39-powerpc64le-linux-gnu.so instead of /usr/lib64/python3.9/lib-dynload/array.cpython-39-ppc64le-linux-gnu.so. This makes packaging of Python itself a tad trickier, but it moves Fedora's Python closer to upstream and solves interoperability problems with ppc64le manylinux wheels. The change has impact only on ppc64le and armv7hl (considering the architectures built by koji.fedoraproject.org). Packages assuming the filenames always contain %{_arch}-linux%{_gnu} will need to be adapted.

Owner

Current status

  • Targeted release: Fedora 34
  • Last updated: 2020-08-26
  • FESCo issue: <will be assigned by the Wrangler>
  • Tracker bug: <will be assigned by the Wrangler>
  • Release notes tracker: <will be assigned by the Wrangler>

Detailed Description

The Saga

A Long Time Ago in the Fedora Galaxy

Many releases ago, when Fedora wasn't being built for power and arm yet, the Python maintainers mapped the Python "platform triplet" to %{_arch}-linux%{_gnu}. This worked. For example, on x86_64, this is x86_64-linux-gnu on Fedora and this is consistent with the "platform triplet" used in filenames in upstream.

The Phantom Technical Debt

Later, around 2015, as more architectures were added, the Python build scripts were patched to use Fedora's own architecture names.

At the time, that was a reasonable decision: the idea of cross-Linux builds was sci-fi, and Fedora was not trying to stay as close to upstream as it is now (we had around 60 patches; today we're down to around 6).

Rise of the Manylinux Wheels

In the meantime, cross-Linux builds became a thing. The manylinux1 standard was created in 2016, making it possible to build Python wheels with compiled extension modules on one Linux platform and ship them to many. The first manylinux version only supported x86_64 and i686, and hence it was not impacted by Fedora's patching decisions.

The manylinux standard arguably made the upstream Python packaging ecosystem a much nicer place. Installing packages with compiled extension modules was no longer such a pain. One could just run pip install numpy and not worry about a disturbing lack of a Fortran compiler. For that reason, manylinux wheels became widely adopted by the most popular projects.

A New Architecture

With the third manylinux version -- manylinux2014 (created in 2019, named after the oldest Linux it supports -- CentOS 7), support for more architectures was introduced: x86_64, i686, aarch64, armv7l, ppc64, ppc64le, s390x. Adoption of the new architectures is somewhat slow: as of August 2020, the official manylinux2014 containers exist only for x86_64, i686, aarch64, ppc64le and s390x.

Revenge of the Patches

We have discovered a problem with the ppc64le manylinux2014 wheels: The CentOS 7 manylinux2014 container images ship upstream Python without RHEL/CentOS/EPEL patches. When an extension module is built there, it is named with the upstream suffix: .cpython-XY(m)-powerpc64le-linux-gnu.so. The wheel is installable on Fedora (with Fedora's patches), but the module won't be imported (or even considered for import), because Fedora's Pythons expect the extension to be .cpython-XY(m)-ppc64le-linux-gnu.so.
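
The following is a minimal sketch of why such a module is never considered for import. It assumes (as CPython's file-based import does) that an extension module is only found when its filename is exactly the module name plus one of the suffixes the interpreter knows; on a running interpreter that list is available as importlib.machinery.EXTENSION_SUFFIXES. The helper name would_be_found and the hardcoded suffix list are illustrative assumptions standing in for a Fedora-patched ppc64le Python 3.9 without any workaround.

```python
import importlib.machinery


def would_be_found(module_name, filename, suffixes=None):
    """Return True if `filename` is `module_name` plus a known extension suffix.

    With suffixes=None, the suffixes of the running interpreter are used.
    """
    if suffixes is None:
        suffixes = importlib.machinery.EXTENSION_SUFFIXES
    return any(filename == module_name + suffix for suffix in suffixes)


# Illustrative suffix list (an assumption, not read from a real Fedora system):
# the Fedora-named tagged suffix plus the generic ones CPython accepts on Linux.
fedora_ppc64le_suffixes = [
    ".cpython-39-ppc64le-linux-gnu.so",
    ".abi3.so",
    ".so",
]

# File name produced in the manylinux2014 container (upstream naming):
print(would_be_found(
    "extension",
    "extension.cpython-39-powerpc64le-linux-gnu.so",
    fedora_ppc64le_suffixes,
))

# File name produced by Fedora's patched Python:
print(would_be_found(
    "extension",
    "extension.cpython-39-ppc64le-linux-gnu.so",
    fedora_ppc64le_suffixes,
))
```

The first call reports that the upstream-named file does not match any expected name, the second that the Fedora-named file does.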

In theory, we have the same problem on armv7hl, but there are no manylinux2014 containers available for that platform, so there are no such wheels out there (known to us).

The same problem also exists the other way around, although it's arguably less severe. It is possible to build manylinux wheels on (some version of) Fedora or EL (using the Python from the distribution). However, extension modules from such ppc64le wheels won't import on other Linux distributions.

The Workaround Awakens

To allow importing extension modules from ppc64le manylinux wheels, we have patched Pythons (3.5+) in Fedora to consider both "Fedora's" and upstream platform triplets when importing extension modules. This workaround works well for users installing manylinux wheels on Fedora, but does not solve the problem when building the wheels on Fedora.
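
The idea of considering both triplets can be sketched in Python as follows. This is an illustration only, not the actual Fedora patch (which lives in the interpreter's build and import code); the names TRIPLET_PAIRS and equivalent_suffixes are hypothetical, and only the ppc64le and armv7hl name pairs mentioned on this page are covered.

```python
# (upstream triplet, Fedora's legacy triplet) pairs taken from this page.
TRIPLET_PAIRS = [
    ("powerpc64le-linux-gnu", "ppc64le-linux-gnu"),
    ("arm-linux-gnueabihf", "arm-linux-gnu"),
]


def equivalent_suffixes(ext_suffix):
    """Return the given suffix followed by its counterpart under the other naming.

    Suffixes for architectures where both namings agree are returned alone.
    """
    result = [ext_suffix]
    for upstream, fedora in TRIPLET_PAIRS:
        # Check both directions, so either naming can be the starting point.
        for own, other in ((upstream, fedora), (fedora, upstream)):
            tail = "-" + own + ".so"
            if ext_suffix.endswith(tail):
                result.append(ext_suffix[: -len(tail)] + "-" + other + ".so")
    return result


print(equivalent_suffixes(".cpython-39-ppc64le-linux-gnu.so"))
print(equivalent_suffixes(".cpython-39-x86_64-linux-gnu.so"))
```

An interpreter that tries every suffix from such a list can import modules built under either naming, but it still writes only its own primary suffix when building, which is why the workaround alone does not help wheels built on Fedora.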

The Change

With this change proposal, we plan to switch to the upstream architecture names and keep the workaround to preserve backwards compatibility. When we do that, the following will happen:

  1. The Python standard library extension module suffixes will change to .cpython-39-powerpc64le-linux-gnu.so and .cpython-39-arm-linux-gnueabihf.so. Python will still import extension modules with the legacy suffixes .cpython-39-ppc64le-linux-gnu.so and .cpython-39-arm-linux-gnu.so. Other architectures not built by koji.fedoraproject.org will also be renamed; see the pull request for a complete regex. This will happen for Python 3.5, 3.6, 3.7, 3.8 and 3.9.
  2. The newly built Python packages with extension modules will also change the suffixes. Packages that assume the platform triplet is always %{_arch}-linux%{_gnu} (e.g. in the %files section) will need to be adapted (see the New Macros section). A mix of legacy and upstream suffixes will co-exist and work together.
  3. When safe, we will drop the workaround to support the legacy names. For example, when we initially package Python 3.10, it will be packaged without the workaround. On the other hand, older Python versions might never be able to drop it, because users will carry their own built extension modules from previous releases.

New Macros

For packagers' convenience, we will add two new Python macros:

%python3_ext_suffix

Defined as:

%python3_ext_suffix %(%{__python3} -Ic "import sysconfig; print(sysconfig.get_config_var('EXT_SUFFIX'))")

Values will be:

  • .cpython-39-x86_64-linux-gnu.so
  • .cpython-39-powerpc64le-linux-gnu.so on Fedora 34+ / .cpython-39-ppc64le-linux-gnu.so on older Fedoras
  • .cpython-39-arm-linux-gnueabihf.so on Fedora 34+ / .cpython-39-arm-linux-gnu.so on older Fedoras
  • etc.
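
The macro body is a one-line Python call, so the same value can be inspected directly from Python. The sketch below shows how the suffix combines with a module name to give the filename a packager would expect; the helper name expected_extension_filename is hypothetical, and the printed value depends on the interpreter and architecture it runs on.

```python
import sysconfig


def expected_extension_filename(module_name):
    """Return the filename this interpreter would give a built extension module."""
    # Same lookup as in the %python3_ext_suffix macro definition.
    suffix = sysconfig.get_config_var("EXT_SUFFIX")
    return module_name + suffix


# For the standard library "array" module from the Summary:
print(expected_extension_filename("array"))
```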

%python3_multiarch

Defined as:

%python3_multiarch %(%{__python3} -Ic "import sysconfig; print(sysconfig.get_config_var('MULTIARCH'))")

Values will be:

  • x86_64-linux-gnu
  • powerpc64le-linux-gnu / ppc64le-linux-gnu
  • arm-linux-gnueabihf / arm-linux-gnu
  • etc.
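
In the example values above, the extension suffix is the implementation tag, followed by the multiarch triplet, followed by .so. A small sketch of that relationship, assuming suffixes shaped like the ones listed on this page (the helper split_ext_suffix is hypothetical, and other suffix shapes are rejected rather than guessed at):

```python
def split_ext_suffix(ext_suffix):
    """Split '.cpython-39-x86_64-linux-gnu.so' into ('cpython-39', 'x86_64-linux-gnu')."""
    if not (ext_suffix.startswith(".") and ext_suffix.endswith(".so")):
        raise ValueError("unexpected extension suffix: %r" % ext_suffix)
    body = ext_suffix[1:-len(".so")]
    parts = body.split("-", 2)
    if len(parts) != 3:
        raise ValueError("no platform triplet in: %r" % ext_suffix)
    implementation, version, multiarch = parts
    return implementation + "-" + version, multiarch


print(split_ext_suffix(".cpython-39-powerpc64le-linux-gnu.so"))
print(split_ext_suffix(".cpython-39-arm-linux-gnueabihf.so"))
```

The second element of each result corresponds to the %python3_multiarch values listed above.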

Both macros will be backported to stable Fedoras and EPEL 7+ and will have the corresponding %python_ variant (but bear in mind that for Python versions older than 3.6, they might return unexpected or empty results).

Feedback

Benefit to Fedora

Users of ppc64le and armv7hl Fedora (and future RHEL) will have a closer-to-upstream Python experience and will no longer suffer from compatibility issues when they install or build manylinux wheels. The upstream-downstream balance will be restored.

Scope

  • Other developers: Mostly nothing, adapt the %files section if needed
  • Release engineering: a check of an impact with Release Engineering is not needed
  • Policies and guidelines: not needed for this Change
  • Trademark approval: not needed for this Change
  • Alignment with Objectives: no

Upgrade/compatibility impact

No significant user-visible upgrade/compatibility problem is anticipated. Filenames will be different, but the old filenames are still supported. Scripts that hardcode filename assumptions might break.

How To Test

On ppc64le, try to install a manylinux wheel and import from it. It should work on any Python ≥ 3.5. E.g.:

pip install simple-manylinux-demo
python -c 'from dummyextension import extension'

On ppc64le, try to build a manylinux wheel and import from it on another Linux. It should work on any Python ≥ 3.5. E.g.:

pip wheel .  # on some project with extension module
auditwheel repair ...whl
wormhole send ...whl # or any other way

On another ppc64le Linux (such as Debian or openSUSE):

wormhole receive ...
pip install ...whl
python -c 'from ... import ...'

You can also build a regular (non-manylinux) wheel on Fedora 33/32 and install and import it on Fedora 34. It should work. The other way around will most likely also work, unless Fedora 34 has an incompatible glibc update.

User Experience

Users of ppc64le and armv7hl Fedora (and future RHEL) will have a closer-to-upstream Python experience and will no longer suffer from compatibility issues when they install or build manylinux wheels.

Dependencies

No known dependencies. May the force be with us.

Contingency Plan

  • Contingency mechanism: Revert the change and rebuild all affected packages.
  • Contingency deadline: Soft before the mass rebuild, so we could leverage it for the revert-rebuilds. Hard before the beta freeze.
  • Blocks release? No
  • Blocks product? No

Documentation

This page is the documentation.

Release Notes