Standard Discovery, Packaging, Invocation of Integration Tests via a distgit control file

This is a proposal
Feedback is more than welcome. There's a discussion tab above.

First see the Terminology, division of Responsibilities, and Requirements.

Detailed Description

This standard interface describes how to discover, stage and invoke tests. It is important to cleanly separate implementation details of the testing system from the test suite and its framework. It is also important to allow packagers to locally and manually invoke a test suite.

Discovery

The description of the tests is stored in the dist-git repository, in the tests/control file. The control file follows the syntax described in the DEP 8 specification.

Here is an example of a control file to describe 2 tests for systemd:

Tests: timedated, hostnamed
Depends: systemd,
  acl,
  tzdata,
Restrictions: needs-root, isolation-container

The example declares 2 tests (timedated and hostnamed), the packages that must be installed to run them (the Depends field), and the features required of the system under test (the Restrictions field: here, root access and container isolation).
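As an illustration of how little is needed to read such a file, here is a minimal sketch of a field extractor. The function name control_field is hypothetical and not part of any existing tool. It only reads the first stanza and does not handle version constraints or alternatives in Depends, which the DEP 8 syntax allows.

```shell
#!/bin/sh
# control_field FILE FIELD
# Print the value of FIELD from the first stanza of a control file as a
# space-separated list. Continuation lines (starting with whitespace) are
# joined to the field they follow. Hypothetical helper, for illustration only.
control_field() {
    awk -v field="$2" '
        BEGIN { want = tolower(field) ":"; infield = 0 }
        /^$/ { exit }                       # a blank line ends the stanza
        /^[ \t]/ {                          # continuation line
            if (infield) printf " %s", $0
            next
        }
        { infield = 0 }                     # any other line starts a new field
        tolower(substr($0, 1, length(want))) == want {
            infield = 1
            printf "%s", substr($0, length(want) + 1)
        }
        END { print "" }
    ' "$1" | tr ',\t' '  ' | tr -s ' ' | sed -e 's/^ //' -e 's/ $//'
}
```

With the systemd example above saved as tests/control, `control_field tests/control Tests` would list the two test names and `control_field tests/control Depends` the three packages.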

The tests themselves are described in the files named from the Tests field (timedated and hostnamed in the example) and they need to be executable. They can be programmed in any language (shell, python, ansible...).

Functional tests that are included in the source tree of the package could be packaged and driven through the test scripts.

Some tests could be reused from the Debian/Ubuntu packages allowing more collaboration with these Linux distributions. They have more than 6000 packages with tests.

Staging

The system under test needs to be selected according to the Restrictions field. This allows tests to run on anything from light setups to full systems: chroot, schroot, container, or a system reachable over ssh (VM or bare metal).
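The selection could look like the sketch below, which maps the DEP 8 restrictions isolation-container and isolation-machine to the lightest kind of testbed that satisfies them. The function name pick_testbed and the labels chroot, container and machine are assumptions made for this example; other restrictions such as needs-root are ignored here.

```shell
#!/bin/sh
# pick_testbed RESTRICTIONS
# Print the lightest testbed kind that satisfies the given Restrictions value
# (comma and/or space separated). Labels are illustrative, not a fixed API.
pick_testbed() {
    testbed=chroot                          # default: no isolation requested
    for r in $(printf '%s' "$1" | tr ',' ' '); do
        case "$r" in
            isolation-machine)
                testbed=machine ;;          # needs a VM or a dedicated host
            isolation-container)
                # a machine also satisfies container isolation, keep it
                [ "$testbed" = machine ] || testbed=container ;;
        esac
    done
    echo "$testbed"
}
```

For the systemd example, `pick_testbed "needs-root, isolation-container"` selects a container, so a plain chroot would not be used for these tests.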

Invocation

We have 2 options: port the autopkgtest tool to support rpm/dnf (a partial port is already available), or write a program that reads the tests/control file, copies the needed files into the system under test, and runs the corresponding test scripts.
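To give an idea of the second option, here is a rough sketch of the copy-and-run step. It is not the proposed tool: the function name run_tests is hypothetical, a scratch directory stands in for the system under test, and the test names are passed as arguments rather than read from the Tests field. Installing the Depends packages is left out.

```shell
#!/bin/sh
# run_tests SRC_DIR TEST...
# Stage SRC_DIR/tests into a scratch directory (a stand-in for copying the
# files to the system under test), run each named test there, and print one
# PASS/FAIL line per test. Returns non-zero if any test failed.
run_tests() {
    src=$1; shift
    stage=$(mktemp -d) || return 2
    cp -R "$src/tests/." "$stage/" || return 2
    failed=0
    for t in "$@"; do
        # A test passes when its script exits with status 0.
        if (cd "$stage" && "./$t") >"$stage/$t.log" 2>&1; then
            printf '%-20s PASS\n' "$t"
        else
            printf '%-20s FAIL\n' "$t"
            failed=1
        fi
    done
    rm -rf "$stage"
    return "$failed"
}
```

From a dist-git checkout this would be called as `run_tests "$PWD" timedated hostnamed`. A real implementation would replace the scratch directory with the chroot, container or ssh target chosen during staging.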

Here is an invocation of autopkgtest to run test timedated on a remote system 192.168.122.215 launched from a Fedora 25 system:

$ autopkgtest -B --test-name timedated $PWD -- ssh --capability isolation-machine --capability root-on-testbed -H 192.168.122.215 -l user
autopkgtest [21:16:18]: host localhost.localdomain; command line: /home/fred/external/autopkgtest/runner/autopkgtest -B --test-name timedated /home/fred/external/systemd -- ssh --capability isolation-machine --capability root-on-testbed -H 192.168.122.215 -l user
autopkgtest [21:16:19]: testbed architecture: x86_64
autopkgtest [21:16:20]: testbed running kernel: Linux 4.11.0-0.rc5.git4.1.fc27.x86_64 #1 SMP Fri Apr 7 14:30:32 UTC 2017
autopkgtest [21:16:20]: @@@@@@@@@@@@@@@@@@@@ built-tree /home/fred/external/systemd
autopkgtest [21:16:20]: test timedated: preparing testbed
autopkgtest [21:16:20]: test timedated: publishing
Python detected LC_CTYPE=C: LC_ALL & LANG coerced to C.UTF-8 (set another locale or PYTHONCOERCECLOCALE=0 to disable this locale coercion behaviour).
Last metadata expiration check: 1:32:45 ago on Thu Apr 13 20:43:35 2017 MSK.
Package systemd-233-2.fc27.x86_64 is already installed, skipping.
Package acl-2.2.52-13.fc26.x86_64 is already installed, skipping.
Package tzdata-2017b-1.fc27.noarch is already installed, skipping.
Dependencies resolved.
Nothing to do.
Complete!
autopkgtest [21:16:25]: test timedated: [-----------------------
original tz: Europe/Moscow
timedatectl works
change timezone
reset timezone to original
autopkgtest [21:16:26]: test timedated: -----------------------]
autopkgtest [21:16:27]: test timedated:  - - - - - - - - - - results - - - - - - - - - -
timedated            PASS
autopkgtest [21:16:27]: @@@@@@@@@@@@@@@@@@@@ summary
timedated            PASS

autopkgtest can also invoke tests locally using the null target, but at the launcher's own risk.

Another interesting feature is that a test can reboot the system under test, several times if needed, and continue once it is back up.
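In upstream autopkgtest this works through a helper, /tmp/autopkgtest-reboot, which the test calls with a mark of its choosing; after the reboot the test is started again with AUTOPKGTEST_REBOOT_MARK set to that mark. The sketch below shows the pattern. The function name run_phase, the REBOOT_HELPER variable and the mark names are assumptions made for this example, and whether the partial rpm/dnf port supports reboots has not been checked here.

```shell
#!/bin/sh
# Sketch of a test that spans two reboots. The helper is only called when it
# exists, so the script can also be tried outside of autopkgtest.
REBOOT_HELPER=${REBOOT_HELPER:-/tmp/autopkgtest-reboot}

# run_phase MARK: run the phase that corresponds to MARK, then request the
# reboot leading to the next phase (if there is one).
run_phase() {
    case "$1" in
        "")    echo "phase 1: before first reboot"; next=mark1 ;;
        mark1) echo "phase 2: after first reboot";  next=mark2 ;;
        mark2) echo "phase 3: after second reboot"; next="" ;;
        *)     echo "unexpected mark: $1" >&2; return 1 ;;
    esac
    if [ -n "$next" ] && [ -x "$REBOOT_HELPER" ]; then
        "$REBOOT_HELPER" "$next"
    fi
}

run_phase "${AUTOPKGTEST_REBOOT_MARK:-}"
```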

Examples

Here is an extract of the timedated test from the systemd example:

#!/bin/sh
set -e

. `dirname $0`/assert.sh

ORIG_TZ=$(timedatectl | grep -F 'Time zone'|sed -e 's@.*: @@' -e 's@ .*@@')
echo "original tz: $ORIG_TZ"

echo 'timedatectl works'
assert_in "Local time:" "`timedatectl --no-pager`"

echo 'change timezone'
assert_eq "`timedatectl --no-pager set-timezone Europe/Moscow 2>&1`" ""
assert_eq "`readlink /etc/localtime | sed 's#^.*zoneinfo/##'`" "Europe/Moscow"
[ -n "$TEST_UPSTREAM" ] || assert_in "Europe/Moscow" "`timedatectl | grep -F 'Time zone: '`"
assert_in "Time.*zone: Europe/Moscow (MSK, +" "`timedatectl --no-pager`"

echo 'reset timezone to original'
assert_eq "`timedatectl  --no-pager set-timezone $ORIG_TZ 2>&1`" ""
assert_eq "`readlink /etc/localtime | sed 's#^.*zoneinfo/##'`" "$ORIG_TZ"
[ -n "$TEST_UPSTREAM" ] || assert_eq "`timedatectl | grep -F 'Time zone'|sed -e 's@.*: @@' -e 's@ .*@@'`" "$ORIG_TZ"
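The assert.sh file sourced at the top of the test is not shown in the extract. A minimal sketch of what such a helper could contain is given below; it is an assumption for illustration, not a copy of the file shipped with the systemd tests. assert_in treats its first argument as a basic regular expression, which is what the "Time.*zone" check in the extract relies on.

```shell
#!/bin/sh
# Hypothetical assert.sh: each helper prints a diagnostic on stderr and exits
# with a non-zero status on failure, which makes the calling test fail.

# assert_eq ACTUAL EXPECTED: succeed if both strings are identical.
assert_eq() {
    if [ "$1" != "$2" ]; then
        echo "FAIL: expected: '$2' actual: '$1'" >&2
        exit 1
    fi
}

# assert_in PATTERN TEXT: succeed if PATTERN (a basic regular expression)
# matches somewhere in TEXT.
assert_in() {
    if ! printf '%s\n' "$2" | grep -q -- "$1"; then
        echo "FAIL: '$1' not found in:" >&2
        printf '%s\n' "$2" >&2
        exit 1
    fi
}
```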

dist-git example and a gating CI job for systemd

You can look at this tests directory that contains 4 tests, one in ansible, one in python3 and 2 in shell.

We have added a CI pipeline to build and test systemd. To demonstrate the detection of an error, we injected a patch that breaks systemctl status, and you can see that the tests catch the issue: https://softwarefactory-project.io/jenkins/job/fedora-integration-test/7/console

Upgrade/compatibility impact

There is no real upgrade or compatibility impact. The tests will be branched per release, as the dist-git repos are branched.

Evaluation

Instructions: Copy the block below, sign your name and fill in each section with your evaluation of that aspect. Add additional bullet points with overall summary or notes.

Full Name -- SignAture

Summary: ...
Staging: ...
Invocation: ...
Discovery: ...

Pierre-Yves Chibon -- pingou

Summary:
- PRO: Seems pretty simple and straightforward
- CON: Unknown tool to pretty much everyone in the Fedora space
- CON: Introduces a reliance on a tool that isn't in Fedora at the moment
- PRO: That tool is known to be working on a similar environment (the Debian test system)
- PRO: Could give us some opportunity of cross-distro sharing
- THOUGHTS: To me it seems really close to the ansible proposal. While ansible is a quite well-known tool, this one is apparently more specific to the Debian ecosystem. So except for the possibility to share tests with Debian (which, granted, would be quite cool), I am not sure I see what would make this proposal superior to the Ansible one.
  - MartinPitt: Structurally it's similar; main differences are the staging (autopkgtest prepares testbeds and installs subjects; here it's the test's responsibility, see comparison in the "packaged RPM tests" proposal), and that autopkgtest control files are fully declarative, while Ansible is a mix of declarative and procedural ("YAML shell script with macros")
Staging:
- PRO: Seems flexible and able to handle quite some formats
- QUESTION: Does it support invoking just a sub-set of the tests? (Say if you want to have different tests for pull-request vs updates)
  - MartinPitt: right now it can only run a single test (--test-name) or all tests. There is a spec for test classes which would also cover this, or we could simply teach autopkgtest to accept --test-name multiple times.
Invocation:
- PRO: easy to run locally
- PRO: easy to run as root and switch to a local user or vice-versa
- CON: inter-package dependencies won't be straightforward
Discovery:
- Git hash uniquely identifies a test suite
  - Meaning the identifier may change while the test suite itself hasn't
- PRO: Would rely on the same dependency chain as the artefacts themselves (I assume)

Stef Walter -- Stefw

Summary:
- Neither PRO nor CON: Fedora can wrap Debian autopkgtests with or without making it the standard.
- PRO: Uses tooling already in active use elsewhere.
- CON: The tooling needs to be developed or finish being developed, and is not in Fedora.
- CON: Only represents in-situ testing. Does not cover outside-in testing.
- CON: Does not handle test subjects other than packages.
- CON: Puts Debian style control files into Fedora. control files are usually used instead of, not in addition to, spec files.
Staging:
- PRO: Staging is completely taken care of by the autopkgtest tool
- CON: Doesn't cover staging for outside-in testing or for non-standard test subjects.
- CON: No description of where test dependencies come from? What repos?
Invocation:
- PRO: All invocation logic is encapsulated in the autopkgtest tool.
- CON: No consideration of how various test subjects would work.
Discovery:
- CON: Requires enumerating dist-git in order to find appropriate tests. Additional services will need to be built to make this information queryable without cloning dist-git repos in order to do discovery.

Michael Scherer -- Misc

Summary:
- PRO: Does not tie us to ansible, while still leaving the flexibility to use ansible, thus bringing the best of both worlds (i.e., people unfamiliar with ansible can use shell, people who like ansible can use it, thus maximizing
- PRO: offers metadata for tests, so we can express constraints on them, or tag them
- PRO: declarative format (which is usually more flexible since you can put the logic outside of the format and improve it, cf. the example of kubernetes vs traditional infra)
- PRO: Format is already used, so tooling/doc exist
- CON: Format is not as easy to parse as yaml
- CON: Using this and ansible makes the stack more complex
- CON: Flexibility comes at a cost: nothing ensures that everybody follows the same process. However, if we want to be inclusive (and, for example, reuse work done by others), we have to take that into account

Nick Coghlan -- Ncoghlan

Summary:
- PRO: Sharing low level testing configuration with Debian would be beneficial to both ecosystems
- CON: most of the arguments against requiring Ansible also apply to requiring autopkgtest - I think it will be easier to use RPM+shell as the assumed baseline, and then offer standard shims to bootstrap Ansible, autopkgtest, etc, for more complex cases
