From Fedora Project Wiki

Revision as of 15:06, 11 April 2017 by Tflink (talk | contribs) (Evaluation: adding my thoughts on the proposal)

Ansible: Standard Discovery, Staging, Invocation of Integration Tests

This is a proposal
Feedback is more than welcome. There's a discussion tab above.

Summary

Let's define a clear delineation between a test suite (including its framework) and the CI system that runs the test suite. This delineation is the standard interface.

[Figure: invoking tests through the standard interface]

What follows is a standard way to discover, package and invoke integration tests for a package stored in a Fedora dist-git repo.

Many Fedora packages have unit tests. These tests are typically run during the %check RPM build step, in a build root. Integration testing, on the other hand, should happen against a composed system. Upstream projects have integration tests; both Fedora QA and the Atomic Host team would like to create more integration tests; and Red Hat would like to bring integration tests upstream.

Owner

  • Name: TODO: Fill in owner here. Maybe pingou, tflink, other? More than one owner is possible.
  • Email: TODO: Fill in email of owner here.

Terminology

  • Test Subject: The items that are to be tested.
    • Examples: RPMs, OCI image, ISO, QCow2, Module repository ...
  • Test: A callable/runnable piece of code and corresponding test data and mocks which exercises and evaluates a test subject.
  • Test Suite: The collection of all tests that apply to a test subject.
  • Test Framework: A library or component that the test suite and tests use to accomplish their job.
  • Test Result: A boolean pass/fail output of a test suite.
    • Test results are for consumption by the automated aspects of a testing system.
  • Test Artifact: Any additional output of the test suite such as the stdout/stderr output, log files, screenshots, core dumps, or TAP/Junit/subunit streams.
    • Test artifacts are for consumption by humans, archival or big data analysis.
  • Testing System: A CI or other testing system that would like to discover, stage and invoke tests for a test subject.

Responsibilities

The testing system is responsible to:

  • Build or otherwise acquire the test subject, such as package, container image, tree …
  • Decide which test suite to run, often by using the standard interface to discover appropriate tests for the dist-git repo that a test subject originated in.
  • Schedule, provision or orchestrate a job to run the test suite on appropriate compute, storage, ...
  • Stage the test suite as described by the standard interface.
  • Invoke the test suite as described by the standard interface.
  • Gather the test results and test artifacts as described by the standard interface.
  • Announce and relay the test results and test artifacts for gating, archival ...

The standard interface describes how to:

  • Discover a test suite for a given dist-git repo.
  • Uniquely identify a test suite.
  • Stage a test suite and its dependencies such as test frameworks.
  • Provide the test subject to the test suite.
  • Invoke a test suite in a consistent way.
  • Gather test results and test artifacts from the invoked test suite.

The test suite is responsible to:

  • Declare its dependencies such as a test framework via the standard interface.
  • Execute the test framework as necessary.
  • Provision (usually locally) any containers or virtual machines necessary for testing the test subject.
  • Provide test results and test artifacts back according to the standard interface.

The format of the textual logs and test artifacts that come out of a test suite is not prescribed by this document. Nor is it envisioned to be standardized across all possible test suites.

Requirements

  • The test suite and test framework SHOULD NOT leak its implementation details into the testing system, other than via the standard interface.
  • The test suite and test framework SHOULD NOT rely on behavior of the testing system other than the standard interface.
  • The standard interface MUST enable a dist-git packager to run a test suite locally.
    • Test suites or test frameworks MAY call out to the network for certain tasks.
  • It MUST be possible to stage an upstream test suite using the standard interface.
  • Both in-situ tests, and more rigorous outside-in tests MUST be possible with the standard interface.
    • For in-situ tests the test suite is in the same file system tree and process space as the test subject.
    • For outside-in tests the test suite is outside of the file system tree and process space of the test subject.
  • The test suite and test framework SHOULD be able to provision containers and virtual machines necessary for its testing without requesting them from the testing system.
  • The standard interface SHOULD describe how to uniquely identify a test suite.

Detailed Description

This standard interface describes how to discover, stage and invoke tests. It is important to cleanly separate implementation details of the testing system from the test suite and its framework. It is also important to allow Fedora packagers to locally and manually invoke a test suite.

Staging

Test files will be added to the tests folder in the dist-git repo of the package that they are testing. The structure of the files and folders is left to the packagers, but there should be a run_tests.yml playbook at the top level of the tests folder to set up and run all the tests.
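
The layout rule above is simple enough to check mechanically. The following is a minimal Python sketch, not part of the proposal, of how a testing system might look for the top-level playbook in a dist-git checkout. Only the tests/run_tests.yml location comes from the text; the function name find_test_playbook and the idea of passing a checkout path are hypothetical.

```python
from pathlib import Path
from typing import Optional


def find_test_playbook(checkout_dir: str) -> Optional[Path]:
    """Return tests/run_tests.yml inside a dist-git checkout, or None.

    Hypothetical helper: only the tests/run_tests.yml location comes from
    the proposal; the rest is an illustrative assumption.
    """
    playbook = Path(checkout_dir) / "tests" / "run_tests.yml"
    # A checkout without this file is treated as having no test suite.
    return playbook if playbook.is_file() else None
```

A testing system could call this on a fresh checkout and skip the repo when the result is None.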

Invocation

The tests can be invoked by running sudo ansible-playbook run_tests.yml against the run_tests.yml playbook of interest.
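
A testing system could wrap this command roughly as follows. This is a minimal Python sketch under assumptions the proposal does not spell out: it treats a zero exit status from ansible-playbook as a pass and anything else as a failure, and the helper names playbook_command and run_test_suite are hypothetical.

```python
import subprocess
from typing import Callable, List


def playbook_command(playbook: str = "run_tests.yml", use_sudo: bool = True) -> List[str]:
    """Build the invocation described above, optionally without sudo."""
    command = ["ansible-playbook", playbook]
    return ["sudo"] + command if use_sudo else command


def run_test_suite(tests_dir: str, use_sudo: bool = True,
                   runner: Callable = subprocess.run) -> bool:
    """Run the playbook from tests_dir and return True on success.

    Assumption: exit status 0 means the test suite passed. The proposal
    does not define this mapping explicitly.
    """
    completed = runner(playbook_command(use_sudo=use_sudo), cwd=tests_dir)
    return completed.returncode == 0
```

The runner parameter exists only so the sketch can be exercised without Ansible installed; a real testing system would use the default.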

Discovery

A testing system needs to be able to efficiently answer the question "does this subject have any test packages, and if so, what are their names?" This should be automatically discoverable to the extent possible.


Use repoquery: basically, I propose we rely on the dependency chain of the RPMs themselves instead of trying to replicate it differently.

With repoquery --whatrequires, or an equivalent relying on mdapi (https://apps.fedoraproject.org/mdapi/), we should be able to build a list of dependencies. I need to adjust mdapi to support back-walking first, i.e. finding which packages require "foo" instead of which packages "foo" requires, which is what we currently have.
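
One possible reading of this approach, sketched in Python: ask repoquery which packages require the test subject and treat those as candidates whose tests may need to run. Only the repoquery --whatrequires call comes from the text above; the helper names are hypothetical, and the sketch assumes repoquery prints one package per line.

```python
import subprocess
from typing import Callable, List


def whatrequires_command(package: str) -> List[str]:
    """Build the reverse-dependency query named in the proposal."""
    return ["repoquery", "--whatrequires", package]


def parse_packages(output: str) -> List[str]:
    """Return the unique, non-empty output lines in sorted order."""
    return sorted({line.strip() for line in output.splitlines() if line.strip()})


def find_dependents(package: str, runner: Callable = subprocess.run) -> List[str]:
    """List the packages that repoquery reports as requiring `package`.

    Assumption: one package identifier per output line.
    """
    completed = runner(
        whatrequires_command(package),
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return parse_packages(completed.stdout)
```

How the resulting list maps to dist-git repos that actually contain tests is left open here, as it is in the text.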

Test Output Collection

ARA will enable us to collect full, consistent output from the ansible invocation and report it, regardless of the format of the test output:

https://github.com/openstack/ara


In addition, a test suite can be uniquely identified using the git hash of the commit of the git repo.
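
For example, a testing system could record the commit hash of the dist-git checkout it ran. The Python sketch below assumes git is installed and that repo_dir is a dist-git checkout; the helper names head_commit_command and suite_identifier are hypothetical.

```python
import subprocess
from typing import Callable, List


def head_commit_command() -> List[str]:
    """Build the git command that prints the hash of the checked-out commit."""
    return ["git", "rev-parse", "HEAD"]


def suite_identifier(repo_dir: str, runner: Callable = subprocess.run) -> str:
    """Return the commit hash of repo_dir as the test suite identifier.

    Assumption: repo_dir is a git checkout of the package's dist-git repo.
    """
    completed = runner(
        head_commit_command(),
        cwd=repo_dir,
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return completed.stdout.strip()
```

The hash identifies the whole repo state, so it changes on any commit, including commits that do not touch the tests folder.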

Scope

Since the tests are added in a sub-folder of the dist-git repo, no changes are required to the Fedora infrastructure, and there will be no impact on the packagers' workflow and tooling.

Only the testing system will need to be taught to install the requirements and run the playbooks.

Benefit to Fedora

Developers benefit by having a consistent target for how to describe tests, while also being able to execute them locally while debugging issues or iterating on tests.

By staging and invoking tests consistently in Fedora, we create an ecosystem for the tests that allows varied test frameworks as well as CI system infrastructure to interoperate. The integration tests outlast the implementation details of both the frameworks they're written in and the CI systems running them.

TODO: note any additional benefits to Fedora.

User Experience

A standard way to package, store and run tests benefits Fedora stability, and makes Fedora better for users.

  • This structure makes it easy to run tests locally, and thus potentially to reproduce an error triggered on the test system.
  • Ansible is becoming more and more popular, which makes it easier for people to contribute new tests.
  • Since Ansible is used by a lot of sysadmins, it could help them bring test cases to the packagers and developers for situations where something failed for them.

Upgrade/compatibility impact

There is no real upgrade or compatibility impact. The tests will be branched per release, just as spec files are branched in dist-git now.


Full Structure

 .
 └── tests
     ├── test-case
     ├── config
     └── playbooks
         ├── group_vars
         ├── roles
         │   ├── configure
         │   │   ├── defaults
         │   │   ├── files
         │   │   ├── handlers
         │   │   ├── meta
         │   │   ├── tasks
         │   │   ├── templates
         │   │   └── vars
         │   └── run_tests
         │       ├── defaults
         │       ├── files
         │       ├── handlers
         │       ├── meta
         │       ├── tasks
         │       ├── templates
         │       └── vars
         ├── configure.yml
         └── run_tests.yml

Tests will live under the tests directory in a dist-git repo. The playbooks directory will define the roles for configuring and executing the tests. The run_tests.yml playbook will call the necessary roles; dependencies on other roles can be defined there or in the meta directory of another role (this is well documented in the Ansible documentation on writing playbooks). I put config there as a placeholder for configuration files needed by the tests or for provisioning (thinking of linch-pin: https://github.com/CentOS-PaaS-SIG/linch-pin).

Note: Not all of these role sub-directories are required; this just shows a full example case.

Examples

What follows are examples of writing and/or packaging existing tests to this standard.

TODO: Put general example notes here.

Example: Simple in-situ test

A simple downstream integration test for gzip can be found at: https://pagure.io/ansible_based_tests/blob/master/f/tests/gzip

This is what the folder structure looks like:

 .
 ├── files
 │   └── test-simple
 └── run_tests.yml

And the content of run_tests.yml is:

---
# Runs on the local machine; invoke as root or via sudo.
- hosts: localhost
  remote_user: root
  tasks:
  # Install the test subject and whatever the test script needs.
  - name: Install the requirements
    package:
      name: "{{ item }}"
      state: latest
    with_items:
      - coreutils
      - /sbin/install-info
      - gzip

  - name: Create the folder where we will store the tests
    file:
      state: directory
      path: "{{ item }}"
      owner: root
      group: root
    with_items:
      - /usr/libexec/tests/gzip/

  # src is looked up in the files directory next to this playbook.
  - name: Install the test files
    copy:
      src: "{{ item.file }}"
      dest: "/usr/libexec/tests/gzip/{{ item.dest }}"
      mode: "0755"
    with_items:
      - {file: test-simple, dest: test-simple}

  # A non-zero exit status from the script fails this task.
  - name: Execute the tests
    shell: /usr/libexec/tests/gzip/test-simple

Example: GNOME style "Installed Tests"

A downstream integration test run as GNOME installed tests can be found at: https://pagure.io/ansible_based_tests/blob/master/f/tests/gzip

Full example structure: https://pagure.io/ansible_based_tests/blob/master/f/tests/gzip/playbooks

Example: Tests run in Docker Container

An integration test that runs tests in a Docker container can be found at: https://pagure.io/ansible_based_tests/blob/master/f/tests/glib2

Full example structure: https://pagure.io/ansible_based_tests/blob/master/f/tests/glib2/playbooks

Example: Modularity testing Framework

TODO: Port an example

Example: Ansible with Atomic Host

TODO: Port an existing test

Example: Beakerlib based test

TODO: Port and shim a beakerlib test

Evaluation

Instructions: Copy the block below, sign your name and fill in each section with your evaluation of that aspect. Add additional bullet points with overall summary or notes.

Full Name -- SignAture

  • Summary: ...
  • Staging: ...
  • Invocation: ...
  • Discovery: ...

Stef Walter -- Stefw

  • Summary:
    • PRO: Ansible is readable and approachable
    • PRO: Tests are stored in the same repo as the package they test
    • PRO: Inclusion of upstream tests seems to require packaging them as RPMs.
    • CON: Ansible is another technology (in addition to RPM spec files, etc.) that a packager is required to learn in order to maintain a package in dist-git.
    • CON: If tests become a core Fedora concept (which we hope), Ansible becomes a core technology that Fedora requires and is built upon.
    • CON: Most Ansible modules require Python 2.x while the distro is trying to move to Python 3.x
    • CON: No standard mechanism for passing a test subject to a test suite implementing the standard test interface
    • CON: No standard mechanism for reporting test logs or test artifacts from the standard interface
    • CON: No way to describe whether tests are compatible with or conflict with specific NVR of test subjects.
  • Staging:
    • No mechanism for passing a test subject (eg: a built package, a module, or a container) to the test suite to operate on.
    • No guidance on what Ansible modules should be used to install test dependencies
    • No mechanism for a test system to control which repo of known-good packages to pull test or test suite dependencies from.
    • Requires sudo, dnf, git, ansible, python2-dnf, libselinux-python as well-known staging dependencies
  • Invocation:
    • Seems that zero exit code from sudo means success, non-zero exit code means failure? Not described explicitly in standard.
    • The use of sudo seems to imply invocation should happen as a non-root user. Is this correct?
    • Does the standard assume sudo is guaranteed to work? Should the sudo part just be dropped and require invocation as root?
    • No mechanism for reporting logs, or test artifacts has been described.
  • Discovery:
    • Mechanism is simple, but no concrete description of how exactly this works. How does a testing system find tests given a test subject such as an RPM or NVR?
    • MDAPI link is broken: https://apps.fedoraproject.org/mdapi/

Martin Pitt -- mpitt

  • Summary:
    • I agree with what Stef said above, so I am just adding my "delta" review.
    • PRO: I prefer keeping tests in the sources (like in this proposal) over packaging tests, as it's much less overhead for the packager and avoids having to create a new kind of package archive.
    • CON: My main concern is that the Ansible format/tool might be replaced with something else in a few years, but the test format should be stable for a long time to avoid having to port hundreds/thousands of tests.
    • CON: The ansible format is relatively verbose and too procedural for my taste; I prefer a purely declarative syntax and avoiding boilerplate for installing test deps and invoking the tests.
  • Staging:
    • Not supporting test subjects is a major gap in the prototype - this is one of the core requirements here!
    • Installing the actual tests is unnecessary overhead in the playbook, and clutters the host system with files in /usr that don't belong to a package; this can be rectified, though, by dropping the "Create folder"/"Install" tasks and replacing the run part with:
      - name: Execute the tests
        script: files/test-simple
  • Invocation:
    • Getting live logs from the test and also saving them as an artifact is crucial; this is a major gap in the prototype. Can ansible do this somehow?
  • Discovery:
    • Checking out and inspecting hundreds/thousands of dist-gits to see whether they contain tests does not meet "able to efficiently answer the question..."; this needs a new service which regularly indexes all dist-gits and creates a list of source packages that have tests.


Pierre-Yves Chibon -- pingou

  • Disclaimer: I am one of the owners above.
  • Summary:
    • PRO: Ansible is a well-known technology among sysadmins, making it easier for them to contribute tests
    • CON: While well known to some people, it will be new to others
    • PRO: Very flexible: it gives the packagers all the flexibility to install/configure/run their tests as they wish
    • PRO: We could use --tags to allow running just a part of the test suite at certain times (-t PR to run on pull requests, -t updates to run on bodhi updates...)
    • CON: We may need to "regulate" the flexibility by suggesting a set of standard/gold practices to be used in the test system (using different tags or playbooks if we want)
  • Staging:
    • PRO: its flexibility makes it easy to test anything
    • CON: we will need to write policies/guidelines on how to test the different subjects (RPM, container, images...)
  • Invocation:
    • PRO: easy to run locally
    • PRO: easy to run as root and switch to a local user or vice-versa
    • PRO: easy to couple with something like vagrant to allow running destructive tests locally
    • CON: May require policy to set expectations and document how to move from one to the other
    • CON: Inter-package dependencies are a challenge, but one that can be overcome with a custom ansible module that allows tests to git clone other dist-git repos while allowing us to block other network access (to avoid downloading random things from the internet that may be gone tomorrow, which would kill the reproducibility aspect).
  • Discovery:
    • Git hash uniquely identifies a test suite
      • Meaning the identifier may change while the test suite itself hasn't
    • PRO: Relies on the same dependency chain as the artefacts themselves
    • QUESTION: What is the aim here? Do we really want to run all the tests of every perl module for every change made to the perl package? If so, good luck, otherwise repoquery --whatrequires <pkg> should do what we want.


Tim Flink -- Tflink

  • Disclaimer: I am one of the owners of this proposal
  • Summary:
    • PRO: Storing tests in this way decouples them from the build process
    • PRO: Ansible has better docs and more examples than Fedora packages or RPM do
    • PRO: non-packager testers don't have to learn RPM syntax
    • PRO: Able to provide a lot more in the way of convenience functions to the test author - galaxy, roles/modules that we provide
    • PRO: easy to change tests during devel, does not require a dedicated path in the filesystem
    • PRO/CON: More easily extendable
    • CON: Adds ansible et al. as a dependency for the test process - what happens if ansible changes or if it becomes unattractive 5-10 years from now?
    • CON: Adds an additional thing that packagers have to learn
    • CON: We would have no control over when/how ansible changes
    • It's not incredibly clear what all would be distributed (ansible modules, plugins) or how those would be distributed (galaxy-ish, package, etc.)
  • Staging:
    • There is no obvious way to say what NVR is under test other than looking at what's installed or what's locally available pre-build
  • Invocation:
    • Not sure sudo is required; it would likely be easier to have a plugin (if required) that ran things in a temp dir, much as we do with libtaskotron today
  • Discovery:
    • While arguably more complex than the -tests package proposal, the additional complexity in terms of code to be written doesn't seem to be large
    • There are systems already doing some parts of this discovery and could likely be re-used to a certain extent (Taskotron's trigger)