From Fedora Project Wiki

Revision as of 10:02, 1 July 2019 by Fbo

Zuul Based CI

What is Zuul

Zuul [1] is the CI and gating system from the Open Infrastructure Project [2]. It scales well and supports, out of the box, features such as artifact sharing between jobs and testing across multiple Git repositories. You can see Zuul in action here [3].

Below is a list of features offered by Zuul and its companion, Nodepool:

  • Event-driven pipelines based on Code-Review or Pull-Request workflow: jobs can be triggered automatically when a PR is submitted, changed, approved, merged, or when the repository is tagged.
  • CI-as-code: jobs are defined as YAML + Ansible playbooks, pipeline definitions as YAML files. Zuul reads and loads those definitions directly from Git repositories.
  • Support for job inheritance, job dependencies, and job chaining (with artifact sharing).
  • Speculative testing of new jobs before merging: jobs will be run as they are submitted to make sure they behave as expected.
  • Cross-repository dependencies: a job's workspace can include unmerged patches from other projects if specified.
  • Parallel job runs, capped only by available resources or predefined quotas.
  • Automated lifecycle management of job resources: resources such as VMs or containers needed by a given job can be defined in-repository, spawned on demand when the job starts, and destroyed when the job finishes, or held for debugging.
  • Job resources can be provided by OpenStack, OpenShift, Kubernetes, static nodes, or AWS.
  • Well-defined, reproducible job environments to eliminate flakiness
  • Speculative testing before merging (gating): if several patches are about to land at the same time, they are tested on the repository's future state.
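
To make the gating idea more concrete, here is a minimal Python sketch of a gate queue. It is not Zuul code: the function name `gate` and the `passes` callback (which stands in for running the jobs) are hypothetical, and the simulation is sequential, whereas Zuul tests the queued changes in parallel and restarts the ones behind a failing change.

```python
from typing import Callable, List, Sequence, Tuple


def gate(queue: Sequence[str],
         passes: Callable[[Tuple[str, ...]], bool]) -> Tuple[List[str], List[str]]:
    """Simulate a gate queue (illustrative only, not Zuul's implementation).

    Each change is tested on top of every change ahead of it that was
    accepted. `passes` stands in for running the jobs against that
    speculative future state of the repository.
    """
    merged: List[str] = []
    rejected: List[str] = []
    for change in queue:
        # The future state: everything already accepted, plus this change.
        speculative_state = tuple(merged) + (change,)
        if passes(speculative_state):
            merged.append(change)
        else:
            # A failing change is dropped; later changes are tested without it.
            rejected.append(change)
    return merged, rejected


if __name__ == "__main__":
    # Hypothetical rule: change "B" breaks the tests whenever it is present.
    print(gate(["A", "B", "C"], lambda state: "B" not in state))
```

In this sketch, change "C" is still merged because it is tested against a state that no longer contains the rejected change "B".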

Until recently, Zuul could only listen to Gerrit or GitHub events. A new driver [4] now allows Zuul to interface with Pagure as well. Pagure, Zuul and Nodepool can therefore be combined into a very efficient CI/CD stack.

Pagure PR tests via Zuul

We have created a Zuul driver for Pagure that allows Pagure to benefit from all of Zuul's features. It relies only on the Pagure web hook and API system.

We are able to run jobs and report results on PRs opened on Pagure instances like https://src.fedoraproject.org or https://pagure.io.

To do so, we have deployed a Zuul/Nodepool instance (from the Software Factory project, https://www.softwarefactory-project.io/) here: https://fedora.softwarefactory-project.io/zuul/

Some POC use cases for the PR workflow

Some PR workflows for src.fedoraproject.org

  • When a PR is proposed or changed, or at the packager's request (by typing a specific PR comment in Pagure):
  1. Parent job to scratch build the package on Koji
  2. Child job to run the in-package functional tests
  3. Child job to run RPM lint

Here is an example of such a workflow on a PR to rpms/python-gear [8].

  • When the PR is merged, or at the packager's request (by typing a specific PR comment in Pagure):
  1. Job to build the package on Koji

See this Zuul UI page [9] to view jobs attached to a project.

Zuul has a branch matching system that makes a job behave differently depending on the branch the PR targets. For example, a PR on master could build against the Koji rawhide target and validate on a rawhide node, while a PR on the f30 branch could build against the f30 target and validate on an f30 node.
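
The branch matching idea can be sketched in Python as follows. This only illustrates the selection logic; it is not how Zuul is configured (Zuul expresses this in YAML job definitions). The names `Variant`, `VARIANTS` and `select_variant`, as well as the target and node label strings, are assumptions made for the example.

```python
import re
from typing import List, NamedTuple, Optional, Tuple


class Variant(NamedTuple):
    koji_target: str   # Koji target the package is built against
    node_label: str    # node label the validation runs on


# Hypothetical branch patterns and names, for illustration only.
VARIANTS: List[Tuple[str, Variant]] = [
    (r"master", Variant(koji_target="rawhide", node_label="rawhide")),
    (r"f30", Variant(koji_target="f30", node_label="f30")),
]


def select_variant(branch: str) -> Optional[Variant]:
    """Return the first variant whose pattern matches the whole branch name."""
    for pattern, variant in VARIANTS:
        if re.fullmatch(pattern, branch):
            return variant
    return None  # no variant matches: the job does not run for this branch


if __name__ == "__main__":
    print(select_variant("master"))
    print(select_variant("f30"))
```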

Advanced scenarios that involve multiple packages can be validated at the PR level. For instance, a PR on rpms/mod_wsgi could depend on a PR on rpms/httpd (assuming both projects have a Zuul-based RPM build job attached). The jobs for the PR on rpms/mod_wsgi could then use the RPM artifacts built for the rpms/httpd PR it depends on, both for the build (BuildRequires) and for validation (Requires). The dependency chain is not limited to a single dependency.
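
As a rough illustration of how such a dependency chain determines the order in which artifacts must be built, here is a small Python sketch. The function `build_order`, the shape of the `depends_on` mapping, and the "rpms/apr PR" entry are hypothetical; Zuul resolves these dependencies itself.

```python
from typing import Dict, List, Set


def build_order(change: str, depends_on: Dict[str, List[str]]) -> List[str]:
    """Return the changes needed by `change`, dependencies first."""
    order: List[str] = []
    visiting: Set[str] = set()

    def visit(node: str) -> None:
        if node in order:
            return  # already scheduled
        if node in visiting:
            raise ValueError(f"dependency cycle involving {node}")
        visiting.add(node)
        for dependency in depends_on.get(node, []):
            visit(dependency)
        visiting.discard(node)
        order.append(node)

    visit(change)
    return order


if __name__ == "__main__":
    # Hypothetical chain: mod_wsgi depends on httpd, which depends on apr.
    deps = {
        "rpms/mod_wsgi PR": ["rpms/httpd PR"],
        "rpms/httpd PR": ["rpms/apr PR"],
    }
    print(build_order("rpms/mod_wsgi PR", deps))
```

The RPMs of each PR in the returned list would be built before those of the PRs that follow it, so that later builds can consume the earlier artifacts.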

How Git repositories are attached to Zuul

Zuul runs a web server with a dedicated endpoint that receives the web hook event notifications sent by Pagure. These events are what trigger actions on the Zuul side, such as a job execution. To report CI status or comments back to Pagure, or even to merge a PR (gating), Zuul relies on the Pagure REST API.

Zuul needs a project API token to act on the Pagure REST API and a project web hook token to validate event payloads sent from Pagure to the Zuul endpoint. Both tokens are defined per project on Pagure. To scale, Zuul therefore needs a user API token with the "Modify an existing project" right, which lets it read the web hook token and create project API tokens. The owner of this user API token must be added as an admin of the project.
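
To illustrate what validating an event payload with the web hook token can look like, here is a minimal Python sketch. It assumes the signature is a hex-encoded HMAC-SHA1 of the raw request body keyed with the project's web hook token; the exact header name and algorithm should be checked against the Pagure documentation. The function name `is_valid_payload` is hypothetical and this is not the driver's actual code.

```python
import hashlib
import hmac


def is_valid_payload(body: bytes, signature: str, webhook_token: str) -> bool:
    """Check a payload signature (assumed: hex HMAC-SHA1 of the raw body)."""
    expected = hmac.new(webhook_token.encode("utf-8"), body, hashlib.sha1).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected, signature)


if __name__ == "__main__":
    token = "example-token"          # placeholder, not a real token
    body = b'{"topic": "pull-request.new"}'
    signature = hmac.new(token.encode("utf-8"), body, hashlib.sha1).hexdigest()
    print(is_valid_payload(body, signature, token))
    print(is_valid_payload(body + b" ", signature, token))
```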

For instance, on https://stg.pagure.io there is already a bot account for Zuul called zuulbot [10]. To attach a project from this staging instance of Pagure to the Zuul instance at https://fedora.softwarefactory-project.io, here is the process:

Finally, the Zuul instance at https://fedora.softwarefactory-project.io must be told to handle the project. This is done by opening a PR against https://pagure.io/fedora-project-config/blob/master/f/resources/fedora.yaml and having it merged. Feel free to try!

How to attach job(s) to a Git repository

Let's have a look at the jobs attached to rpms/python-gear [11].

The project's pipeline definition is located in [12]. It uses a template called "basic-check" [13]. The template defines which jobs run in the check, gate, and post pipelines.

The check, gate, and post pipelines are defined here [14]. Basically, a pipeline defines which Pagure events trigger jobs and what action Zuul will take when a job succeeds or fails.
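
Conceptually, a pipeline can be pictured as a small record of triggers and reporting actions. The Python sketch below is only a mental model: real pipelines are written in YAML, and the event and action names used here are invented for the example.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class Pipeline:
    name: str
    triggers: FrozenSet[str]   # events that start the pipeline's jobs
    on_success: str            # action taken when the jobs pass
    on_failure: str            # action taken when a job fails

    def is_triggered_by(self, event: str) -> bool:
        return event in self.triggers

    def action_for(self, jobs_passed: bool) -> str:
        return self.on_success if jobs_passed else self.on_failure


# Hypothetical event and action names, for illustration only.
check = Pipeline("check", frozenset({"pr-opened", "pr-changed"}),
                 on_success="flag PR as success", on_failure="flag PR as failure")
gate = Pipeline("gate", frozenset({"pr-approved"}),
                on_success="merge PR", on_failure="flag PR as failure")

if __name__ == "__main__":
    print(check.is_triggered_by("pr-opened"))
    print(gate.action_for(jobs_passed=True))
```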

  • rawhide-rpm-koji-scratch-build in fedora-zuul-jobs [15]
    • based on a parent job [16]
    • note that the job definition is pure YAML, following the format expected by Zuul [17].
    • the base job uses the post-run and run playbooks from [18]
    • the base job defines a Zuul secret to authenticate on Koji
    • playbooks use roles from the pagure.io/zuul-distro-jobs project [19] (see roles: [{zuul: zuul-distro-jobs}] in the base job definition).
  • rawhide-rpm-tests in fedora-zuul-jobs [20]
  • artifact-rpm-lint in zuul-distro-jobs [21]

Current architecture

Configuration

Zuul and Nodepool are hosted on https://fedora.softwarefactory-project.io. Here is the list of Pagure repositories that contain the configuration:

  • pagure.io/fedora-project-config [22] Contains the Software Factory configuration. This is where Nodepool providers and Nodepool images are defined, and where Git projects are attached to Zuul. Each PR proposed on that repository is validated by Zuul and deployed once merged. For instance, see the "config-check" job on https://pagure.io/fedora-project-config/pull-request/18, and the deployment job "config-update" here [23].
  • pagure.io/fedora-zuul-jobs-config [24] Contains the trusted Zuul jobs configuration. This is a Zuul trusted repository: configuration changes included in PRs on that repository are not taken into account speculatively by Zuul. A trusted repository is best suited to host pipelines, project pipeline definitions, and secrets.
  • pagure.io/fedora-zuul-jobs [25] Contains Zuul jobs configuration, but as an untrusted repository. Any change to the Zuul configuration (via a PR) on that repository is taken into account speculatively by Zuul.

zuul-distro-jobs

pagure.io/zuul-distro-jobs [26] is a generic suite of Zuul jobs and roles dedicated to building and publishing RPMs (and hopefully containers in the future). We are working on it with the aim of providing a ready-to-use library of jobs and roles for Zuul users who need to handle RPM builds in CI. Currently there is support for Koji, Copr, DLRN, and Mock. Feel free to add more!

Nodepool nodes

For the moment, three node labels are defined in Nodepool [27]:

  • Fedora 29 cloud image (VM) [28]
  • Fedora 30 container (runc) [29]
  • CentOS 7 container (runc): this is the default container provided by Software Factory. It is not customizable.

Additional image definitions and containers can be added in the fedora-project-config repository.

Buildsys build validation via AMQP and Zuul

We have some services that use Zuul to run jobs based on events received on fedmsg.

The goal was to run an rpm-lint job when a package is built on Fedora's Koji.

Unfortunately, Zuul is not designed to handle AMQP messages out of the box, mainly because it is designed to react to events generated by a Git or code review system, where each event belongs to a Git repository.

Nevertheless, it is possible to simulate a Git repository when an event arrives on fedmsg and to send the expected Git events to Zuul.

Architecture

The Zuul Gateway

The Zuul gateway [30] is a service that generates virtual Git references to trigger Zuul events from non-Git events. It simulates a Pagure instance and interacts with Zuul via Zuul's Pagure driver. The Zuul gateway is controlled via a REST API.

The Fedora messaging Consumer

The Fedora messaging Consumer [31] (see the Consumer class) is a simple fedora-messaging [32] callback that filters events of interest (buildsys.build.state.change) and writes each event as a JSON file on the file system (in the new/ directory). These event files are then consumed by another process (the processor).
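
The following Python sketch shows the general shape of such a callback. It is not the actual Consumer code from [31]: the constructor argument `spool_dir`, the file naming, and the write-then-rename step are assumptions made for the example. It relies only on the `topic`, `body` and `id` attributes that fedora-messaging message objects expose, and it is exercised here with a stand-in object rather than a real message.

```python
import json
from pathlib import Path
from types import SimpleNamespace

TOPIC_SUFFIX = "buildsys.build.state.change"


class Consumer:
    """Callable that spools events of interest as JSON files in new/."""

    def __init__(self, spool_dir: str) -> None:
        # Simplification: the spool location is passed in directly.
        self.new_dir = Path(spool_dir) / "new"
        self.new_dir.mkdir(parents=True, exist_ok=True)

    def __call__(self, message) -> None:
        if not message.topic.endswith(TOPIC_SUFFIX):
            return  # not an event of interest
        path = self.new_dir / f"{message.id}.json"
        tmp = path.with_suffix(".tmp")
        tmp.write_text(json.dumps({"topic": message.topic, "body": message.body}))
        # Rename last so a reader never sees a half-written event file.
        tmp.rename(path)


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as spool:
        consumer = Consumer(spool)
        # Stand-in for a fedora-messaging message; values are made up.
        consumer(SimpleNamespace(
            id="example-id",
            topic="org.fedoraproject.prod." + TOPIC_SUFFIX,
            body={"name": "python-gear"},
        ))
        print(sorted(p.name for p in (Path(spool) / "new").iterdir()))
```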

The Fedora messaging Processor

The Fedora messaging Processor [33] (see the main function) performs the following steps in a loop:

  • look for an event file on the file system (generated by the consumer in the new/ directory)
  • call Koji to fetch the build task data and the list of built RPMs
  • call the Zuul gateway to convert each event into a fake in-memory Git repository (and create a fake .zuul.yaml)
  • call the Zuul gateway to trigger a fake Pagure-style event for Zuul
    • Zuul reads the fake Git repository and processes the job
  • look up the status of the Zuul jobs
  • move the event from new/ to done/ when the job has finished
  • report the status on fedmsg (not done yet: we don't have an account to publish on the bus at the moment).
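
One iteration of the loop above can be sketched in Python as follows. This is not the actual Processor code from [33]: the function `process_next` and its three callback parameters are hypothetical stand-ins for the Koji and Zuul gateway calls, which are injected so the sketch stays self-contained.

```python
import json
import shutil
from pathlib import Path
from typing import Callable, Optional


def process_next(spool: Path,
                 fetch_rpms: Callable[[dict], list],
                 trigger_zuul: Callable[[dict, list], str],
                 wait_for_result: Callable[[str], str]) -> Optional[str]:
    """Process the oldest event file in new/; return the job status, if any."""
    new_dir, done_dir = spool / "new", spool / "done"
    done_dir.mkdir(parents=True, exist_ok=True)
    event_files = sorted(new_dir.glob("*.json"))
    if not event_files:
        return None  # nothing to do in this iteration
    event_file = event_files[0]
    event = json.loads(event_file.read_text())
    rpms = fetch_rpms(event)                # stand-in for the Koji calls
    build_ref = trigger_zuul(event, rpms)   # stand-in for the Zuul gateway calls
    status = wait_for_result(build_ref)     # stand-in for polling the Zuul jobs
    # Only move the event once the job has finished.
    shutil.move(str(event_file), str(done_dir / event_file.name))
    return status


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        spool = Path(tmp)
        (spool / "new").mkdir()
        (spool / "new" / "event.json").write_text(json.dumps({"body": {}}))
        print(process_next(spool,
                           fetch_rpms=lambda event: ["example.rpm"],
                           trigger_zuul=lambda event, rpms: "example-ref",
                           wait_for_result=lambda ref: "SUCCESS"))
```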

Access the jobs results

Triggered jobs are processed by Zuul, so job results are available on the builds page of Zuul [34]. Here [35] is the result of the event [36].