Package Source Control

Revision as of 19:54, 15 December 2010

This page is a work in progress.

This page describes Fedora's package source control system. This covers:

  • repository setup
  • authentication and authorization system
  • repository contents
  • user interaction
  • interaction with our build system.

Repository Setup

Fedora package source control consists of a set of individual git repositories, one per Fedora package. These repositories all live on a central server within the Fedora infrastructure.

The server name is pkgs.fedoraproject.org, and each repository is named after its source RPM. For example, the repository for the yum package is pkgs.fedoraproject.org/yum.

Repository Filesystem Configuration

All the repositories live under the /srv/git/rpms/ path on the server. This path has the group sticky bit set and is group-owned by the packager group. Each repository is created using the --shared option, making each repository "group shared". This ensures that multiple maintainers can share commit access to the repositories, provided they are all in the packager group.

Branch Structure

Within each package repository there can be a set of "top level" or "default" in-repo branches. One such branch is created for each Fedora or EPEL release a package may be built for, so that changes for one release neither depend on nor conflict with changes for another. These branches are currently named fNN/master or elN/master. Branches that start with f, followed by the release number, are for Fedora; e.g. the branch for Fedora 14 is f14/master. Branches that start with el, followed by a number, are for EPEL; the branch for EPEL 6 is el6/master.
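The naming scheme above can be checked mechanically. The following is a minimal sketch, not part of any Fedora tool; the function name parse_release_branch and the regular expression are assumptions made for illustration.

```python
import re

# Top level branch names as described above: fNN/master or elN/master.
_BRANCH_RE = re.compile(r"(f|el)(\d+)/master")


def parse_release_branch(branch):
    """Return (distribution, release_number) for a top level branch name.

    Returns None for any name that does not follow the scheme.
    """
    match = _BRANCH_RE.fullmatch(branch)
    if match is None:
        return None
    prefix, number = match.groups()
    distribution = "Fedora" if prefix == "f" else "EPEL"
    return distribution, int(number)
```

For example, parse_release_branch("f14/master") yields ("Fedora", 14), while a topic branch such as f14/mytopicbranch yields None.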

Currently "/master" is appended to the top level branches. This provides a namespace for branches beyond the top level branches, e.g. f14/mytopicbranch. The design is such that all branches related to Fedora 14 start with f14/. Because git stores branch names as paths, a branch named "f14" cannot exist at the same time as "f14/mytopicbranch". Thus, "fNN/master" was made the default branch name. There is a proposal to change this.
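The restriction can be pictured by treating branch names as slash-separated paths: one name cannot be both a leaf and a parent of another name. The sketch below only models that rule; it is not how git itself implements the check, and the function name conflicts is hypothetical.

```python
def conflicts(existing_branches, new_branch):
    """Return True if new_branch cannot coexist with an existing branch.

    Branch names are treated as slash-separated paths. Two names conflict
    when one is a strict path prefix of the other, e.g. "f14" and
    "f14/mytopicbranch".
    """
    new_parts = new_branch.split("/")
    for branch in existing_branches:
        parts = branch.split("/")
        shorter = min(len(parts), len(new_parts))
        # Same depth means sibling or identical names, never a prefix clash.
        if len(parts) != len(new_parts) and parts[:shorter] == new_parts[:shorter]:
            return True
    return False
```

With this model, "f14/master" and "f14/mytopicbranch" coexist without trouble, which is the reason for the "/master" suffix.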

Rawhide builds come from the master branch. When a new Fedora release branches from Rawhide, a new top level branch in git is created from "master".

Commit Emails

When changes are pushed to the central git repos, information about those changes is emailed to two locations.

  • <package>-owner@fedoraproject.org
  • scm-commits@fedoraproject.org

The first is an email alias whose recipients are controlled by the Fedora Package Database. The second is an open mailing list hosted by the Fedora Project.

We currently use a post-receive hook from the GNOME project to perform the emailing.

Authentication and Authorization

The Fedora Package Source Control system uses a layered authentication and authorization system to control access to the git repositories.

Authentication

Currently there are two ways to obtain clones of the git repos.

  • authenticated ssh:// based clones
  • anonymous git:// based clones

Authenticated ssh-based clones require the client to have a valid Fedora account within our Account System and to belong to the packager group. SSH authentication is carried out via ssh keys, which are preloaded onto the git server for each potential user.

Anonymous access is through the git:// protocol and requires no authentication.
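As an illustration of the two access methods, the sketch below builds a clone URL for either case. The exact URL layout is an assumption based on the server and repository naming described earlier on this page, and the helper name clone_url and the user name used in the example are hypothetical.

```python
SERVER = "pkgs.fedoraproject.org"


def clone_url(package, username=None):
    """Build a clone URL for a package repository.

    Without a username the anonymous git:// form is returned; with one,
    the authenticated ssh:// form. The path layout is an assumption.
    """
    if username is None:
        return f"git://{SERVER}/{package}"
    return f"ssh://{username}@{SERVER}/{package}"
```

A contributor with the (hypothetical) account name jdoe would pass the result of clone_url("yum", "jdoe") to git clone, while an anonymous user would use clone_url("yum").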

Authorization

By default, git has no built-in system for authorization. It relies upon filesystem permissions controlled by the operating system where the repository lives. Typically this works well enough. However, within Fedora we have the concept of different access rights per Fedora/EPEL release, which manifests in the source control at the branch level. Because git does not segregate branches on the filesystem in a way that permissions could control, we have to use an add-on to git in order to achieve per-branch access rights.

Fedora Package Source Control makes use of gitolite to provide per-branch access rights. When a user attempts to push changes to a repository, the user must first authenticate using ssh. Then gitolite is invoked to verify that the user has any rights to the repository. If so, the push attempt is forwarded on to git itself, which starts its own process. Eventually an update hook within the git repo is invoked, which calls gitolite again to check whether the user has rights to commit to the particular branch path. Gitolite checks the user name against a pre-generated ACL (Access Control List) and either allows or denies the action.
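The two checks in this flow can be sketched as follows. This is a simplified model rather than gitolite's actual code or configuration format: the ACL is assumed to be a nested mapping of repository to branch to a set of user names, and all function names are hypothetical.

```python
def can_access_repo(acl, user, repo):
    """First check: does the user hold any rights on the repository?"""
    return any(user in users for users in acl.get(repo, {}).values())


def can_push_branch(acl, user, repo, branch):
    """Second check (update hook): may the user commit to this branch?"""
    return user in acl.get(repo, {}).get(branch, set())


def authorize_push(acl, user, repo, branch):
    """Run both checks in order, as a push would encounter them."""
    if not can_access_repo(acl, user, repo):
        return "denied: no rights on repository"
    if not can_push_branch(acl, user, repo, branch):
        return "denied: no rights on branch"
    return "allowed"
```

A user listed only for f14/master passes the first check for the whole repository but is still refused by the second check when pushing to el6/master.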

Using gitolite also lets us have some users with shell access to the git server and others without it, without changing the path where the repositories can be found.

ACL Generation

The ACLs used by gitolite are generated using data from the Fedora Package Database. This database allows package maintainers to define who has commit access to each Fedora/EPEL release for a given package. This data is used to construct an ACL for each package and is combined with global settings that give SCM admins and secondary arch maintainers commit access to every package/branch.

ACLs are regenerated every 10 minutes via a cron job on the git server.
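The combination of per-package data with global settings might look like the sketch below. The input shapes and the name build_acl are assumptions for illustration; the real generator reads from the Fedora Package Database and writes gitolite configuration, neither of which is modeled here.

```python
def build_acl(package_db, global_committers):
    """Combine per-package, per-branch committers with global committers.

    package_db: {package: {branch: iterable of user names}}
    global_committers: users granted commit access to every package/branch
    Returns {package: {branch: set of user names}}.
    """
    everyone = set(global_committers)
    return {
        package: {
            branch: set(users) | everyone
            for branch, users in branches.items()
        }
        for package, branches in package_db.items()
    }
```

Regenerating on a schedule then amounts to calling such a function with fresh database data and replacing the previous ACL.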

Repository Contents

Our repositories track changes that are important and relevant to the Fedora project, as opposed to upstream changes. As such, our repositories contain RPM .spec files, any patches we apply, and any supplementary source files we supply. Upstream content is stored in a lookaside cache.

Lookaside Cache

The Lookaside Cache is a storage system for upstream source archives. Most source control systems do not handle large binary files very well so we have designed a system to archive them and reference them from within our package source control.

Every package repository has a sources file. This file lists an md5sum and a source file name for each source archive the package uses. When client programs require the sources for modification or building, they fetch them from the lookaside cache using the sources file for reference. When a contributor needs to add a new source file or replace an existing one, client software securely uploads the file (using SSL certificates) and updates the sources file accordingly.
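A client's use of the sources file can be sketched as below. The line layout (checksum, whitespace, file name, one entry per line) is an assumption modeled on md5sum output, and the function names are hypothetical; fetching from and uploading to the cache are not shown.

```python
import hashlib


def parse_sources(text):
    """Parse sources-file text into {file name: md5 hex digest}.

    Assumes one "<md5sum> <file name>" entry per line.
    """
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        checksum, filename = line.split(None, 1)
        entries[filename] = checksum.lower()
    return entries


def matches_checksum(data, expected_md5):
    """Return True if the bytes in data hash to expected_md5."""
    return hashlib.md5(data).hexdigest() == expected_md5
```

After downloading an archive, a client would compare its contents against the recorded digest with matches_checksum before using it.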

User Interaction

Users can interact with the Fedora Package Source Control in multiple ways.

  • Using git clients directly
  • Using fedpkg
  • Browsing web frontends

Git Clients

One of the goals of the Package Source Control was to allow the use of standard, well-known clients to interact with it. While we do provide a helper tool for the source control actions, all interaction with the source control can be done using standard git clients. For that reason, little or no special setup is required when working with our git repositories: clone, commit, push. As stated above, git clients can be used either anonymously or authenticated with ssh keys.

Fedpkg

fedpkg, which is part of the fedora-packager project, provides some "targets" to interact with the source control system. These targets are very loose wrappers around git itself. The intent is that the options one would pass to git are the same options one would pass to fedpkg. The wrappers exist so that maintainers can use a single tool to interact with the source control system, the lookaside cache, and the build system. Should we ever change backend systems, maintainers can continue to use fedpkg.

Web Frontends

Currently we are using gitweb-caching to provide a web frontend to browse package repositories. Performance has been an issue, particularly with loading the index of all 10K+ repositories. The web frontend provides another way to anonymously interact with the repositories.

Buildsystem Interaction

Fedora's build system, Koji, interacts with our source control in a read-only manner. When a maintainer requests a build, the maintainer can request that the build come from a particular commit hash within the package repository. Koji configuration allows for a list of "allowed" SCM sources, as well as configuration of the commands to run in order to populate a directory with the spec file, all patches, and all source archives necessary to build a source RPM.

Currently builds are initiated using a commit hash as the reference for the source. Tagging the source is unnecessary.
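A build request therefore only needs a repository location and a commit hash. The sketch below composes such a reference; the URL layout (anonymous git:// URL with the hash after "?#") is an assumption for illustration, as is the function name build_source, and the configured list of allowed SCM sources is not modeled.

```python
import re

# A full git commit hash: 40 lowercase hexadecimal characters.
_FULL_HASH_RE = re.compile(r"[0-9a-f]{40}")


def build_source(package, commit):
    """Compose a source reference for a build from a package and commit hash.

    Raises ValueError if commit is not a full 40-character hash.
    """
    if _FULL_HASH_RE.fullmatch(commit) is None:
        raise ValueError(f"not a full commit hash: {commit!r}")
    return f"git://pkgs.fedoraproject.org/{package}?#{commit}"
```

Requiring the full hash keeps the reference unambiguous, which is what makes tags unnecessary for identifying the source of a build.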