From Fedora Project Wiki

Revision as of 18:53, 17 September 2009 by Mikeb (talk | contribs) (Why should we use it?)

Koji Maven Support

Koji Maven Support (hereafter referred to as Koji-Maven) is an attempt to bring the same security, auditability, and reproducibility to Java builds that Koji brings to rpm builds, without having to radically alter the build processes of upstream Java projects.

What is it?

Koji-Maven is a wrapper around the Maven build tool, in the same way Koji is a wrapper around the mock build tool. Koji-Maven manages the repository of jars from which Maven pulls build dependencies, and tracks the build environment, recording which jar files are downloaded into it. Build output is added to the repository for use by subsequent builds, and build logs are stored centrally. A "wrapper" rpm can be generated that contains exactly the same jars generated by the Maven build, and can then be used by subsequent rpm-based builds. Conversely, rpm-based builds that generate jars can add those jars to the Maven repository for use by subsequent Maven builds.

How does it work?

Package Management

Koji-Maven manages Java artifacts (.jar, .war, .ear, etc. files) natively, in much the same way that Koji manages .rpm files. Koji-Maven extends the concept of a Koji build so it can be associated with both rpms and Java artifacts. These builds may then be added to a tag, and repositories can be created from groups of tags. For each tag associated with a build target, Koji-Maven can be configured to create a Maven repository in addition to the yum repository. This repo will contain all Java artifacts from every build associated with that tag (or with another tag in the group). Each tag can be configured to include only artifacts from the "latest" build of a given package (consistent with the yum repositories), or to include all versions of a package associated with the tag. The second option is needed because a single Maven build may depend on more than one version of the same package. Once the repositories are created, the builders use them to create an environment suitable for building other packages.
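The difference between the "latest" and "all versions" tag settings can be illustrated with a small Python sketch. This is not Koji-Maven's implementation: the Build record, the repo_artifacts function, and the assumption that tagged builds arrive ordered oldest to newest are all hypothetical and exist only for illustration.

```python
from collections import namedtuple

# Hypothetical record: one build of a package and the artifacts it produced.
Build = namedtuple("Build", ["package", "build_id", "artifacts"])


def repo_artifacts(tagged_builds, latest_only=True):
    """Return the artifact names a Maven repo for a tag would contain.

    Assumption: tagged_builds is ordered from oldest to newest.
    """
    if not latest_only:
        # "All versions" mode: every artifact of every tagged build.
        return [a for b in tagged_builds for a in b.artifacts]
    latest = {}
    for build in tagged_builds:
        # A later build of the same package replaces the earlier one.
        latest[build.package] = build
    return [a for b in latest.values() for a in b.artifacts]


builds = [
    Build("commons-io", 1, ["commons-io-1.3.jar"]),
    Build("junit", 2, ["junit-4.5.jar"]),
    Build("commons-io", 3, ["commons-io-1.4.jar"]),
]
print(sorted(repo_artifacts(builds)))
print(sorted(repo_artifacts(builds, latest_only=False)))
```

With latest_only=False, both commons-io versions stay in the repo, which is what a Maven build depending on two versions of the same package would need.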


A Koji-Maven build environment is created the same way as a standard Koji build environment, by using mock to install a group of packages, including "maven2" (which will do the actual building), into a chroot. The source code and any patches are then downloaded from a source control system and the patches applied. A settings file is written into the chroot that points Maven at the Koji-managed Maven repo, and uses that as the override for all project-defined repos (using <mirrorOf>*</mirrorOf>). /usr/bin/mvn is then run against the source tree. The source tree must have a pom.xml file in its top-level directory for Maven to process it.
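As a rough illustration of the settings file described above, the following Python sketch generates a minimal Maven settings document with a single mirror that overrides every project-defined repository. The function name build_settings, the mirror id, and the example URL are assumptions made for this sketch; the file Koji-Maven actually writes may contain more than this.

```python
import xml.etree.ElementTree as ET


def build_settings(repo_url, mirror_id="koji-maven-repo"):
    """Return a minimal settings.xml string that mirrors all repos to repo_url.

    The mirror id and the overall layout are illustrative assumptions.
    """
    settings = ET.Element("settings")
    mirrors = ET.SubElement(settings, "mirrors")
    mirror = ET.SubElement(mirrors, "mirror")
    ET.SubElement(mirror, "id").text = mirror_id
    ET.SubElement(mirror, "url").text = repo_url
    # "*" tells Maven to use this mirror in place of every repository.
    ET.SubElement(mirror, "mirrorOf").text = "*"
    return ET.tostring(settings, encoding="unicode")


# Hypothetical repository URL, for illustration only.
print(build_settings("http://koji.example.com/maven2/"))
```

Because the mirror matches "*", any repository declared in a project's own POM is redirected to the Koji-managed repo, so the build cannot pull artifacts from elsewhere.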

The Maven build is performed in two steps. First, /usr/bin/mvn dependency:resolve-plugins is called. This downloads all necessary plugins from the Koji-managed Maven repo into the local Maven repo. Koji-Maven then scans the local repo to determine which artifacts have been downloaded, and records that information in the database. If all plugins are resolved successfully, /usr/bin/mvn deploy is called to perform the actual build. Once that is done, Koji-Maven scans the local repository again to identify any additional build dependencies that were downloaded, and records those as well. The build output is then uploaded to the hub, where it is processed, recorded in the database, and added to the Koji-managed Maven repo for later distribution or use by other builds.
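The two scans of the local repository can be pictured as taking a snapshot after each Maven invocation and comparing them. The Python sketch below is an assumption about how such a scan could work, not Koji-Maven's code: the function names and the list of file extensions treated as artifacts are hypothetical.

```python
import os

# Assumed set of extensions that count as artifacts for this sketch.
ARTIFACT_EXTENSIONS = (".jar", ".war", ".ear", ".pom")


def scan_local_repo(repo_dir):
    """Return the set of artifact paths (relative to repo_dir) found on disk."""
    found = set()
    for dirpath, _dirnames, filenames in os.walk(repo_dir):
        for name in filenames:
            if name.endswith(ARTIFACT_EXTENSIONS):
                full = os.path.join(dirpath, name)
                found.add(os.path.relpath(full, repo_dir))
    return found


def new_artifacts(before, after):
    """Return, sorted, the artifacts present in 'after' but not in 'before'."""
    return sorted(after - before)
```

In this sketch, the first snapshot would be taken after the plugin-resolution step and the second after the deploy step; new_artifacts(first, second) then yields the build dependencies downloaded during the build itself.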

The developer can also specify the location in source control of a specfile fragment that can be used to package the jars/wars/ears/etc. generated by the Maven build into an rpm. The specfile fragment can use the Cheetah templating language to perform substitutions, conditionalize parts of the specfile, or execute simple logic. The template is passed a defined set of data, including the name, version, and release of the build, and a list of all output generated by the Maven build. Once the template has been processed, it is placed into a directory with the output of the Maven build and rpmbuild is run to generate the "source rpm". This srpm will actually contain the binary jars generated by the Maven build. Once the srpm is generated, it is built in a pristine mock chroot like any other rpm built in Koji. Once this "wrapper rpm" build is complete, the rpms are associated with the existing Koji build and are available for testing and distribution, like any other rpms in Koji.
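To give a feel for the template substitution step, here is a Python sketch that renders a specfile fragment from the build's name, version, release, and output list. It deliberately uses the standard library's string.Template as a stand-in for Cheetah, so it supports only plain substitution and none of Cheetah's conditionals or logic. The fragment contents, the render_spec function, and the placeholder names are hypothetical, and the result is not a complete specfile.

```python
from string import Template

# Hypothetical, deliberately incomplete specfile fragment.
SPEC_FRAGMENT = Template("""\
Name: $name
Version: $version
Release: $release
Summary: Wrapper rpm for the Maven build of $name
BuildArch: noarch

%install
mkdir -p %{buildroot}%{_javadir}
$install_lines

%files
$file_lines
""")


def render_spec(name, version, release, artifacts):
    """Fill the fragment with build data and one line per output artifact."""
    install_lines = "\n".join(
        "install -m 644 " + a + " %{buildroot}%{_javadir}/" for a in artifacts
    )
    file_lines = "\n".join("%{_javadir}/" + a for a in artifacts)
    return SPEC_FRAGMENT.substitute(
        name=name,
        version=version,
        release=release,
        install_lines=install_lines,
        file_lines=file_lines,
    )


print(render_spec("example", "1.0", "1", ["example-1.0.jar"]))
```

The per-artifact lines are built in Python here only because string.Template has no loops; a Cheetah template could iterate over the output list itself.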

Why should we use it?

Maven has emerged as the de facto standard Java build tool, and more and more projects are using it upstream. However, the Maven build model relies on downloading pre-built binary jars from potentially untrusted repositories on the Internet, with no link back to the source code and no way to verify who built what, and when. This is incompatible with the objectives and policies of the Fedora Project, and incompatible with a robust, reliable, reproducible, and auditable build and release process. Koji-Maven was designed to address these issues by managing all Maven artifacts locally, and providing a link between the source code, the binary jars, and the build process and environment.

The alternative is to run Maven from within rpmbuild, and Fedora Java packagers have gone through significant effort to make this happen. However, this approach has a number of problems. A fundamental assumption of Maven is that every version of every package goes into a global repository ([1] is the largest, but there are a few others) and that Maven has access to this repository at build time. Maven builds are free to depend on any version of any package in that repository, and builds will often pull down multiple versions of the same package to satisfy plugin and build dependencies. This is at odds with rpmbuild, which assumes that everything is available locally, and discourages network access during the build. Fedora Java packaging works around this by patching Maven to support use of /usr/share/java as a Maven repo, patching the .pom files to support this local repo, and by maintaining a set of XML files that map the names and versions of dependencies (as reflected in the project's .pom file) to jars with different names and versions that are provided by installed rpms. The denormalization of dependency information, generation and maintenance of the patches, and divergence from the upstream build process are all barriers to getting new Java packages building in Fedora. Getting a package building requires significant initial effort, and the complexity and fragility of the build process increase the maintenance burden. Upgrading a single package to a new version can cause a large number of dependent packages to fail to rebuild, because version numbers are hard-coded in the dependency maps. See the jetty specfile for an example of a Fedora Java package being built with Maven.

Koji-Maven aims to alleviate many of these problems by decoupling the build process from the packaging format. Rather than forcing Maven to operate from within rpmbuild, something it was never designed for, we run Maven directly, and use rpm simply as a packaging format to bundle the output of the Maven build process. This removes the need to patch .pom files or generate dependency maps. The build process is exactly the same as the one the upstream developers use every day, so build problems will be identified and fixed quickly. The specfile fragments that are used to package the Maven output are very simple and can be reused between projects in many cases. This allows Java developers to focus on improving the code, rather than getting mired in complex packaging issues, and should encourage the inclusion of many more Java projects in the Fedora distribution.