GSOC 2014/Student Application Msimacek/Mock improvements

From FedoraProject

Proposal Description

Mock is a utility used for building RPM packages in Fedora and other RPM-based distributions. This project aims to make the current mock implementation more flexible and to speed up the packager's workflow.

Proposed improvements:

  • Better caching - improve performance and provide easier mechanisms for cloning and manipulating buildroots. Currently, mock keeps the buildroot cache in tarballs and extracts them when needed, which is slow and not very flexible. Using snapshots would provide a way to conveniently save, restore or clone the current state of the buildroot. This could speed up the workflow of packagers who have to work with many packages at once. The current options are: cleaning the buildroot for each package, which takes a long time because dependencies have to be installed again; manually copying the buildroot, which is also slow and inconvenient; or not cleaning the buildroot at all, which can let some packaging mistakes (such as missing BuildRequires) go unnoticed. My proposed modification would offer another workflow that is faster than cleaning before each rebuild but not as error-prone as not cleaning at all. For example, Java packages built with Maven could save a buildroot snapshot with maven-local and its long chain of dependencies and reuse it for multiple packages.
  • Provide a way to revert recent changes without having to clean the buildroot and install all dependencies again. This is similar to the previous idea: mock could automatically make a snapshot after installing a package's build dependencies, but before actually building it. A packager could then modify the buildroot by hand, turn those changes into patches, and roll back to the snapshot to test the patched build without reinstalling the dependencies.
  • Ability to work interactively - add realtime logging to the terminal. The packager should immediately see the output of the package management system and the build system on their terminal without having to look for the logs stored deep in the filesystem.
  • Allow using the mock shell during a running build - mock uses one global lock per buildroot, which prevents you from using the mock shell when a build is already in progress and vice versa. Finer-grained locking would be more appropriate, because the packager usually doesn't want to interfere with the running build, but just wants to query some information or copy files.
  • Allow using rpmbuild's short-circuit mechanism to restart a failed build without having to start over from the very beginning - rpmbuild provides a short-circuit option that starts the build at a given phase, skipping the previous phases. If the package built fine but the file verification failed, the packager should not be forced to repeat the whole process from the very beginning.
  • Use DNF instead of YUM to install packages - DNF is a modular package manager that is meant to replace YUM in the future. Since it is already usable, mock should be able to use it instead.
  • Provide more ways to interact with the package management system within the mock - using DNF straight from the mock shell could be more straightforward than interacting with the package manager via mock command line interface.
  • Handle user interrupts without corrupting the buildroot or leaving processes running - if you run a mock build, realize you made a mistake and stop it with <C-c>, there is a high probability that your buildroot will not be usable for the next build or that some processes will remain because they weren't terminated when you interrupted. Mock should also provide a way to pause the whole build, in case you need the computational power for something else or your battery is running low due to the increased resource usage.
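
One possible approach to the interrupt and pause handling described in the last item is to run the build in its own process group and signal the group as a whole. The following is only a minimal sketch under that assumption, not mock's actual code: the function names (start_build, signal_group, stop_build, run_build) are hypothetical and the command to run is left to the caller. It also copies the child's output to both a log file and the terminal, which is one simple way to get the realtime logging mentioned above.

```python
import contextlib
import os
import signal
import subprocess
import sys


def start_build(cmd):
    """Start cmd as the leader of a new session and process group."""
    return subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
        start_new_session=True,  # child calls setsid(), so its pgid equals its pid
    )


def signal_group(proc, sig):
    """Send sig to every process in the build's process group."""
    os.killpg(proc.pid, sig)


def stop_build(proc, grace=5.0):
    """Send SIGTERM to the group, escalating to SIGKILL after grace seconds.

    Only the group leader is waited for; the other members receive the
    same signals but are not reaped here.
    """
    with contextlib.suppress(ProcessLookupError):
        signal_group(proc, signal.SIGTERM)
    try:
        proc.wait(timeout=grace)
    except subprocess.TimeoutExpired:
        with contextlib.suppress(ProcessLookupError):
            signal_group(proc, signal.SIGKILL)
        proc.wait()
    return proc.returncode


def run_build(cmd, log_path, echo=True):
    """Run cmd, copying its output to log_path and, optionally, the terminal."""
    proc = start_build(cmd)
    try:
        with open(log_path, "w") as log:
            for line in proc.stdout:
                log.write(line)
                if echo:
                    sys.stdout.write(line)
                    sys.stdout.flush()
        return proc.wait()
    except KeyboardInterrupt:
        # Ctrl+C reached only this process; forward termination to the group.
        stop_build(proc)
        raise
```

Because the child runs in a separate session, a <C-c> typed in the terminal reaches only the parent, which then decides how to shut the group down. Under the same assumption, pausing and resuming could be done with signal_group(proc, signal.SIGSTOP) and signal_group(proc, signal.SIGCONT). A real implementation would also have to clean up mounts and other buildroot state, which this sketch does not attempt.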


My relevant experience:

  • I'm a Fedora package maintainer who uses the mock tool a lot. I maintain or co-maintain Java and Java-related packages such as Jetty, Lucene and several packages in the Maven stack.
  • I contribute to javapackages-tools - a set of utilities written in Python that are used to facilitate Java packaging in Fedora and also other RPM-based distributions. It is mainly used to manipulate Maven POM files and XMvn configuration.

Implementation:

  • Use either LVM or BTRFS for caching and taking snapshots of buildroots. I would probably use LVM since it's more mature and more widely available, but I haven't decided yet: there are many factors to consider, and performance will probably be the most important one. I'll do some benchmarking to see which performs better under these specific circumstances.
  • Implement finer-grained locking mechanisms to provide interactivity - operations such as buildroot creation or making snapshots would need an exclusive lock to operate, but the shell could be given read-only access even if a build is already running.
  • Explore DNF's capabilities regarding installroot support - DNF is currently able to install into a specific directory, but the implementation is not yet bug-free. I'll probably need to contribute modifications to the DNF project as well.
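
The locking scheme from the second item could be sketched with advisory flock locks: operations that change the whole buildroot take an exclusive lock, everything else takes a shared one. This is an illustration only, assuming a per-buildroot lock file; the helper name buildroot_lock and the lock file path are hypothetical and not part of mock.

```python
import contextlib
import fcntl
import os


@contextlib.contextmanager
def buildroot_lock(lock_path, exclusive, blocking=True):
    """Hold an advisory flock on lock_path for the duration of the with-block.

    exclusive=True is meant for operations such as buildroot creation or
    snapshotting; exclusive=False for builds and shells that may coexist.
    """
    flags = fcntl.LOCK_EX if exclusive else fcntl.LOCK_SH
    if not blocking:
        flags |= fcntl.LOCK_NB  # fail immediately instead of waiting
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        fcntl.flock(fd, flags)  # raises BlockingIOError if LOCK_NB and contended
        yield fd
    finally:
        os.close(fd)  # closing the descriptor releases the lock
```

With this helper, a shell could run inside "with buildroot_lock(path, exclusive=False):" while a build holds its own shared lock, and a snapshot operation using exclusive=True would wait (or fail, with blocking=False) until both have finished. Note that flock is advisory, so it only protects against other processes that take the same lock.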

What will be the benefits to Fedora:

The improved mock should reduce the time that packagers spend waiting for their builds to complete and give them more time to focus on things that matter. That should result in more high-quality packages in Fedora, since more time could be spent on actually testing them.

Timeline:

   Before official coding time:
       Get in touch with current developers of mock
       Get familiar with mock's integration with the Koji build system
       Do a performance and feature comparison of LVM vs. BTRFS snapshots
       Get familiar with mock's source code
   19.5 - 31.5:
       Add realtime logging to mock - redirect the output of DNF and rpmbuild to the mock output (and provide a way to suppress it)
       Implement finer-grained locking - operations that do not modify the whole buildroot will get a shared lock
   1.6 - 7.6
       Switch to using DNF instead of yum for package installation
       Try to find and fix installroot related problems in DNF
       Provide a way to interact with the package management system from within the mock shell
   8.6 - 14.6
       Add support for rpmbuild's short circuit mode
       Work on the main part of the project - implementing the LVM or BTRFS based snapshots
   14.6 - 23.6 (before mid-term evaluation)
       Implement user interface for managing the snapshots
       Adapt the mock configuration to the new model
       Enhance the process management to support nicer handling of user interrupts or pausing
   23.6 - 1.7 (after mid-term)
       Document new features
       Try to get feedback from other packagers and implement their suggestions
   2.7 - 26.7
       Improve overall user experience, exception and corner case handling
       Performance optimizations
   20.7 - 28.7
       More bugfixes
       Thorough testing
   29.7 - 9.8
       Try to get my changes merged into upstream mock

Have you communicated with a potential mentor? If so, who?

Yes, with Mikolaj Izdebski and Stanislav Ochotnicky