Env and Stacks/Product Requirements Document

From FedoraProject

Revision as of 19:55, 28 March 2014 by Mmaslano (Talk | contribs)

Fedora Environment and Stacks Product Requirements Document.

Document Purpose and Overview

Vision Statement

Fedora is the preferred platform for software development and deployment in any language or application stack.

Mission Statement

The Fedora Environment and Stacks Working Group will research and develop new or improved methods of developing, testing, packaging and deploying software for the Fedora community.

What this document describes

This is the Product Requirements Document (PRD) of the Fedora Environment and Stacks Working Group. Unlike the working groups in charge of developing the three products, namely Workstation, Server and Cloud, the Environment and Stacks Working Group is not directly developing any Fedora product. Hence, this PRD should not be seen as a real PRD, but rather a document that:

  • Provides a list of tasks and goals which fulfil the mission of packaging, testing and delivering the various existing and new software stacks in Fedora. In addition, the listed tasks and goals work towards making the development on Fedora and of Fedora itself easier and friendlier.
  • Provides a deeper understanding of the types of users who currently face various limitations when trying to use, package and develop the various software stacks on Fedora. It also lists the types of users who would benefit from an easier-to-use development environment that caters to their diverse needs.

This document does not dictate implementation details. The working group will drive the continued prototyping of the listed goals and tasks and their graduation into officially released and supported parts of the Fedora Project. A schedule for the various goals and tasks will be provided in a separate document.

Definitions and Acronyms

  • API: Application Programming Interface
  • ABI: Application Binary Interface
  • BZ: Bugzilla
  • CI: Continuous Integration
  • COPR: Cool Other Package Repositories
  • CPAN: Comprehensive Perl Archive Network
  • CS: Computer Science
  • Docker: Lightweight containerization system
  • EPEL: Extra Packages for Enterprise Linux
  • FAS: Fedora Account System
  • Koji: the Fedora RPM build system
  • NTH: Nice-to-have
  • OS: Operating System
  • PRD: Product Requirements Document
  • PyPI: Python Package Index
  • QA: Quality Assurance
  • RVM: Ruby Version Manager
  • SCLs: Software Collections
  • SCM: Source Code Management
  • WG: Working Group

Tasks and Goals

Testing/additional repositories

  • Many features from this WG will need additional repositories. It needs to be determined where they will be hosted (maybe on COPR, maybe as Koji tags). Also, a policy for enabling such repositories must be defined.
  • A repository with packages used only for building might help some developers to build their package without having to maintain the build dependencies.
  • Repositories with a looser policy than the current Fedora Packaging Guidelines. For example, a looser policy could allow bundling libraries or overriding things in Base/Core. Note that these repositories would still need to follow the Legal/Licensing Policy, since the goal is for them to eventually be officially distributed by Fedora.

Change proposal: Playground repository

Automation

  • Additional repositories with automatically updated SPEC files and RPM packages for packages that are already included in Fedora.
    • Maintainers will benefit from having less work with updating the packages, since they would just need to review the automatically updated package and import it into rawhide.
    • Users will benefit from having the latest versions of software available. Clearly, the users will need to be informed that the provided packages are a preview available AS IS.
    • In a way, such repositories would follow the upstream release schedule:
      • As soon as the upstream makes a new release available on e.g. CPAN or PyPI, it would be available in the repository.
      • We would need to handle major updates with an API/ABI bump. Checking for ABI/API breakage could be done automatically with api-checker.
  • Automated packaging
    • The general idea is to enable easier/quicker packaging of upstream software by generating SPEC files for the packager automatically.
    • Various tools for automatic packaging already exist (e.g. the *2spec and *2rpm tools), so we will use them as a starting point.
    • The critical part of the packaging and review process that will remain manual is the licensing checks.
    • By leveraging COPR, we could have repositories that are generated and updated automatically from upstream sources.
    • jzeleny is working on other automatic packaging tools.
  • Automated package review tools - Change proposal: Automated Packages Review Tools
    • Development of a new service where package reviews would happen from start to finish [1]:
      • No Bugzilla
      • Automatic FAS integration (one could automatically mark reviews that need sponsors)
      • Guided (self-)review
      • Inline comments (like gerrit)
      • FedoraReview integration
      • pingou plans to work on such a tool [2]
  • Quality Assurance (QA) automation
    • Join Fedora QA and help them with the Taskotron project concerning the future of QA automation in Fedora.
    • Develop new tools for Fedora QA that will help achieve high standards of software in Fedora.
      • An example of such a tool is rpmgrill, a set of analysis tests that run against a particular RPM build. Its main difference from rpmlint is that rpmgrill handles *builds*, that is, the entire set of RPMs built from one source RPM file, instead of single RPM files.
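
The "major updates with an API/ABI bump" item above implies that an automatically updated repository needs some way to decide which upstream releases can be imported as-is and which need an extra API/ABI check first. The following is only a minimal sketch of such a decision, assuming that upstream versions are dotted numbers and that a change in the first component signals a possible API/ABI break. Both assumptions and the function names are hypothetical; they are not Fedora policy, and real RPM version comparison is more involved.

```python
import re


def parse_version(version):
    """Split a dotted version such as '2.10.1' into a tuple of integers."""
    parts = re.findall(r"\d+", version)
    if not parts:
        raise ValueError("no numeric components in %r" % version)
    return tuple(int(p) for p in parts)


def classify_update(packaged, upstream):
    """Classify an upstream release relative to the packaged version.

    Returns "none" if upstream is not newer, "major" if the first version
    component changed (assumed here to mean a possible API/ABI break),
    and "minor" otherwise.
    """
    old, new = parse_version(packaged), parse_version(upstream)
    if new <= old:
        return "none"
    if new[0] != old[0]:
        return "major"
    return "minor"


def needs_abi_check(packaged, upstream):
    """Only major updates are routed to an API/ABI check in this sketch."""
    return classify_update(packaged, upstream) == "major"
```

In this sketch, `classify_update("1.4", "2.0")` is routed to the API/ABI check, while an update from `2.9.3` to `2.10.0` is treated as a candidate for automatic import and maintainer review.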

Integration of Fedora services/tools

  • We want to lessen/automate the work of packagers by integrating the Fedora services/tools such as fedpkg, git, Bodhi, Koji, Bugzilla.
    • An example of a git-Bugzilla integration: a packager's git commit that references a particular bug number in Bugzilla triggers the automatic generation of a comment on that bug, which includes the commit link and commit message, and changes the state of the referenced bug to MODIFIED.
    • An example of a Koji-git integration: a successful build of a package in Koji triggers the automatic creation of a tag in the package's git repository, so that the packager can easily access the content of that particular build using pure git.
  • This will be done in co-operation with the Fedora Infrastructure Team.
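
The git-Bugzilla example above can be sketched as a small, network-free function that turns a commit into a list of pending Bugzilla updates. This is an illustration only: the accepted reference styles (`rhbz#N`, `bz N`, `bug N`), the function names and the shape of the returned dictionaries are assumptions, and actually submitting the updates to Bugzilla is left out.

```python
import re

# Assumed reference styles: "rhbz#123", "bz 123", "bug #123" (case-insensitive).
BUG_REF = re.compile(r"\b(?:rhbz|bz|bug)\s*#?\s*(\d+)", re.IGNORECASE)


def referenced_bugs(commit_message):
    """Return the bug numbers referenced in a commit message, without duplicates."""
    seen = []
    for match in BUG_REF.finditer(commit_message):
        bug_id = int(match.group(1))
        if bug_id not in seen:
            seen.append(bug_id)
    return seen


def bugzilla_updates(commit_url, commit_message):
    """Build one pending update per referenced bug.

    Each update carries a comment with the commit link and message and
    the new bug state, MODIFIED, as described in the example above.
    """
    comment = "Referenced by commit %s\n\n%s" % (commit_url, commit_message.strip())
    return [
        {"bug_id": bug_id, "comment": comment, "status": "MODIFIED"}
        for bug_id in referenced_bugs(commit_message)
    ]
```

A hook on the git server could call `bugzilla_updates()` for every pushed commit and hand the result to whatever Bugzilla client the Fedora Infrastructure Team prefers.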

Build systems

  • We will work on improving the developer experience with the build systems in Fedora (currently, COPR and Koji). Furthermore, we will work on new features in the build systems that will enable building and delivering the various existing and new software stacks in Fedora.
    • Cooperate with the COPR upstream on adding support for building SCLs. This will allow us to provide additional repositories with multiple parallel installable versions of the software stacks.
    • Cooperate with the Koji upstream on implementing new features in the build system hierarchy that will give more information to developers and make their work easier, like:
      • getting more files from broken builds (e.g. core dumps)
      • providing information about workers' usage status and their resource utilization
      • supporting easier environment setup for implementing new complicated features (currently, creating a new target for that purpose involves quite a lot of manual work; this could be solved by the ability to create ad-hoc targets that don't need rel-eng interaction at all)

Software Collections (SCLs)

Software Collections (SCLs) are a technology that enables one to build and concurrently install multiple versions of the same software components on a given system. They also have no impact on the system versions of the packages installed by any of the conventional RPM package management utilities. As such, SCLs will be useful for providing various (incompatible) versions of software stacks in Fedora. The goals which our WG has with SCLs are the following:

  • Work to make SCLs usable by Fedora Products. Change proposal: SCL
    • Example: Fedora Cloud is a product that often depends on a specific version of Ruby, so it will benefit from having a means of providing the required version of Ruby via an SCL.
    • Get SCLs into the Fedora Mainstream Repository
      • Status: Packaging Guidelines are being reviewed and revised by the Fedora Packaging Committee. Most probably each SCL will have a review step similar to a system-wide Fedora Change.
    • Create a second repository for SCLs that do not follow the Fedora Packaging Guidelines
      • Cooperate with the SCL Upstream on enabling Fedora users to install and use the SCLs provided by the SCL Upstream. This will extend the number and diversity of software stacks available to Fedora users.
        • Status: SCL Upstream is almost finished and ready for launch. We need to work with FESCo and SCL Upstream, and write code to make these repositories easy to enable.
      • Work with FESCo to see if we can make some subset of those repositories directly installable by the Fedora Products.
      • If not, then figure out if we need SCLs that are not Fedora compliant that are directly installable by the Products.
  • scl-utils is a set of utilities that allows one to package, build and run SCLs. We will cooperate with the scl-utils upstream to incorporate the requirements that will arise when we make SCLs available in Fedora.
    • Status: scl-utils is in maintenance mode, scl-utils2 might contain bigger changes.
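
To make the idea of concurrently installed versions more concrete, the sketch below builds the argument list for running a command inside one or more collections with the `scl enable` command from scl-utils. The helper name and the collection name used afterwards (`ruby193`) are illustrative assumptions; the sketch only constructs the command and does not run it.

```python
import shlex


def scl_enable_argv(collections, command):
    """Build the argv for `scl enable <collection>... '<command>'`.

    `collections` is a list of collection names and `command` is the
    command to run as a list of arguments. `scl enable` takes the command
    as a single string, so the arguments are joined with shell quoting.
    """
    if not collections:
        raise ValueError("at least one collection is required")
    if not command:
        raise ValueError("a command is required")
    return ["scl", "enable", *collections, shlex.join(command)]
```

For instance, `scl_enable_argv(["ruby193"], ["ruby", "--version"])` yields an argument list that could be passed to `subprocess.run()` on a system where scl-utils and the collection are installed, while the system-wide Ruby stays untouched.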

Continuous integration (CI)

Continuous integration (CI) allows early detection of bugs or potential issues, thus it is beneficial for improving and maintaining the high quality of software available in Fedora. A number of tools for CI are available such as Travis CI or Jenkins.

  • For CI of RPMs built by Koji/COPR and Fedora updates submitted through Bodhi, we will cooperate with Fedora QA on the AutoQA tool. Future plans for AutoQA are described in the Taskotron project.

Documentation, guidelines

  • Wiki pages.
    • The wiki currently lacks a usable structure and contains a lot of abandoned and outdated content, which makes it not very helpful.
    • Look into creating a better structure and improving the related wiki page categorization.
    • Keep the Packaging: namespace; the official Packaging Guidelines are and will be maintained by the Packaging Committee members.
    • Archive the outdated/duplicated/unneeded wiki pages (outside the Packaging: namespace).
    • Motivate people to create new content (with badges, swag, ...).
    • Improve the search on Fedora wiki.
    • Add more references to formal documentation (see below).
  • Formal documentation.
    • Some people prefer to use a single document to learn about concepts and tasks rather than browsing through a number of wiki pages. Wiki content may not feel authoritative and may not inspire the same trust as formal documentation.
    • pkovar will be working on a new Packager's Guide to help people get started with RPM packaging for Fedora and EPEL. The goal is to share as much content as possible with downstream RHEL/EPEL documentation, using the same format and toolchain (DocBook, Publican).
    • The guide will be mostly based on the Fedora wiki pages/HOWTOs. It will reference additional resources on the wiki (esp. in the Packaging: namespace) and/or other content where appropriate.
    • An outdated draft is here: http://docs.fedoraproject.org/en-US/Fedora_Draft_Documentation/0.1/html/Packagers_Guide/index.html
  • SCL

DevAssistant

The aim of the DevAssistant project is to help developers with repetitive everyday tasks, such as kickstarting/scaffolding new projects, installing dependencies, working with SCM systems, setting up development environments and so on.

  • State of the art
    • DevAssistant itself is just a core (written in pure Python) that executes plugins, the so-called assistants (written in a YAML DSL). These assistants specify how to create new projects, set up environments, etc. The upstream development currently takes place in three areas:
      • Improving the core.
      • Providing more assistants that end-users can actually use.
      • Providing an easy mechanism of distributing assistants independently of the core. Upstream wants to create a "central assistant index" (the name has not been chosen yet), a web repository of assistants, where users will easily contribute their own assistants and share them with others. Upstream is also thinking of dropping the set of assistants included in DevAssistant's distribution and rather leverage the central index.
  • What we will do:
    • Work with upstream on the "central assistant index". This includes working on the upstream assistant packaging format. Our goal is to make sure it will be easy to repackage the upstream assistants into Fedora RPMs.
    • When the central index is ready, we will encourage Fedora packagers (and upstreams) that maintain components "interesting" to developers (e.g. frameworks, Docker, SCM tools, tools like virtualenv/RVM/...) to contribute to DevAssistant by writing their own assistants. Since the central index will be distro-agnostic, we should also be able to gain help from non-Fedora upstreams.
    • Then we will work on improving DevAssistant's documentation and advertise its usage by developers using Fedora.

User stories

This section lists a couple of concrete user stories that highlight the problems users face with the current Fedora and what they would like to see improved in the future.

Persona #1: Alan the Big Data analyst

Alan is a Big Data analyst and a member of the Fedora Big Data SIG. He uses a number of applications written in different languages to perform data analysis. He wants to focus his time and effort on the application and the actual data analysis, and to minimize the time and hassle spent obtaining, compiling, packaging and maintaining the applications that he needs. The form of packaging (rpm, deb, npm, other) isn't as important to him.

Problem statement: Currently, there is a lot of hassle and pain in dealing with language stacks in Fedora other than the primary ones (i.e. C/C++, Python, Perl). Although Alan wants to focus his time on the application and the actual data analysis, he too often finds himself spending time managing the language-specific toolchain needed by the application.

Cause #1: The upstream application authors usually assume that any developer will just use the language-specific packaging ecosystem, rather than also taking into account the downstream distribution-based packaging and dependency management systems. This is a problem since there are many differences between language-specific packaging ecosystems and the Fedora way of packaging software.

Cause #2: Many upstream applications use very brittle versioning. In other words, each application expects the user will be able to have its particular versions of the runtime, compiler and libraries available.

Example #1: Applications written in Scala require different versions of Scala, some need v2.9 and some need v2.10. Although Fedora could provide both versions in the same release (via SCLs or by manually maintaining two Scala packages), they would both need to be resolvable via Apache Ivy, rather than just providing both binaries, "scalac29" and "scalac210".

Example #2: The version of Jetty in Fedora does not work with Java 6, and the Apache Hadoop upstream isn't ready to abandon Java 6 yet. So the Big Data SIG ends up maintaining a patch to Apache Hadoop for Jetty 9 that will live purely in Fedora for quite a while to come. Although this could be worked around by using compat packages for Jetty for a couple of Fedora releases, that would just hide the fact that there is a mismatch in expectations between Fedora and upstream application authors.

Example #3: Node.js has its package/dep management tool available in Fedora, but very little of the language ecosystem / base libraries are packaged and available in Fedora.

Possible solution: Fedora would need a way to provide language-specific ecosystems in a way that aligns with how these languages themselves are used and how applications written in them are developed and distributed.

Persona #2: Student or Corporate developer needing multiple development environments

Billy is a CS student with multiple assignments that need different development environments/stacks to build. Bob is a corporate enterprise developer who develops new applications using the latest technology stacks while also maintaining older software releases that use older libraries and toolchains.

Problem statement: While they can set up different OSes or environments in separate virtual machines, doing so is a lot of work and wastes a lot of disk space and time. Being able to install multiple environments or versions of stacks in the same Fedora instance and switch between them on a per-project basis would be much easier and more efficient.

About this Document

Authors

Contributors to this document include:

Reviewers & Contributors

The following people have contributed to the development of this document, through feedback on IRC, mailing lists, and other points of contact.