From Fedora Project Wiki

Data Layers

For the data, the layering is slightly different. The data layers matter from the user's point of view, and the model may help when designing UIs.

  • Layer 0: Source packages
  • Layer 1: Packages
  • Layer 2: Applications
  • Layer 3: Groups
  • Layer 4: Repositories and Products
  • Layer 5: Repository Registry (Does not exist yet)
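The layer list above can be written down as a small ordered type, which may be handy when a UI needs to refer to layers by number. This is only a sketch: the names `DataLayer` and `layers_above` are made up for illustration and do not come from any Fedora tool.

```python
from enum import IntEnum


class DataLayer(IntEnum):
    """The data layers from the list above, numbered as in the text."""
    SOURCE_PACKAGES = 0
    PACKAGES = 1
    APPLICATIONS = 2
    GROUPS = 3
    REPOSITORIES_AND_PRODUCTS = 4
    REPOSITORY_REGISTRY = 5  # does not exist yet


def layers_above(layer):
    """Return all layers that abstract over the given one (hypothetical helper)."""
    return [candidate for candidate in DataLayer if candidate > layer]
```

For example, `layers_above(DataLayer.GROUPS)` yields the repository layer and the (not yet existing) registry layer.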

Layer 2: Applications

Right now, "applications" are basically just a subset of packages that have been selected as "interesting for the user to install". Typically this excludes libraries and development packages (although developers do find those interesting to install) but includes services that might be selected by hand, such as web servers. In Fedora, the optional and default packages in comps groups serve as applications in this sense.

The AppStream project aims to make this layer richer by adding information to the selected packages.
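As a rough illustration of the comps-based notion of "application" described in this section, the sketch below collects the default and optional packages from a comps file. The sample XML is invented for the example (the group id and the package `example-core-lib` are hypothetical), and the function name `applications_from_comps` is an assumption, not part of any existing tool.

```python
import xml.etree.ElementTree as ET

# Invented sample data in the shape of a comps file; not a real Fedora group.
SAMPLE_COMPS = """<comps>
  <group>
    <id>example-servers</id>
    <name>Example Servers</name>
    <packagelist>
      <packagereq type="mandatory">example-core-lib</packagereq>
      <packagereq type="default">httpd</packagereq>
      <packagereq type="optional">vsftpd</packagereq>
    </packagelist>
  </group>
</comps>"""


def applications_from_comps(xml_text):
    """Return the sorted names of packages listed as 'default' or 'optional'."""
    root = ET.fromstring(xml_text)
    applications = set()
    for req in root.iter("packagereq"):
        # Only default and optional entries count as applications here.
        if req.get("type") in ("default", "optional"):
            applications.add((req.text or "").strip())
    return sorted(applications)
```

With the sample data, `applications_from_comps(SAMPLE_COMPS)` returns `['httpd', 'vsftpd']`; the mandatory entry is left out.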

Layer 3: Groups

There is some confusion in the Fedora/Red Hat world about what a group is, because the same term and data structure are used for two things:

  1. Categories to put applications in (putting together similar applications)
  2. Creating an entity that can be installed at once, providing larger-scale functionality (putting together one of each kind)

For the scope of this layering we will probably still need to divide these two into separate layers.
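One way to picture the split between the two meanings is to give each its own data structure. The sketch below is purely illustrative: the class names `Category` and `InstallSet` and the example applications are assumptions, not an existing Fedora data model.

```python
from dataclasses import dataclass, field


@dataclass
class Category:
    """Meaning 1: a bucket of similar applications, used for browsing."""
    name: str
    applications: list = field(default_factory=list)


@dataclass
class InstallSet:
    """Meaning 2: one application of each kind, installed as a single unit."""
    name: str
    members: dict = field(default_factory=dict)  # kind -> application name

    def install_list(self):
        """Return the applications that installing this set would pull in."""
        return sorted(self.members.values())


# Hypothetical example data.
browsers = Category("Web Browsers", ["epiphany", "firefox"])
desktop = InstallSet("Example Desktop", {"browser": "firefox", "mail": "evolution"})
```

The category lists alternatives side by side, while the install set picks exactly one application per kind.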

Layer 4: Repositories

A repository is a set of packages and Layer 3 information provided by a single vendor. Typically the repository also contains package metadata in a condensed form to make dependency solving easier.

Repositories are the right level at which to decide whether you trust a piece of software. Although individual packages are typically signed, the signing keys are per vendor or per repository. Looking at individual packages to establish trust is often not possible because of the large number of packages and the uncertainty about which packages are required now or will be required in the future. For this trust to mean anything, adding a repository cannot be an automatic process.
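The idea that trust is decided once per repository, and only with explicit user consent, can be sketched as follows. Everything here is hypothetical: `Repository`, `TrustStore`, and the key IDs are illustrative names and values, and no real package manager API is being described.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Repository:
    repo_id: str
    vendor: str
    key_id: str  # ID of the key the vendor signs packages with


class TrustStore:
    """Records which repositories the user has explicitly chosen to trust."""

    def __init__(self):
        self._trusted_keys = {}  # repo_id -> key_id

    def add_repository(self, repo, user_confirmed):
        # Adding a repository is never automatic: refuse without user consent.
        if not user_confirmed:
            raise PermissionError(f"user did not confirm trust in {repo.vendor}")
        self._trusted_keys[repo.repo_id] = repo.key_id

    def is_trusted(self, repo_id, package_key_id):
        """A package is trusted if it is signed with its repository's trusted key."""
        return self._trusted_keys.get(repo_id) == package_key_id
```

A package is then checked against its repository's key rather than being evaluated on its own, which keeps the number of trust decisions small.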

The community distributions tend to see the repository layer as a technical necessity rather than an important structural element. They aim to make the need for several repositories go away. In the "perfect Fedora world" there would be no need for non-free or other software that cannot be included in Fedora's repository. The fact that there are several repositories within Fedora has mainly technical reasons (fedora vs. updates), or the repositories are not meant to be shown to the user (testing). For legal reasons, Fedora also does not talk about the few add-on repositories that may have a reason to exist. This reinforces the "There can be only one" view.

In the enterprise world (and even with closed-source freeware for the community distributions, e.g. the Adobe repository), a repository is closely related to a product. While packages are often split up for purely technical reasons, repositories give vendors the possibility to put everything needed together and also offer an easy way to provide updates. As repositories are implemented as a web-accessible directory, they are also the right level at which to establish access control, if this is needed.

Layer 5: Repository Registry

As far as I know, no such thing exists right now. In a world of many repositories this might be necessary, but we are not in that world. Such a registry would also have to solve the question of trust, either by creating some meta-distribution that is trusted and states the trustworthiness of the individual repositories, or by pointing users to the vendors so they can decide for themselves whether to trust them.

Orders of Magnitude

The data layers are true abstractions of the underlying ones, which means that the number of entities gets smaller toward the top (with the exception of Layer 0):

Layer             Entities in Distributions
                  Community    Enterprise    Installed
0 Source          10k          2k            -
1 Packages        20-30k       3k            1-2k
2 Applications    2k           1k (?)        100?
3 Groups          30-100       30-100        10
4 Repositories    ~5           (?)           1-3