From Fedora Project Wiki

Revision as of 12:09, 12 May 2011 by Ffesti (talk | contribs) (Rewrote large part of the document to make it more readable. Also moved it more into yum's terms)

The Next Layer

Motivation

Handling the huge number of packages has become more and more difficult over the years. While the number of installed packages has not grown as fast as the number of available packages, it is still close to impossible to understand what software is installed on a system by looking at the package list. Ideally the system should not look more complicated than the decisions the user made to set it up. If the user only selected half a dozen groups and installed 20 additional packages, the description of the system should not have more than those 26 items.

There are several issues that currently prevent a better representation of the installed packages:

Comps groups are only used for installing. Although yum does record somewhere that a group got installed, this information is not used afterwards. Changes to the group have no consequences, and groups and packages cannot be reliably linked.

While anaconda (and for a while also yum) supports installing a group without some of its packages, this concept is gone as soon as the transaction is finished. This makes bringing groups and the installed packages together even more difficult. As many administrators only use comps groups as a starting point for further refinement, this is a serious problem.

To solve these issues groups need to become first class citizens in the yum world. They need to become a layer on top of packages. In the repositories they already are such a layer, but they still have to gain the same status on the installed system.

This could also solve another problem: telling packages that were only installed as dependencies apart from packages wanted and installed by the user. yum already stores this information on a per-package basis. Nevertheless this information is of very little use right now. One reason is the missing understanding of groups, which marks basically half of the packages as "selected by the user" even if the user made just 20 decisions when installing the machine.

The only way to solve all this is to store the decisions the user made in an editable form. This form then needs to be translated into an actual list of packages to be installed. Groups could be this form.

Installed Groups

Comps groups are more than just a set of packages. There is no reason to believe that installed groups should be any simpler than comps groups. An installed comps group will contain a copy of the original comps group plus the decisions made while installing the group: which sets of packages (mandatory, default, optional) are to be installed and which (optional) packages are selected in addition or excluded.
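The structure of such an installed group could be sketched as follows. This is only an illustration of the idea above; the class and field names (`CompsGroup`, `InstalledGroup`, `levels`, `selected`, `excluded`) are made up for this sketch and are not part of yum.

```python
from dataclasses import dataclass, field


@dataclass
class CompsGroup:
    """Copy of a comps group definition (field names are illustrative)."""
    name: str
    mandatory: frozenset = frozenset()
    default: frozenset = frozenset()
    optional: frozenset = frozenset()


@dataclass
class InstalledGroup:
    """A comps group copy plus the decisions made while installing it."""
    comps: CompsGroup
    # Which comps levels were chosen at install time (assumed default).
    levels: frozenset = frozenset({"mandatory", "default"})
    # Packages selected in addition to the chosen levels.
    selected: set = field(default_factory=set)
    # Packages excluded although their level would install them.
    excluded: set = field(default_factory=set)

    def wanted_packages(self):
        """Translate the stored decisions into a plain set of package names."""
        wanted = set()
        for level in self.levels:
            wanted |= getattr(self.comps, level)
        return (wanted | self.selected) - self.excluded
```

For example, a group with `httpd` as mandatory, `mod_ssl` as default and `php` as optional, installed with `selected={"php"}` and `excluded={"mod_ssl"}`, resolves to `httpd` and `php`.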

Whenever an installed group is changed or removed, or the associated comps group changes, the installed packages need to be adjusted accordingly. Packages that are no longer needed by the group, and by nothing else, should be silently removed. This frees the administrator from cleaning up no longer needed packages by hand.
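A minimal sketch of this cleanup step, assuming that the packages wanted by each group and a simple package-to-dependencies mapping are already known. Real dependency resolution (provides, conflicts, obsoletes) is much more involved; the function name and its arguments are hypothetical.

```python
def packages_to_remove(installed, wanted_by_group, dependencies):
    """Return installed packages that no group needs, directly or indirectly.

    installed       -- set of installed package names
    wanted_by_group -- dict: group name -> set of packages the group wants
    dependencies    -- dict: package name -> set of packages it requires
                       (simplified; assumed to be already resolved to names)
    """
    needed = set()
    stack = [pkg for pkgs in wanted_by_group.values() for pkg in pkgs]
    while stack:
        pkg = stack.pop()
        if pkg in needed:
            continue
        needed.add(pkg)
        # Follow dependencies so that required packages are kept as well.
        stack.extend(dependencies.get(pkg, ()))
    return set(installed) - needed
```

If a group that wanted `php` is removed, `php` and any dependency that nothing else needs end up in the returned set, while packages still reachable from another group are kept.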

This model raises the question of what to do with packages that are not in a (comps) group.

Standalone Groups

One solution would be to create groups on user demand that are not connected to any comps group. The user could add arbitrary packages. Other groups or comps groups might serve as templates, but such groups would often just be created from scratch. This would allow the administrator to group packages by purpose. An additional description or comment could be used for documentation. Besides just grouping the packages, those groups could also document who uses or requires the packages. Packages needed for different reasons could be part of different groups and would be kept around as long as any of these groups still exists.

Groups only

The question is whether it really is necessary to have anything other than groups. The current mechanism of marking packages as "user installed" or "by requirement" at installation time does not reflect the actual use cases precisely. Even if a package got installed as a dependency, it might still be or become demanded by a user. The current implementation does not offer a way of switching the status except reinstalling the package, which is a pretty big hammer for that nail. While a command to alter the status could be added easily, there is an alternative implementation: use some sort of default group that packages installed "by hand" get added to. This group could then be edited to change the status of a particular package. This would also make such packages accessible to all tools (or GUIs) that work on groups.

Editing Installed Groups

Besides installing groups from comps, editing installed or newly created groups will become the most important operation. Different interfaces are possible for this.

Pure Command line

A new set of commands does not just add packages to and remove packages from the system but works on a group given as a parameter. Maybe different names should be chosen to signal that the packages are not installed or removed directly but just added to and removed from the group.

Interactive Command line

To do bigger changes the group could be represented as a text file that is opened in an editor and can be changed in there. After the editor is closed the changes are parsed, saved into the group and executed. This would work similar to the "git rebase -i" command. The packages would be represented in the file similar to this:

  • Packages installed by comps status (mandatory, default, optional): commented out line with prepended "-". Uncommenting it will exclude the package
  • Packages selected although the comps status is not installed: just the package name. Commenting or deleting the entry will unselect the package
  • Packages excluded as they would be installed by their comps status: "-" and package name. Commenting or removing this line would re-add the package to the group

A very small example group could look like this:

### Installed "mandatory" and "default" packages
### Uncomment to remove package
#-foo 
#-bar

### Optional packages
### Comment to remove; uncomment to add
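
Reading such a file back could look roughly like the following sketch. It assumes the line conventions described above: any line starting with "#" records no decision, a line starting with "-" is an exclusion, and any other line is a selected package. The function name `parse_group_file` is hypothetical.

```python
def parse_group_file(text):
    """Parse the editor file into (selected, excluded) sets of package names."""
    selected, excluded = set(), set()
    for raw in text.splitlines():
        line = raw.strip()
        # Blank lines, "###" headers and commented-out entries record no decision.
        if not line or line.startswith("#"):
            continue
        if line.startswith("-"):
            # Uncommented "-name": the package is excluded from the group.
            excluded.add(line[1:].strip())
        else:
            # Bare package name: the package is selected in addition.
            selected.add(line)
    return selected, excluded
```

Applied to the example group above, this yields two empty sets, because no decision deviates from the comps defaults. Uncommenting `#-bar` would put `bar` into the excluded set.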


Graphical

The UI for comps groups would look pretty much the same as right now. For standalone groups (and maybe even comps groups) there would be an interface to add new packages, either by selecting packages from the global list/tree and then pressing an "Add to group" button, or with a drag and drop interface that shows the selected group content side by side with the available packages. This is left to the GUI designer.


A user-created group would look very similar to an installed comps group, except that it would not follow any external group. The idea is that users create groups for a given purpose. That way the groups can serve as documentation of what is installed and why. Typical groups could be a tool chain or the devel packages for a project. Groups can and should overlap if necessary, so packages can be required by several groups.

Group commands

So the operations of the package manager look very different from yum's:

  • list [available]
  • info GROUP
    • give group details
  • info PACKAGE
    • show details for package
    • if installed explain why the package is installed (member/dependency of group(s))
  • install GROUP
    • GROUP can either be a group name or a file
    • if no cli param is given open an editor to select which optional packages to select and to add a comment
    • cli switch allows installing the group immediately (selecting mandatory, default or optional packages)
  • edit GROUP
    • opens editor allowing adding and removing packages (optional and arbitrary) or modifying the comment(s)
    • afterwards packages are added and removed to keep the system in sync with the group
    • there are sub commands to alter the group from the cmd line
  • new GROUP [OTHERGROUP, ...]
    • create new group with the given name
    • opens editor if details are not given at the cmd line
    • can use other groups as template
  • remove GROUP
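
The command set listed above could be wired up along these lines. This is a sketch only: the program name `groupmgr` and the `--level` switch are placeholders invented for illustration, as the text does not name the tool or its switches.

```python
import argparse


def build_parser():
    """Build a command line parser for the group commands (names are hypothetical)."""
    parser = argparse.ArgumentParser(prog="groupmgr")  # placeholder tool name
    sub = parser.add_subparsers(dest="command", required=True)

    p = sub.add_parser("list", help="list groups")
    p.add_argument("available", nargs="?", choices=["available"])

    p = sub.add_parser("info", help="show details for a group or package")
    p.add_argument("target", metavar="GROUP_OR_PACKAGE")

    p = sub.add_parser("install", help="install a group given by name or file")
    p.add_argument("group")
    # Placeholder switch: install immediately instead of opening an editor.
    p.add_argument("--level", choices=["mandatory", "default", "optional"])

    p = sub.add_parser("edit", help="edit an installed group")
    p.add_argument("group")

    p = sub.add_parser("new", help="create a group, optionally from templates")
    p.add_argument("group")
    p.add_argument("templates", nargs="*", metavar="OTHERGROUP")

    p = sub.add_parser("remove", help="remove a group")
    p.add_argument("group")
    return parser
```

With this parser, `build_parser().parse_args(["new", "toolchain", "base"])` yields the command `new`, the group `toolchain` and the template list `["base"]`.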

Graphical tools would do similar things by dragging and dropping packages into new groups or by checking boxes of optional packages, very similar to what we do today.

Dealing with Inconsistencies

The idea is that the installed packages depend only on the groups installed (otherwise it would not be sufficient to just look at the groups). This will require a slightly different implementation of updates: packages that are no longer required need to be removed.

In addition an update mechanism for groups is needed. It is still open whether comps groups should have a version number or whether it is enough to have a hash or stamp that is compared with the version of the group we have installed. As it is possible that the group lists no longer match the packages in the repositories, there is a new class of errors. As we want automatic updates to continue working, the implementation should handle these errors gracefully but keep track of them. All interactive tools should list such inconsistencies and ask the user to resolve them.
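The hash variant could be sketched like this. It assumes a group definition is available as a dict of package lists; the function names are hypothetical, and the choice of SHA-256 over a canonical JSON form is just one possible stamp, not something the proposal prescribes.

```python
import hashlib
import json


def group_stamp(group):
    """Return a stable digest of a group definition.

    group -- dict: level name -> iterable of package names (assumed layout)
    """
    # Sort keys and package names so that ordering does not change the stamp.
    canonical = json.dumps({k: sorted(v) for k, v in group.items()}, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def group_changed(installed_copy, repo_group):
    """True if the comps group in the repository differs from the installed copy."""
    return group_stamp(installed_copy) != group_stamp(repo_group)
```

A stamp only says that something changed. Finding out what changed still requires comparing the stored copy of the group with the new definition.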

If yum or rpm is still around, it is possible that the installed packages no longer match the groups. It is therefore important to detect that the package list was changed "behind the back". These changes then need to be integrated into the group world. Packages that are required by groups but were removed should be marked as deleted/deselected. New packages should be added to an "Unknown" group. As with other mismatch errors, the user should be asked to clarify the situation.

Problematic situations:

  • Package listed in group does not exist in repository
    • may or may not be installed on the system
      • may be ignored if package was installed from file
  • Package selected in group is no longer in comps group (as a previously optional package)
    • keep entry but warn user
  • Package listed in group got deleted by hand
    • Remove from group / add deleted entry to avoid installing it again
  • New Package installed by hand
    • Add to dummy group
  • Package is obsoleted
    • Maintain an obsoleted list to translate entries in groups that did not yet do the transition
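
Detecting some of these situations could look like the following sketch. It only covers the cases that can be found by comparing plain sets of package names, and it assumes the set of packages pulled in as dependencies is already known. The function name and the result keys are hypothetical.

```python
def find_inconsistencies(group_packages, installed, repo,
                         auto_deps=frozenset(), installed_from_file=frozenset()):
    """Compare group contents with the installed and available packages.

    group_packages      -- dict: group name -> set of packages listed in the group
    installed           -- set of installed package names
    repo                -- set of package names available in the repositories
    auto_deps           -- installed packages explained as dependencies (assumed known)
    installed_from_file -- packages installed from a local file
    """
    listed = set().union(*group_packages.values())
    return {
        # Listed in a group but unknown to the repositories.
        "missing_from_repo": {p for p in listed
                              if p not in repo and p not in installed_from_file},
        # Listed in a group but no longer on the system.
        "removed_by_hand": listed - installed,
        # On the system but explained by no group and no dependency.
        "unknown": installed - listed - auto_deps,
    }
```

An interactive tool could show each non-empty entry to the user, while an automatic update would only record it and continue.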

Conversion from yum

If a system was already installed with yum, it has to be converted to the new group model. Matching the comps groups against the installed packages (see show-installed from yum-utils) can be a good starting point. Nevertheless such a conversion cannot be perfect, as it cannot tell which packages got installed as dependencies and which on purpose.
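One possible matching heuristic is sketched below. It is an assumption made for illustration, not how show-installed works: a comps group counts as installed if all of its mandatory packages are present, missing default packages become exclusions, and present optional packages become selections. The function name and the "unassigned" key are hypothetical.

```python
def convert_from_yum(installed, comps_groups):
    """Guess installed groups from a plain set of installed package names.

    comps_groups -- dict: group name -> dict with 'mandatory', 'default'
                    and 'optional' sets of package names (assumed layout)
    """
    result, covered = {}, set()
    for name, group in comps_groups.items():
        mandatory = group.get("mandatory", set())
        # Heuristic: skip groups whose mandatory packages are not all installed.
        if not mandatory or not mandatory <= installed:
            continue
        default = group.get("default", set())
        optional = group.get("optional", set())
        result[name] = {
            "selected": optional & installed,
            "excluded": default - installed,
        }
        covered |= (mandatory | default | optional) & installed
    # Everything else: user choice or dependency, which cannot be told apart here.
    result["unassigned"] = {"selected": installed - covered, "excluded": set()}
    return result
```

The "unassigned" entry shows the limit mentioned above: it mixes packages the user wanted with packages that only came in as dependencies.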