Features/JigdoRelease

= Jigdo Release =

Summary
Jigdo is a distribution method for ISO images. A server hosts the files contained in an ISO image, along with a .jigdo file and a template for each ISO image. A client downloads the template and, using the .jigdo file, downloads the many small parts (possibly from many servers) that make up the final ISO image.

As the footprint of a Fedora Project mirror increases, Jigdo might ease the problem by eliminating the need to host both the ISO (in ) and the expanded release tree (in  ).

Owner

 * Name: JeroenVanMeeuwen

Interested People

 * JonathanSteffan
 * RahulSundaram
 * JamesBenWilliams

Current status

 * Targeted release:  Fedora 9
 * Last updated: March 15th, 2008
 * Percentage of completion: 100%
 * Not 100% because it is not part of the release process yet.

Detailed Description
For more information on Jigdo, see its home page at http://atterer.net/jigdo/.

A DVD image ranges from just over 3.0 GB (Source DVD image) up to 3.9 GB (PPC), while an ISO DVD template for jigdo is approximately 23 MB. Decreasing the footprint of a release becomes more important as the total size of the images increases (such as when supporting multiple architectures, or releasing multiple spins or multiple types of media: CD/DVD/Everything Spin?).

Jigdo scans the ISO image(s) for files (slices) that match certain files in a given tree, such as the expanded release tree. Each match is referenced in the .template for the ISO image and gets an entry in the .jigdo file that users use to download the slices. In the .jigdo file, you can specify a number of sources to get the slices from, such as the Fedora Project mirrorlist, a local (private) mirror, or an already downloaded DVD image.
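To make the matching step concrete, here is a deliberately simplified Python sketch. It is not how jigdo-file is implemented; the function names `make_toy_template` and `rebuild_image` and the in-memory byte strings are assumptions made purely for illustration. It splits an image into literal data (which would stay in the template) and references to files that can be fetched from elsewhere, then reassembles the image from both.

```python
from typing import Dict, List, Tuple

Segment = Tuple[str, object]  # ("data", bytes) or ("file", name)


def make_toy_template(image: bytes, files: Dict[str, bytes]) -> List[Segment]:
    """Split an image into literal data and references to known files."""
    # Try longer files first so a large file wins over a shorter prefix.
    candidates = sorted(files.items(), key=lambda kv: len(kv[1]), reverse=True)
    segments: List[Segment] = []
    pos = 0
    literal_start = 0
    while pos < len(image):
        for name, content in candidates:
            if content and image.startswith(content, pos):
                # Flush any unmatched bytes seen before this file.
                if literal_start < pos:
                    segments.append(("data", image[literal_start:pos]))
                segments.append(("file", name))
                pos += len(content)
                literal_start = pos
                break
        else:
            # No known file starts here; keep scanning.
            pos += 1
    if literal_start < len(image):
        segments.append(("data", image[literal_start:]))
    return segments


def rebuild_image(segments: List[Segment], files: Dict[str, bytes]) -> bytes:
    """Reassemble the image from literal data and fetched files."""
    return b"".join(
        files[value] if kind == "file" else value for kind, value in segments
    )
```

In this toy model the "file" segments play the role of slices listed in the .jigdo file, and the "data" segments are what remains in the template.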

Right now the Fedora Project releases a DVD iso as well as a Rescue CD, 14 GB in total:

 $ cd /data/fedora/releases/8/Fedora/
 $ du -sch
 3.7G   ./x86_64/iso/Fedora-8-x86_64-DVD.iso
 104M   ./x86_64/iso/Fedora-8-x86_64-rescuecd.iso
 3.0G   ./source/iso/Fedora-8-source-DVD.iso
 3.2G   ./i386/iso/Fedora-8-i386-DVD.iso
 103M   ./i386/iso/Fedora-8-i386-rescuecd.iso
 135M   ./ppc/iso/Fedora-8-ppc-rescuecd.iso
 3.9G   ./ppc/iso/Fedora-8-ppc-DVD.iso
 14G    total

In addition, the expanded installation tree is published and mirrored, 8.8 GB in total (keeping in mind linked files between i386 and x86_64, but not files linked from Everything):

 3.7G   ./x86_64/os
 2.2G   ./i386/os
 3.1G   ./ppc/os
 8.8G   total

This is 22.8 GB of ISO images and expanded installation trees in total, per release. Hosting only the expanded installation trees (8.8 GB), one overall .jigdo file, and a .template file for each ISO (22 MB each, approximately 154 MB in total), with the expanded installation tree serving as the source of the slices, would drastically decrease the amount of data that needs to be hosted and mirrored.
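The arithmetic behind this estimate can be spelled out in a few lines of Python. The figures come from the listings above; treating 1 GB as 1000 MB and ignoring the (small) size of the .jigdo file itself are simplifying assumptions, and the function name `footprint_summary` is made up for this example.

```python
# Figures from the du listings above (rounded as reported by du).
ISO_TOTAL_GB = 14.0    # all ISO images together
TREE_TOTAL_GB = 8.8    # expanded installation trees together
ISO_COUNT = 7          # number of ISO images listed
TEMPLATE_MB = 22       # approximate size of one .template file


def footprint_summary():
    """Return (current, with_jigdo, saved, saved_fraction), sizes in GB."""
    current_gb = ISO_TOTAL_GB + TREE_TOTAL_GB
    # Assumption: 1 GB = 1000 MB, good enough for an estimate.
    templates_gb = ISO_COUNT * TEMPLATE_MB / 1000
    with_jigdo_gb = TREE_TOTAL_GB + templates_gb
    saved_gb = current_gb - with_jigdo_gb
    return current_gb, with_jigdo_gb, saved_gb, saved_gb / current_gb


current, with_jigdo, saved, fraction = footprint_summary()
print(f"current: {current:.1f} GB, with jigdo: {with_jigdo:.2f} GB, "
      f"saved: {saved:.2f} GB ({fraction:.0%})")
```

Under these assumptions the hosted data per release drops from 22.8 GB to roughly 9 GB, a reduction of about 60%.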

Current Jigdo Constraints

 * Users need to use jigdo-lite, as the jigdo client is currently unable to parse .jigdo files that contain multiple images. As jigdo-lite is a console-based program, this may be a barrier to downloading the images.
   * A workaround may be to use separate .jigdo files per ISO image.
   * Another way around this constraint is to extend the Jigdo GUI so that it can parse .jigdo files with multiple images. Upstream has been contacted about this.


 * The current jigdo client is unable to run a number of requests or downloads in parallel. With many small files (such as the  directory or a lot of small packages), this adds considerable download overhead.
   * Presumably libcurl could provide HTTP/FTP pipelining.


 * The current jigdo-file template creation program is unable to integrate a mirror list as a location into the [Servers] section of the .jigdo file. This isn't a show-stopper, but it requires manually adding the [Servers] section to the .jigdo file.
   * Upstream has provided an [Include] type of section for mirrorlists, Debian style. This presumes the mirrors returned are 'weighted', and again it is only valid for single ISO image .jigdo files. We've proposed a solution to upstream.


 * Because jigdo-lite uses wget, slices with the '+' character in the filename pose a problem when the [Servers] section makes use of the Fedora Project mirrorlist. This is resolved by replacing all '+' characters in the .jigdo file with '%2b'.
   * Not sure whether this is actually a jigdo bug, a mirror manager bug, or an HTTP/FTP feature ;-)


 * The result returned by the mirrorlist is not cached in any way. This results in sub-optimal use of the mirrorlist, since it is queried for every slice that needs to be obtained. Presumably, during release time, this could overload MirrorManager as well.
   * Again, we've contacted upstream; provided the proposed feature for mirrorlists gets in, this problem could be eliminated as well.
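The '+' replacement mentioned in the list above can be scripted. The following Python sketch is one possible way to do it, not the tooling that was actually used; the function names and the sample slice path in the usage note are hypothetical.

```python
def escape_plus(jigdo_text: str) -> str:
    """Replace every '+' with its percent-encoded form '%2b'."""
    return jigdo_text.replace("+", "%2b")


def escape_plus_in_file(path: str) -> None:
    """Rewrite a .jigdo file in place with all '+' characters escaped."""
    with open(path, "r", encoding="utf-8") as handle:
        original = handle.read()
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(escape_plus(original))
```

For instance, a hypothetical slice path such as `Packages/libstdc++-devel.rpm` would become `Packages/libstdc%2b%2b-devel.rpm`.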

Issues That pyJigdo Solves

 * Mirror Manager results are cached. Two queries are made per ISO image: one for the localized mirror list (as during normal use), and one for the global mirror list in order to provide some fall-back.
 * pyJigdo will get a GUI before F9T1, which is useful for Linux clients.
   * Note that pyJigdo is not available for other Linux distributions (as of yet). It is not available for Windows clients either (as of yet).
   * Note: the GUI has not made it yet.
 * During compose time, pyJigdo is able to perform batch processing, creating .jigdo and .template files for a number of ISOs and a number of architectures, so that these files do not have to be created by running separate jigdo-file make-template commands.
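The caching behaviour described in the first bullet above could look roughly like the following Python sketch. This is not pyJigdo's actual code; the class name `MirrorListCache`, the injected `fetch` callback, and the "local"/"global" scope names are assumptions for illustration only.

```python
from typing import Callable, Dict, List, Tuple


class MirrorListCache:
    """Cache mirror lists so each (image, scope) pair is queried once."""

    def __init__(self, fetch: Callable[[str, str], List[str]]):
        # fetch(image, scope) is a hypothetical callback that performs
        # the real mirrorlist query and returns a list of mirror URLs.
        self._fetch = fetch
        self._cache: Dict[Tuple[str, str], List[str]] = {}

    def mirrors(self, image: str, scope: str) -> List[str]:
        key = (image, scope)
        if key not in self._cache:
            self._cache[key] = self._fetch(image, scope)
        return self._cache[key]

    def candidates(self, image: str) -> List[str]:
        """Localized mirrors first, then global mirrors as fall-back."""
        local = self.mirrors(image, "local")
        fallback = self.mirrors(image, "global")
        return local + [url for url in fallback if url not in local]
```

With such a cache, asking for candidate mirrors once per slice still results in only two real queries per ISO image.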

Benefit to Fedora
Besides potentially decreasing the footprint of a Fedora release (the original ISO would not need to be mirrored), Jigdo facilitates offering multiple architectures and multiple types of media, and can make optimal use of the Fedora Project mirrorlist (by redirecting users to a local mirror or even a private mirror, should they have one).

Using Jigdo, releasing an "Everything Spin" would consume approximately 30 MB of .jigdo and .template files (with the bit-extracted slices already being in the  tree and expanded installation tree at  ).

Custom Installation Media Spins (such as respins, or rebranded or remixed content) would be able to use the same distribution method, relying on the Fedora Project infrastructure without needing the Fedora Project to actually add the distribution to its mirrors, BitTorrent seed/tracker or web pages.

Scope

 * 1) Integrate Jigdo into the release process.
 * 2) Get the distribution via Jigdo advertised on /get-fedora or spins.fedoraproject.org.

Regarding 2), if 1) isn't met: not integrating the jigdo generation into the release process would mean the jigdo files become available shortly after release (~4 hours).

Test Plan
1. Compose CD images from the officially released Fedora 8 DVD and release them over Jigdo (DONE, http://fedoraunity.org/news-archives/fedora-8-cd-sets-released). These CD images have been composed against the  expanded installation tree using the following command (the comments describe each argument, in order):

 # --image: the input ISO image file
 # (path argument): the tree to scan for files (slices)
 # --label: the label for the tree to scan
 # --jigdo: the output jigdo file
 # --template: the output jigdo template for this ISO image
 # --no-servers-section: no servers section, as we specify this later on in the .jigdo file
 # --force: force
 # --merge: merge with the existing .jigdo so one .jigdo file can re-compose multiple ISO images
 # --cache: cache what slices we've come across
 $ jigdo-file make-template \
     --image=/data/fedora/spins/Fedora-8-i386-CD/Fedora-8-i386-CD1.iso \
     /data/fedora/releases/8/Fedora/i386/os/ \
     --label Base-i386=/data/fedora/releases/8/Fedora/i386/os/ \
     --jigdo=/var/www/jigdo/templates/Fedora-8-CD/Fedora-8-CD.jigdo \
     --template=/var/www/jigdo/templates/Fedora-8-CD/Fedora-8-i386-CD1.iso.template \
     --no-servers-section \
     --force \
     --merge=/var/www/jigdo/templates/Fedora-8-CD/Fedora-8-CD.jigdo \
     --cache=/var/tmp/Fedora-8-CD.cache

2. For further notes, see also http://kanarip.blogspot.com/2007/11/lessons-learned.html Summarizing:
 * The '+' character in filenames (mostly RPM filenames) confuses wget/jigdo in combination with MirrorManager and needs to be converted to '%2b' in the jigdo file.
 * It is best to generate the i386 ISO templates first.
 * Templates should not be generated against a loop-mounted ISO.

User Experience

 * The expanded  directory on an ISO image contains a great many very small files, which creates download overhead because wget is invoked for each file. This could be resolved by excluding the   directory when composing the templates, so that it ends up in the template rather than in the .jigdo as slices to be downloaded. This would greatly improve the end-user experience.

Dependencies

 * Software dependencies:
 * Jigdo


 * Getting accurate numbers about how many users would choose Jigdo as the preferred method of downloading Fedora can only be done if the method is advertised (on /get-fedora for example).

Contingency Plan

 * There's no "should this not work, then what do we do" to this feature. Given the last couple of Fedora Unity releases we know what to expect and what the caveats are.

Documentation

 * Documentation is available upstream at http://atterer.net/jigdo/, on the local system, and at http://fedorasolved.org/post-install-solutions/jigdo/

Release Notes
The Fedora Project is proud to announce its releases are now also available via Jigdo. This particular distribution method has been used by the Debian distribution for a long, long time and could improve the speed at which you can obtain the installation ISO images. Instead of waiting for torrent downloads to complete, Jigdo seeks the fastest mirrors it can find via the Fedora Project Mirror Manager infrastructure, downloading the bits it needs from these mirrors. To optimize seeking these bits, you can tell it to scan a DVD or CD you already have, which prevents bits from being downloaded off the net twice. This feature becomes particularly useful if you do any of the following:


 * 1) Download all the test releases and then get the final release. You have 90% of the data already with each subsequent download.
 * 2) Download both the DVD and the CD set. The DVD holds 95% of the data you need for the CD sets.
 * 3) A combination of the above.

Questions
JesseKeating:
 * Can we get a clear idea of how and where this would interact with our compose tools, perhaps with patches?
   * JeroenVanMeeuwen: Jesse, you suggested on IRC it'd be a separate script. Here's an example ([[Image:Features_JigdoRelease_jigdofy_release-0.1.sh]]) I'd like you to comment on.
   * JeroenVanMeeuwen: Seems Jesse figured it out (?)
 * Where do the templates go, and are they mirrored?
   * JeroenVanMeeuwen: Really up to release engineering / mirror wranglers / ..., the templates could go on the mirrors as well as just onto spins.fedoraproject.org. For rather unique (Fedora 8 CD, Fedora 8 Re-Spins) and moderately advertised spins, expect ~5000 downloads over a 3 month period.
   * JeroenVanMeeuwen: Seems they end up in Jigdo
 * How will these be advertised to users, maybe a patch to the get-fedora page?
   * JeroenVanMeeuwen: I'm not sure where I can find the source of that page nor how I can patch it, will look into it.
 * What are the verification processes to make sure the templates are sane after the compose?
   * JeroenVanMeeuwen: I'm not sure if I understand this one right. Re-composing the ISO images from the templates and .jigdo files is one thing; the SHA1SUM should match the SHA1SUM of the original ISO image... But we're probably looking for a more efficient way, right?
 * Can we get all this done before Beta, so that we have a chance of testing our processes/procedures for a release or two before the final?
   * JeroenVanMeeuwen: I consider this done before Beta, or else suffering from a serious amount of feedback on bullets 1-3
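
The SHA1 comparison mentioned in the answer about verification could be done along the lines of the following Python sketch. It only illustrates the idea of comparing the original ISO with one re-composed from the template; the function names and the chunk size are assumptions, and it is not part of any existing release tooling.

```python
import hashlib


def sha1_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA1 digest of a file, read in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        # Read in chunks so multi-gigabyte ISO images fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def images_match(original_iso: str, rebuilt_iso: str) -> bool:
    """True if the re-composed image is bit-identical to the original."""
    return sha1_of_file(original_iso) == sha1_of_file(rebuilt_iso)
```

A compose script could call `images_match` for every ISO after re-composing it from its .template and .jigdo files, and fail the run if any pair differs.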