User:Poelstra/ImproveRawhideF10

Background

 * A few of us got together at FUDCon Boston 2008 to dig into how to make rawhide better, in response to a mail thread I started called "What is Rawhide For?"
 * Jesse Keating was our excellent discussion facilitator
 * Below are the ideas and actions we discussed

Brain Stormers

 * Jesse Keating, Will Woods, James Laska, John Poelstra, Peter Jones, Chuck Anderson, Bill Nottingham
 * Add your name if I missed you

Overview of Issues and Solutions Discussed:
 * Finding a way to verify that a rawhide tree is internally consistent
 * Maintaining more than one day of rawhide in the Fedora infrastructure
 * Saving "last known" good rawhide trees

Getting a Complete Tree
Problem: Consumers of rawhide need a way to know that they have the correct, complete, and consistent tree for a particular date. Every composed tree has a  file. The presence of this file does not guarantee that all the other associated packages for that compose are present too.

Decision tree for people wanting to install rawhide

 * 1) Is today's tree on the mirror?
 * 2) Is today's tree worth getting? In other words, "does it have a chance of installing?"
 * 3) If #1 and #2 are "yes" then #4; else WaitForTomorrow
 * 4) Synchronize local copy with mirror
 * 5) Is the tree I have locally internally consistent, and does it match .treeinfo?
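
The decision tree above can be sketched as a small function. This is only an illustration: the four callables are hypothetical placeholders for whatever checks and sync tool a consumer actually uses, and the notes do not say what should happen when step 5 fails, so the sketch simply reports it.

```python
from typing import Callable


def fetch_rawhide(
    on_mirror: Callable[[], bool],      # step 1 (hypothetical check)
    worth_getting: Callable[[], bool],  # step 2 (hypothetical check)
    sync: Callable[[], None],           # step 4 (hypothetical sync tool)
    is_consistent: Callable[[], bool],  # step 5 (hypothetical check)
) -> str:
    """Walk the decision tree and return a short status string."""
    # Step 3: only continue when steps 1 and 2 both answer "yes".
    if not (on_mirror() and worth_getting()):
        return "wait for tomorrow"
    sync()
    # Assumption: a failed consistency check is reported, not retried.
    return "ready" if is_consistent() else "inconsistent"
```

Because each check is passed in as a callable, the later checks are skipped entirely when an earlier one fails, which matches the "WaitForTomorrow" branch.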

Best solution going forward

 * 1) Verify date in   (today's date or --date)
 * 2) verify checksums of repomd.xml (whatever runs last)
 * 3) change pungi to create checksum of repomd.xml and add to .treeinfo
 * 4) change pungi to create checksum of kernel, initrd and add to .treeinfo
 * 5) verify contents of repomd.xml (slow)
 * 6) verify packages based on repodata (very slow)


 * could #3 and #4 be part of yum-utils?
 * Update: Seth Vidal created this to verify trees for internal consistency: http://skvidal.fedorapeople.org/misc/verifytree.py
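
As an illustration of steps 2 through 4, here is a minimal sketch of a checker that compares files in a tree against checksums recorded in .treeinfo. It is not Seth's verifytree.py and not what pungi does. It assumes .treeinfo is INI-style and that the proposed checksums live in a hypothetical `[checksums]` section mapping a relative path to `algorithm:hexdigest`. That layout and the function names are assumptions made for the example.

```python
import configparser
import hashlib
from pathlib import Path


def file_digest(path: Path, algo: str) -> str:
    """Hash a file in 1 MiB chunks so large images do not fill memory."""
    h = hashlib.new(algo)
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def verify_tree(tree: Path) -> list[str]:
    """Return the relative paths that are missing or fail their checksum."""
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.optionxform = str  # keep file paths case-sensitive
    parser.read(tree / ".treeinfo")
    # Assumption: the section name "checksums" is hypothetical.
    if not parser.has_section("checksums"):
        raise ValueError(".treeinfo has no [checksums] section")
    bad = []
    for rel_path, recorded in parser.items("checksums"):
        algo, _, expected = recorded.partition(":")
        target = tree / rel_path
        if not target.is_file() or file_digest(target, algo) != expected:
            bad.append(rel_path)
    return bad
```

An empty list means every recorded file is present and matches. Checking only the repository metadata, kernel, and initrd this way is fast. Steps 5 and 6 would have to go on to parse the metadata and check every package.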

Open Items

 * 1) Need a script or tool written to address solution steps 2, 3, and 4

Multiple Days of Rawhide

 * Going back to a previous release of rawhide
 * useful for reproducing a bug on Day5 when you reported the bug on Day0 and the composition of rawhide has since changed

Possible solutions

 * stage2
 * save complete old trees (optimistically hard linking)
 * served by a machine other than the hub (kojipkgs) (5 days' worth)?
 * provide new tool to home users
 * repo of boot.iso's
 * git repos of "stuff"
 * meta redirects to kojifile store
 * copy of non package data

Consensus on best go-forward plan
Create a new tool:
 * used on the mirrors and by community members to run locally (aka rsync -H --link-dest)
 * proposed name of "tree-hugger"
 * keeps n-copies of tree
 * hard links to save space, or symlinks for file systems like AFS
 * number of trees based on size or number (keep as much as we can)
 * mirror list (FIXME---what else went here?)
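
A rough sketch of the two core pieces of such a tool follows: building the rsync invocation that hard-links unchanged files against the previous tree, and choosing which old trees to drop. "tree-hugger" was only a proposed name, so everything here is hypothetical. The sketch assumes trees are stored in directories named by date (YYYYMMDD) and that retention is by count only.

```python
from typing import List, Optional


def rsync_command(source: str, dest: str, previous: Optional[str]) -> List[str]:
    """Build an rsync command that hard-links unchanged files to `previous`."""
    cmd = ["rsync", "-a", "-H"]
    if previous is not None:
        # Use an absolute path: rsync resolves a relative --link-dest
        # against the destination directory.
        cmd.append(f"--link-dest={previous}")
    cmd += [source, dest]
    return cmd


def trees_to_prune(tree_names: List[str], keep: int) -> List[str]:
    """Return the oldest trees beyond the `keep` newest ones.

    Assumption: names are YYYYMMDD strings, so sorting them
    alphabetically also sorts them by date.
    """
    ordered = sorted(tree_names)
    return ordered[:-keep] if keep > 0 else ordered
```

For example, `trees_to_prune(["20080620", "20080618", "20080619"], 2)` returns `["20080618"]`. Retention by total size, which the notes also mention, would need the on-disk usage of each tree and is not covered here.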

Update
Archive of rawhide trees now made available by non-koji hub system: http://kojipkgs.fedoraproject.org/mash/

Last Known Good Tree

 * Is there a way to provide "last known good" trees so that if a particular day of rawhide does not install or a current installation becomes unusable there is a clear place to obtain a "known installable tree"?

Important Rawhide Distinctions
Very important and different ways rawhide is used:
 * 1) Rawhide as a repo of packages
 * 2) Rawhide as an installable distribution

Open questions and known issues

 * 1) Can we re-validate composes?
 * 2) What is our definition of "good"?
 * 3) There are no current checks performed before trees go to mirrors
 * 4) Could we fix the  problem by performing tests, so that a tree that is not "good" doesn't go to the mirror?
 * 5) How do we determine what is "good enough" to push, but not "good enough" to tag as "good"?
 * 6) What about providing more snapshots?
 * 7) Can we notify maintainers about snapshots better, so that they "land" AND "park" content for snapshots far enough in advance instead of "crashing into the runway" at the last minute?
 * 8) How about a generic tool that combines livecd-iso-to-disk and diskboot.img?

Snapshots
In F9 we had the following snapshots:
 * Alpha
 * Beta
 * Snapshot 1
 * Snapshot 2
 * Snapshot 3
 * Preview
 * RCX

Other thoughts:
 * We tend to get the most feedback when we do ISO releases
 * Can we determine how many people use bittorrent vs. mirror downloads, particularly for downloading the test releases?
 * ACTION: File a ticket with infrastructure and Mike McGrath will get us data supporting or disproving the assertion that making snapshots available on bittorrent should satisfy most of the demand for snapshot access.

Possible action plan

 * 1) Create more visibility that snaps will be created
 * 2) Post testopia results for snaps
 * 3) Create more automation as we go
 * 4) Mirrored snapshots for full availability and testing
 * 5) Include smaller package set?
 * 6) Create snapshots more often (every week?)
 * 7) Good install trees or snapshots are named by their creation date or milestone

Defining GOOD
Here is what we think Good should mean in the following situations and how to arrive there.
 * If you don't make it to the last step for a particular context (repo, install source, snapshot, major milestone), the tree is not Good for that context
 * If it isn't Good it doesn't get pushed to the respective public hosting space

rawhide as a REPO

 * 1) repodata
 * 2) enough packages (multilib)
 * 3) key packages (kernel, glibc, rm)
 * 4) non-insane broken deps
 * 5) has a valid comps.xml

rawhide as an INSTALL SOURCE

 * 1) repodata
 * 2) enough packages (multilib, not missing a complete arch)
 * 3) key packages (kernel, glibc, rm)
 * 4) non-insane broken deps
 * 5) has a valid comps.xml
 * 6) has complete images
 * 7) boots
 * 8) gets to stage 2 of anaconda--the greeting message
 * 9) testopia rawhide validation set

rawhide as a SNAPSHOT (last known good)

 * 1) repodata
 * 2) enough packages (multilib, not missing a complete arch)
 * 3) key packages (kernel, glibc, rm)
 * 4) non-insane broken deps
 * 5) has a valid comps.xml
 * 6) has complete images
 * 7) boots
 * 8) gets to stage 2 of anaconda--the greeting message
 * 9) testopia rawhide validation set
 * 10) Has ISOs of proper size (this should be an automated check)
 * 11) run snapshot testopia validation suite

rawhide as a MAJOR MILESTONE

 * Alpha
 * Beta
 * Preview
 * Release Candidate


 * 1) repodata
 * 2) enough packages (multilib, not missing a complete arch)
 * 3) key packages (kernel, glibc, rm)
 * 4) non-insane broken deps
 * 5) has a valid comps.xml
 * 6) has complete images
 * 7) boots
 * 8) gets to stage 2 of anaconda--the greeting message
 * 9) testopia rawhide validation set
 * 10) Has ISOs of proper size (this should be an automated check)
 * 11) run milestone testopia validation suite
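
The four checklists above all follow the same rule: a tree is Good for a context only if every step in that context's list passes. A minimal sketch of that rule follows. The step counts are taken from the lists above, but the function names and the idea of passing results as an ordered list of booleans are assumptions made for illustration.

```python
from typing import List, Optional

# Number of steps in each context's checklist, as listed above.
REQUIRED_STEPS = {
    "repo": 5,
    "install source": 9,
    "snapshot": 11,
    "major milestone": 11,
}


def first_failed_step(results: List[bool]) -> Optional[int]:
    """Return the 1-based number of the first failed step, or None."""
    for number, ok in enumerate(results, start=1):
        if not ok:
            return number
    return None


def is_good(context: str, results: List[bool]) -> bool:
    """A tree is Good for a context only if it reaches the last step.

    `results` holds one boolean per step of that context's checklist,
    in order. Steps that were never run count as not reached.
    """
    needed = REQUIRED_STEPS[context]
    if len(results) < needed:
        return False
    return all(results[:needed])
```

A tree that passes the first five steps but fails step 7 would satisfy `is_good("repo", ...)` on its repo results, while `is_good("install source", ...)` would be false and `first_failed_step` would report 7.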