Summer Coding 2010 proposal - Interactive SPEC pages

Description
Spec files are the backbone of the Fedora operating system, and improving the process by which they are created will directly affect the general quality of the Fedora OS. This proposal aims to make SPEC files much more accessible, both to allow learning from existing SPEC files and to encourage comments and modifications throughout the lifetime of a Fedora package.

SPEC searching
Experienced Fedora developers use many techniques and tricks in their packages that are never documented. Some of these apply to new packages as well; others at least provide insight into how different packaging issues are handled. Right now, it is very difficult to search across all of Fedora's specfiles. Doing so requires knowing exactly what you are looking for and where in the spec file it is located, and then browsing through CVSWeb to a package that one thinks *might* have dealt with a similar issue.

Having all the specfiles indexed and searchable in the pkgdb could greatly reduce the effort required to learn from other people's work. Functionality would also be added to allow searching within individual sections of a SPEC file (%pre, %build, %files etc.).
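As a rough illustration of section-aware searching, here is a minimal Python sketch. It is not pkgdb code: the names `split_sections`, `search` and `SAMPLE` are made up for this example, the list of recognised sections is deliberately incomplete, and the search is a plain substring match.

```python
import re

# Section names recognised by this sketch; a real parser would cover more
# and would treat sub-package arguments such as "%files devel" separately.
SECTION_NAMES = (
    "description", "prep", "build", "install", "check", "clean",
    "pre", "post", "preun", "postun", "files", "changelog",
)
SECTION_RE = re.compile(r"^%(" + "|".join(SECTION_NAMES) + r")\b")

# Hypothetical spec file used only for demonstration.
SAMPLE = """\
Name: example
Version: 1.0

%description
An example package.

%build
make %{?_smp_mflags}

%install
make install DESTDIR=%{buildroot}

%files
%doc README
"""


def split_sections(spec_text):
    """Return a list of (section, line_number, line) tuples.

    Lines before the first section marker are filed under "preamble".
    Line numbers are 1-based and refer to the original file.
    """
    current = "preamble"
    rows = []
    for number, line in enumerate(spec_text.splitlines(), start=1):
        match = SECTION_RE.match(line)
        if match:
            current = match.group(1)
        rows.append((current, number, line))
    return rows


def search(spec_text, needle, section=None):
    """Find lines containing `needle`, optionally limited to one section."""
    return [
        (sec, number, line)
        for sec, number, line in split_sections(spec_text)
        if needle in line and (section is None or sec == section)
    ]
```

Calling `search(SAMPLE, "make", section="build")` returns only the `make` line from %build, together with its original line number, while `search(SAMPLE, "make")` also returns the one from %install.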

SPEC comments
The SPEC file page will no longer be just a static page viewable through CVSWeb, but an AJAX-enabled page providing a lot of interaction. The page would integrate code review functionality similar to the DjangoBook's commenting system. It would show comment bubbles on the side of the page, which would open into bigger discussion bubbles. It would also contain links to different sections for easy reference from across Fedora's infrastructure.

The page should also make it easy to open bugzilla tickets on specific problems with the specfile. Another feature will be attaching VCS revisions and line numbers to comments, so that they can disappear or get grayed out when the latest version of the SPEC file no longer contains the issue, and so that earlier versions can still be referred to.
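One possible way to decide whether a comment should be grayed out is to diff the commented revision against the latest one and see whether the commented line survived. The sketch below uses Python's standard `difflib` for this; the function name `track_line` and the sample revisions are assumptions made for illustration, not an agreed design.

```python
import difflib


def track_line(old_lines, new_lines, old_number):
    """Map a 1-based line number in the old revision to the new revision.

    Returns the new 1-based line number if the line survived unchanged,
    or None if it was modified or removed (the comment can be grayed out).
    """
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    index = old_number - 1
    for block in matcher.get_matching_blocks():
        if block.a <= index < block.a + block.size:
            return block.b + (index - block.a) + 1
    return None


# Hypothetical revisions of the same spec file.
OLD = [
    "%define python_sitelib /usr/lib/python2.6/site-packages",
    "%build",
    "make",
]
NEW = [
    "# packaging cleanup",
    "%global python_sitelib /usr/lib/python2.6/site-packages",
    "%build",
    "make",
]
```

With these revisions, `track_line(OLD, NEW, 1)` returns `None` because the commented line was rewritten, so a comment on it could be grayed out. `track_line(OLD, NEW, 3)` returns `4`, so a comment on the `make` line would simply move down one line.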

Searching and commenting on spec files would make it easier to improve old packages that are no longer in sync with the Guidelines, thus extending the package review beyond the point where the package is imported into CVS. It could prove an easy job for new contributors, or for people participating in a bugsquashing day at an event, to pick a single issue and then comment on the spec page to ask the maintainer to fix it. Examples of such issues are the use of %define instead of %global in the definition of the python_sitelib macro, or the inclusion of the full license text file in the %doc section.
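The %define versus %global case is simple enough to find mechanically. A minimal sketch, assuming a plain regular-expression scan is good enough; `find_define_sitelib` and `SAMPLE_SPEC` are hypothetical names for this example:

```python
import re

# Matches a %define of python_sitelib anywhere on a line.
DEFINE_SITELIB_RE = re.compile(r"%define\s+python_sitelib\b")

# Hypothetical spec fragment used only for demonstration.
SAMPLE_SPEC = """\
%define python_sitelib /usr/lib/python2.6/site-packages
%global python_sitearch /usr/lib64/python2.6/site-packages
Name: python-example
"""


def find_define_sitelib(spec_text):
    """Return the 1-based line numbers where python_sitelib is set via %define."""
    return [
        number
        for number, line in enumerate(spec_text.splitlines(), start=1)
        if DEFINE_SITELIB_RE.search(line)
    ]
```

Here `find_define_sitelib(SAMPLE_SPEC)` reports line 1 only. Each reported line number could then become the anchor for a comment asking the maintainer to switch to %global.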

Package review features
This functionality would be great to integrate into our package review process. Reviewers could easily point out issues on the spec file and leave red bubbles behind. The packager would then have to solve these issues and green the bubbles on her next iteration.

This might prove really helpful for tricky packages with a lot of comments from a lot of different people. Instead of the simple list of bugzilla comments, we could switch to red bubbles opened by each reviewer. That way anyone could get a quick view of the remaining issues with the package without having to read all 42 comments in the bug report.

To make the package review easier, there will be a check-list of all the Package Review MUST and SHOULD items that will have to be checked in order for the package to be accepted. This aims to replace the ASCII-art templates that are currently being used.

It would be great if different reviewers could check just some of these check-boxes and leave the hard ones for proper consideration. Some of the MUST items are really easy to check, such as the ones that don't apply to the current package, or the American English requirement. They could be done in 10 minutes by someone who is in a hurry or not very experienced, and would push the package a long way through the review process.
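To make the shared check-list idea concrete, here is a small Python sketch of a checklist that records who checked each item. The class names, item keys and item texts are placeholders invented for this example, not the real guideline wording, and whether unchecked SHOULD items block acceptance is left open.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class ChecklistItem:
    key: str
    text: str
    level: str  # "MUST" or "SHOULD"
    checked_by: Optional[str] = None  # reviewer who ticked the box, if any


@dataclass
class ReviewChecklist:
    items: Dict[str, ChecklistItem] = field(default_factory=dict)

    def add(self, key, text, level="MUST"):
        """Register a review item under a short unique key."""
        self.items[key] = ChecklistItem(key, text, level)

    def check(self, key, reviewer):
        """Record that `reviewer` verified the item identified by `key`."""
        self.items[key].checked_by = reviewer

    def remaining(self, level=None):
        """Keys of unchecked items, optionally limited to one level."""
        return [
            key for key, item in self.items.items()
            if item.checked_by is None
            and (level is None or item.level == level)
        ]


# Placeholder items; only the American English one is named in this proposal.
checklist = ReviewChecklist()
checklist.add("american-english", "Spec file is written in American English")
checklist.add("hypothetical-must", "Placeholder for another MUST item")
checklist.add("hypothetical-should", "Placeholder for a SHOULD item", level="SHOULD")

# A reviewer in a hurry ticks off the easy item and leaves the rest.
checklist.check("american-english", reviewer="reviewer_a")
```

After this, `checklist.remaining("MUST")` lists only the item that still needs an experienced reviewer, and `checked_by` on each item shows who took responsibility for it.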

The technologies used are the same as those currently used by pkgdb: TurboGears/PostgreSQL/Dojo.

SPEC indexing/searching

 * 1) (.5 weeks) Write the database schema to support adding specfiles to pkgdb. Decide on a standard number of attributes that will match the specfile sections.
 * 2) (1 week) Write a script to crawl all the cvs specfiles we currently have and get them in the db. Write something like a cvs post-commit hook to update the pkgdb when a new change is made to a spec file.
 * 3) (1 week) Implement full-text search on the specfiles that can output line numbers. (maybe go back to 1 and put line numbers into the db)
 * 4) (.5 weeks) Write WebUI for search.
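The schema and search steps above could look roughly like the following sketch. It makes several assumptions: it uses Python's built-in `sqlite3` as a stand-in for PostgreSQL so that it runs anywhere, stores one row per spec line so that line numbers come for free, and uses a substring match instead of real full-text search. The table and function names are made up for this example.

```python
import sqlite3

SCHEMA = """
CREATE TABLE spec_line (
    package     TEXT    NOT NULL,
    section     TEXT    NOT NULL,   -- e.g. 'preamble', 'build', 'files'
    line_number INTEGER NOT NULL,   -- 1-based position in the original file
    content     TEXT    NOT NULL,
    PRIMARY KEY (package, line_number)
);
"""

# Hypothetical rows, as a crawler script might produce them.
ROWS = [
    ("foo", "preamble", 1, "Name: foo"),
    ("foo", "build", 2, "%build"),
    ("foo", "build", 3, "make %{?_smp_mflags}"),
    ("bar", "install", 7, "make install DESTDIR=%{buildroot}"),
]


def build_index(rows):
    """Create an in-memory database and load the given spec lines."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO spec_line VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return conn


def search_lines(conn, needle, section=None):
    """Return (package, section, line_number, content) for matching lines."""
    query = ("SELECT package, section, line_number, content FROM spec_line "
             "WHERE instr(content, ?) > 0")
    params = [needle]
    if section is not None:
        query += " AND section = ?"
        params.append(section)
    query += " ORDER BY package, line_number"
    return conn.execute(query, params).fetchall()
```

For example, `search_lines(build_index(ROWS), "make", section="build")` returns the single matching line from package `foo` with its line number. Storing one row per line is only one option; storing whole sections and computing line offsets at query time would be another.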

SPEC web page

 * 1) (1 week) Write the UI that reassembles the different sections from the db into the original SPEC file, with nice coloring (syntax highlighting?) and line numbers.
 * 2) (.5 weeks) Database schema for specfile-comments (tie comment to line-number)
 * 3) (1 week) Write WebUI for commenting on specfiles
 * 4) (.5 weeks) Integration with bugzilla - some commenters can choose to open a bugzilla ticket and then the discussion continues there. We won't follow the discussion anymore, just link to it and maybe get the status of the ticket.
 * 5) (1 week) Tie comment threads to specific revisions/line numbers in the VCS so that they can be grayed out when those lines have changed or disappeared.

Package Review

 * 1) (.5 weeks) Find a way to store comments/specfiles etc. for packages we don't have in the db yet (build a new not-yet-packaged table maybe)
 * 2) (1 week) Make a list of MUST items in the database with links to the actual guideline paragraph, make each comment a possible link to one of the guidelines.
 * 3) (2 weeks) For packages that are not yet in the pkgdb, show the list of MUST items on the SPEC page so that different reviewers can check items individually. (Maybe build db schema to allow tracking of responsibility for checked items - who checked what)

Total: 10.5 weeks

Convincing
I've been a Fedora Project contributor since May 2008 when I began working on the Fedora PackageDB with Toshio Kuratomi. I worked on small features, visible to the general Fedora user. One of the features from that period that I'm most proud of is the advanced package search functionality.

Last summer I participated in GSoC, working on a set of pkgdb features aimed at making it more user-centric rather than developer-centric. These included importing metadata from yum, a new package page with commenting and tags, and exporting those tags via sqlite for future use in yum. This work was later adapted and included in the 0.5.x series of the PackageDB after some design changes to PackageDB itself.

At the end of the summer I began contributing to the Fedora Project as a Package Maintainer and have continued to do so since. I've also started a personal project, a Romanian FOSS advertising network. More information about the project is available on its webpage (in Romanian) or on GitHub (code).

Impact
If your project is successfully completed, what will its impact be on the Fedora community? Give 3 answers, each 1-3 paragraphs in length. The first one should be yours. The other two should be answers from members of the Fedora community, at least one of whom should be a Fedora Summer Coding mentor. Provide email contact information for non-Summer Coding mentors.

Ionuț's answer
The Fedora community will have an improved way of handling package reviews. Since packaging is a fundamental part of the Fedora Project and the package review process is the most important way through which packages get accepted in Fedora, the impact will be high. Though immediately visible only to Fedora developers, this project ultimately aims at improving the general quality of Fedora packages by making it easier for developers to review each other's work before and after the package has been accepted in the official repositories.

answer 2
If implemented, this feature would add a good deal of data that would help packagers. Our spec files contain many examples of good practice and bad practice but telling the difference between the two often requires extensive research on multiple forums (reading specs in cvs, finding relevant guidelines on the wiki, and asking questions on the mailing list). Having annotated, commentable spec files available online will let people search for examples and see whether other packagers think that the practices are good or bad.

Pushing review functionality into the packagedb has been talked about a lot but until now, no one has really thought about how we'd design something to do that. If this gets done, it would give us a lot of things that bugzilla doesn't. For instance, processing cvsadmin tasks would take less time since all the information would be available in the packagedb.

This feature also provides us with the ability to expand the packagedb in several other directions. Having our own spec files displayed in the interface opens the door to displaying spec files and patches from other distributions, linking to upstream bug trackers and source control, and other things that relate to the development of packages.

Common Questions to think about

 * 1) What will you do if you get stuck on your project and your mentor isn't around?
   * I generally find my way out of most technical problems by googling, going deeper into the source code or just asking on the relevant IRC channel (#turbogears, #python, #yum etc.)
   * For design problems, though these are less likely to happen, I know my way around the community mailing lists and IRC channels and generally know who's who. If my mentor is missing for several days, I'll probably start asking for advice on #fedora-admin and #fedora-devel.
 * 2) In addition to the required blogging minimum of twice per week, how do you propose to keep the community informed of your progress and any problems or questions you might have over the course of the project?
   * I'm constantly lurking on IRC in #fedora-devel and #fedora-admin and am also subscribed to the relevant mailing lists.

Miscellaneous

 * 1) We want to make sure that you are prepared before the project starts
   * Can you set up an appropriate development environment?
     * Sure.
   * Have you met your proposed mentor and members of the associated community?
     * Yes, I've worked with my mentor before and have been part of the Fedora community for 2 years.
 * 2) What is your t-shirt size?
   * M
 * 3) Is there anything else we should have asked you or anything else that we should know that might make us like you or your project more?
   * My favorite color is green.