Summer Coding 2010 proposal - Interactive SPEC pages: Difference between revisions

Revision as of 17:48, 10 April 2010

This is a WIP and not ready for the public yet ;)

About me

What is your name?
- Ionuț Arțăriși
What is your email address?
- mapleoin@fedoraproject.org
What is your wiki username?
- Mapleoin
What is your IRC nickname?
- maploin
What is your primary language? (We have mentors who may speak your preferred languages and can match you with one of them if you request.)
- Romanian. English is fine, thank you.
Where are you located, and what hours do you tend to work? (We also try to match mentors by general time zone if possible.)
- Romania. My working hours vary a lot. Anytime between 6AM and 12PM GMT.
Have you participated in an open-source project before? If so, please send us URLs to your profile pages for those projects, or some other demonstration of the work that you have done in open-source. If not, why do you want to work on an open-source project this summer?
- User:Mapleoin
- http://github.com/mapleoin
Bonus level: What's your schedule and why?
- I'm a final-year college student in Romania, where school officially ends in July. Since this is also the year when I have to write my thesis, I'd like to deviate from the standard Summer Coding schedule. My first term will be the standard second term, and I'll add another term starting from the standard midterm evaluation date. I'm aligning with a few standard dates to keep things as simple as possible for others. I can and probably will start working a bit earlier. So it would look like this (11 weeks in total):
  1. First term - 12 July - 16 August (my midterm evaluation date)
  2. Second term - 17 August - 27 September (student final report etc.)

Interactive SPEC pages

Description

Spec files are the backbone of the Fedora operating system and improving the process by which they are created will have direct consequences in the general quality of the Fedora OS. This proposal aims at making access to SPEC files a lot more open in order to allow learning from current SPEC files and to encourage comments and modification throughout the lifetime of a Fedora package.

SPEC searching

There are many techniques and tricks that experienced Fedora developers use in their packages that are never documented. Some of these are applicable to new packages as well; others at least provide insight into how different packaging issues are dealt with. Right now, it is very difficult to search across all of Fedora's specfiles. Doing so requires knowing exactly what you're looking for and where in the spec file it is located, and then browsing through CVSWeb on a package that one thinks *might* have dealt with a similar issue.

Having all the specfiles indexed and searchable in the pkgdb could greatly reduce the effort required to learn from other people's work. Functionality would also be added to allow searching within individual sections of a SPEC file (%pre, %build, %files etc.).
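A minimal sketch of how a spec file could be split into sections so that a search can be limited to one of them. The function names and the list of section markers are assumptions made for illustration (the list is a subset, not the complete set of RPM sections), and this is not pkgdb code.

```python
import re

# Illustrative subset of section markers; a real index would need the full list.
SECTION_NAMES = ("description", "prep", "build", "install", "check", "clean",
                 "pre", "post", "preun", "postun", "files", "changelog")
SECTION_RE = re.compile(r"^%(" + "|".join(SECTION_NAMES) + r")\b")


def split_sections(spec_text):
    """Return (section, line_number, line) tuples for every line of a spec file.

    Lines before the first section marker are labelled "preamble".
    """
    rows = []
    current = "preamble"
    for number, line in enumerate(spec_text.splitlines(), start=1):
        match = SECTION_RE.match(line)
        if match:
            current = match.group(1)
        rows.append((current, number, line))
    return rows


def search_section(spec_text, section, needle):
    """Return (line_number, line) pairs in the given section that contain needle."""
    return [(number, line)
            for sec, number, line in split_sections(spec_text)
            if sec == section and needle in line]
```

With this shape, a search for "make" restricted to %build would not return the "make install" line that lives in %install.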

SPEC comments

The SPEC file page will no longer be just a static page, viewable through CVSWeb, but an AJAX-enabled page providing a lot of interaction. The page would integrate code review functionality similar to the commenting system used in the DjangoBook (http://djangobook.com/en/2.0/chapter02/). It would show comment bubbles on the side of the page, which would open into bigger discussion bubbles. It would also contain links to different sections for easy reference from across Fedora's infrastructure.

Comments made by Fedora packagers can be flagged as issues with the package, which automatically opens Bugzilla tickets for the maintainer to resolve.

Searching and commenting on spec files would make it easier to improve old packages which are no longer in sync with the Guidelines, thus extending the package review beyond the point where the package is imported into CVS. New contributors, or people participating in a bug-squashing day at an event, could easily pick an issue (for example, usage of %define instead of %global in the definition of the python_sitelib macro, or inclusion of the full license text file in the %doc section) and then comment on the spec page to ask the maintainer to fix the problem.
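As an illustration of the kind of issue such a search could surface, here is a small sketch that reports the lines where python_sitelib is defined with %define rather than %global. The function name is hypothetical and the pattern is deliberately simple; it only skips comment lines and does not try to understand macro nesting.

```python
import re

# Matches "%define python_sitelib ..." anywhere in a line, including inside
# the common "%{!?python_sitelib: ...}" conditional wrapper.
DEFINE_SITELIB_RE = re.compile(r"%define\s+python_sitelib\b")


def find_define_sitelib(spec_text):
    """Return the line numbers where python_sitelib is set with %define."""
    hits = []
    for number, line in enumerate(spec_text.splitlines(), start=1):
        if line.lstrip().startswith("#"):
            continue  # ignore commented-out lines
        if DEFINE_SITELIB_RE.search(line):
            hits.append(number)
    return hits
```

Each reported line number could then become the anchor for a comment asking the maintainer to switch to %global.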

Package review features

This functionality would be great to integrate into our package review process. Reviewers could easily point out issues on the spec file and leave red bubbles behind. The packager would then have to solve these issues and green the bubbles on her next iteration.

This might prove really helpful for tricky packages with many comments from many different people. Instead of a simple list of Bugzilla comments, we could switch to red bubbles opened by each reviewer, so that anyone could get a quick view of the remaining issues with the package without having to read all 42 comments in the bug report.

To make the package review easier, there will be a check-list of all the Package Review MUST and SHOULD items that will have to be checked in order for the package to be accepted. This aims to replace the ASCII-art templates that are currently being used.

It would be great if different reviewers could check just some of these check-boxes and leave the harder ones for proper consideration. Some of the MUST items are really easy to check, like the ones that don't apply to the current package, or the American English requirement. They could be done in 10 minutes by someone who's in a hurry or not very experienced, and would push the package a long way through the review process.

The technologies used are the same ones currently used by pkgdb: TurboGears, PostgreSQL and Dojo.

Timeline

SPEC indexing/searching

(.5 weeks) write the database schema to support adding specfiles to pkgdb. Decide on a standard number of attributes that will match the specfile sections.
(1 week) write a script to crawl all the cvs specfiles we currently have and get them in the db. Write a cvs post-commit hook to update the pkgdb when a new change is made to a spec file. Q: should we prepare for git? I think so -- need to talk to jkeating to see about timeframe but I think it's F14/15 material so we want to be ready. Also: we want to make sure hooks don't take so long to run that they interfere with maintainer workflow.
(1 week) Implement full-text search on the specfiles that can output line numbers. (maybe go back to 1. to put line numbers into the db?)
(.5 weeks) Write WebUI for search.
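The indexing and search steps above could be prototyped along these lines. This is a sketch under stated assumptions: it uses an in-memory SQLite database from the Python standard library as a stand-in for PostgreSQL, the table and column names are invented for illustration, and a plain substring match takes the place of real full-text search. Storing one row per line is one way to make the search report line numbers.

```python
import sqlite3


def build_index(specs):
    """Create an in-memory index.

    specs maps a package name to a list of (section, line_number, text) tuples.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE spec_line ("
        "package TEXT, section TEXT, line_number INTEGER, content TEXT)"
    )
    for package, rows in specs.items():
        conn.executemany(
            "INSERT INTO spec_line VALUES (?, ?, ?, ?)",
            [(package, section, number, text) for section, number, text in rows],
        )
    conn.commit()
    return conn


def search(conn, needle, section=None):
    """Return (package, section, line_number, content) rows containing needle."""
    # instr() is used instead of LIKE because '%' is a LIKE wildcard and
    # appears all over spec files.
    sql = ("SELECT package, section, line_number, content "
           "FROM spec_line WHERE instr(content, ?) > 0")
    params = [needle]
    if section is not None:
        sql += " AND section = ?"
        params.append(section)
    sql += " ORDER BY package, line_number"
    return conn.execute(sql, params).fetchall()
```

A CVS (or later git) hook would then only need to delete and re-insert the rows of the package whose spec file changed, which should keep the hook short.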

SPEC web page

(1 week) write the UI to have all of the different sections from the db united to form the original SPEC file with nice coloring (syntax highlighting?) and line numbers.
(.5 weeks) Database schema for specfile-comments (tie comment to line-number)
(1 week) Write WebUI for commenting on specfiles
(1.5 weeks) Integration with bugzilla - comments opened (by a privileged group like 'packager' or should the privileged group be the only ones allowed to comment?) should open a bugzilla ticket, and closed tickets should color the comment gray and make it unwriteable (make a new one or reopen).
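The comment and ticket lifecycle described above can be sketched as follows. Everything here is an assumption made for illustration: the class and function names are hypothetical, the 'packager' group check reflects only one of the two options still under discussion, and open_ticket is a caller-supplied function standing in for whatever Bugzilla client ends up being used.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SpecComment:
    package: str
    line_number: int          # the spec file line the comment is tied to
    author: str
    text: str
    ticket_id: Optional[int] = None
    closed: bool = False      # closed comments are shown gray and read-only
    replies: List[str] = field(default_factory=list)

    def reply(self, text):
        if self.closed:
            raise ValueError("comment is closed; reopen it or start a new one")
        self.replies.append(text)


def flag_as_issue(comment, author_groups, open_ticket):
    """Open a ticket for the comment once, if the author is a packager."""
    if "packager" not in author_groups:
        raise PermissionError("only packagers may flag comments as issues")
    if comment.ticket_id is None:
        summary = f"{comment.package}.spec line {comment.line_number}: {comment.text}"
        comment.ticket_id = open_ticket(comment.package, summary)
    return comment.ticket_id


def sync_ticket_status(comment, ticket_is_closed):
    """Mirror the ticket state onto the comment."""
    comment.closed = ticket_is_closed
```

Keeping the ticket opener as a parameter means the comment logic can be tested without talking to a Bugzilla server.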

Package Review

(.5 weeks) find a way to store comments/specfiles etc. for packages we don't have in the db yet (build a new not-yet-packaged table?)
(1 week) Make a list of MUST items in the database with links to the actual guideline paragraph, make each comment a possible link to one of the guidelines. (Q: Is there a way to guarantee that the pkgdb MUSTs and guideline links are updated every time the wiki pages are updated?)
(2 weeks) Make a list of MUST items somewhere on the SPEC page if the package is not yet in the pkgdb that can get checked individually by different reviewers. (Build db schema to allow tracking of responsibility for checked items? who checked what)
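One possible shape for the shared check-list, tracking who checked what. This is only a sketch: the class name is hypothetical and the item labels in the usage note are illustrative paraphrases, not the official wording of the review guidelines.

```python
class ReviewChecklist:
    """Check-list whose items can be ticked off by different reviewers."""

    def __init__(self, items):
        self._items = list(items)
        self._checked = {}  # item -> reviewer who checked it

    def check(self, item, reviewer):
        if item not in self._items:
            raise KeyError(item)
        self._checked[item] = reviewer

    def checked_by(self, item):
        """Return the reviewer who checked the item, or None."""
        return self._checked.get(item)

    def remaining(self):
        """Return unchecked items in their original order."""
        return [item for item in self._items if item not in self._checked]

    def is_complete(self):
        return not self.remaining()
```

For example, one reviewer could tick "spec is in American English" in a few minutes while another takes "license field matches the source" later; the package is ready once remaining() is empty.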

Total: 10.5 weeks

Convincing

I've been a Fedora Project contributor since May 2008, when I began working on the Fedora PackageDB with Toshio Kuratomi. I worked on small features, visible to the general Fedora user. The feature from that period I'm most proud of is an advanced package search.

Last summer I participated in GSOC working on a bunch of features for the pkgdb aimed at making it more user-centric, rather than developer-centric. These included importing metadata from yum, a new package page with commenting and tags and exporting those tags for future use in yum via sqlite. This work was later adapted and included into the 0.5.x series of the PackageDB after some design changes to PackageDB itself.

At the end of the summer I began contributing to the Fedora Project as a Package Maintainer and have contributed this way since then. I've also begun a personal project of a Romanian FOSS advertising network. More info on the project on its webpage (Romanian) or on github (code).

Me and the community

Impact

If your project is successfully completed, what will its impact be on the Fedora community? Give 3 answers, each 1-3 paragraphs in length. The first one should be yours. The other two should be answers from members of the Fedora community, at least one of whom should be a Fedora Summer Coding mentor. Provide email contact information for non-Summer Coding mentors.

Ionuț's answer

The Fedora community will have an improved way of handling package reviewing. Since Packaging is a fundamental part of the Fedora Project and the package review process is the most important way through which packages get accepted in Fedora, the impact will be high. Though only immediately visible by the Fedora developers, this project ultimately aims at improving the general quality of Fedora packages by making it easier for developers to review each other's work before and after the package has been accepted in the official repositories.

answer 2

answer 3

What will you do if you get stuck on your project and your mentor isn't around?
- I generally find my way out of most technical problems by googling, going deeper into the source code or just asking on the relevant IRC channel (#turbogears, #python, #yum etc.)
- For design problems, though these are less likely to happen, I know my way around the community mailing lists and IRC channels and generally know who's who. If my mentor is missing for several days, I'll probably start asking for advice on #fedora-admin and #fedora-devel.
In addition to the required blogging minimum of twice per week, how do you propose to keep the community informed of your progress and any problems or questions you might have over the course of the project?
- I'm constantly lurking on IRC in #fedora-devel and #fedora-admin and am also subscribed to the relevant mailing lists.

Miscellaneous

We want to make sure that you are prepared before the project starts
- Can you set up an appropriate development environment?
  - sure
- Have you met your proposed mentor and members of the associated community?
  - Yes, I've worked with my mentor before and have been part of the Fedora community for 2 years.
What is your t-shirt size?
- M
Describe a great learning experience you had as a child.
Is there anything else we should have asked you or anything else that we should know that might make us like you or your project more?

Note: you will post this application on the wiki in the category Category:Summer Coding 2010 applications. We encourage you to browse this category and comment on the talk page of other applications. Also, others' comments and your responses on the talk page of your own application are viewed favorably, and, while we don't like repetitive spam, we welcome honest questions and discussion of your project idea on the mailing list and/or IRC.

The NeL project has some good general recommendations for writing proposals. We encourage Summer Coding code to include tests.
