SummerOfCode/2007/DimitrisGlezos

Revision as of 14:28, 13 July 2009 by Raven

An upstream-friendly l10n Web UI for Fedora

  • Student: DimitrisGlezos
  • Mentor: KarstenWade
A notepad/scrapbook of this work in progress can be found at: ./Notes

Abstract

The most important process of the Fedora Localization Project is the translation of Fedora resources (applications, documentation, websites, etc.) in various languages. Contributors identify a resource that needs translation, receive the respective PO-file, translate it, and commit the changes to our Source Code Management (SCM) system.

The system currently used allows translations of resources hosted on the same system but not on other systems, such as the main SCM of Fedora or remote SCMs.

Our goal is to build a platform that will facilitate localization processes of the Fedora L10N Project and other l10n communities. I will pursue this by deploying a Web User Interface for translation statistics which will also give translators seamless access to upstream-hosted translation files. This automation will benefit both the Fedora L10N Project in terms of usability and the upstream projects in terms of translation completeness.

Detailed Description

Background

Localization (l10n) is an important process for any project targeting a wide audience. It is the process by which a resource (whether it is software, documentation, a service or a website) is adapted to a locale. The most common form of localization is the translation of the software user interface (UI) or documentation content into other languages.

In the world of open source software and documentation, the most popular library that makes l10n possible is gettext. Coders identify translatable content in their code and extract it into special PO-files, which are hosted along with the main code. Each language team is given access to, and maintains, the translations of a particular resource. Code maintainers usually need to oversee the progress of many translations, and translation teams maintain translations for many components and projects. To ease management, the community is usually provided with a translation statistics interface.
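As a rough illustration of what a translation statistics interface computes, the sketch below counts translated, fuzzy and untranslated entries in the text of a PO-file. It is a simplified stand-in, not the method used by any existing statistics tool: the function name `po_stats` and the sample data are hypothetical, and plural forms and obsolete entries are deliberately ignored.

```python
import re


def _unquote(text):
    """Return the content of a double-quoted PO string, or '' if not quoted."""
    text = text.strip()
    if len(text) >= 2 and text.startswith('"') and text.endswith('"'):
        return text[1:-1]
    return ""


def po_stats(po_text):
    """Count translated, fuzzy and untranslated entries in PO-file text.

    Simplifying assumptions: entries are separated by blank lines, and
    plural forms (msgid_plural, msgstr[n]) and obsolete entries are ignored.
    """
    stats = {"translated": 0, "fuzzy": 0, "untranslated": 0}
    for block in re.split(r"\n\s*\n", po_text.strip()):
        fuzzy = False
        fields = {"msgid": "", "msgstr": ""}
        current = None
        for line in block.splitlines():
            line = line.strip()
            if line.startswith("#,") and "fuzzy" in line:
                fuzzy = True
            elif line.startswith("#"):
                continue  # translator or extracted comment
            elif line.startswith("msgid "):
                current = "msgid"
                fields[current] = _unquote(line[len("msgid "):])
            elif line.startswith("msgstr "):
                current = "msgstr"
                fields[current] = _unquote(line[len("msgstr "):])
            elif line.startswith('"') and current:
                # Continuation line of a multi-line string
                fields[current] += _unquote(line)
        if not fields["msgid"]:
            continue  # header entry (empty msgid) or comment-only block
        if not fields["msgstr"]:
            stats["untranslated"] += 1
        elif fuzzy:
            stats["fuzzy"] += 1
        else:
            stats["translated"] += 1
    return stats


# Hypothetical sample: a header plus three messages (Greek translations).
SAMPLE = '''
msgid ""
msgstr "Content-Type: text/plain; charset=UTF-8\\n"

msgid "File"
msgstr "Αρχείο"

#, fuzzy
msgid "Edit"
msgstr "Επεξεργασία"

msgid "Help"
msgstr ""
'''

print(po_stats(SAMPLE))
```

A statistics page would typically aggregate counts like these per module and per language.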

Because translations are usually hosted with the source code, every translator is required to request access to every module on the SCM of every project he/she would like to contribute to. This is an unnecessary and time-consuming step for translators willing to contribute to projects hosted outside of their own SCM.

Objectives

This proposal is concerned with the following two objectives:

  1. Deploy an interface providing translation statistics for Fedora
  2. Provide a tool that enables seamless translation contributions to upstream projects

First of all, we would like to capitalize on already established projects and adopt them for our case instead of building everything from scratch. Also, the use of Python is a major plus, since it is the language of choice for the Fedora Project contributors.

Fedora currently uses a system provided by Red Hat for translation statistics. This system does not provide statistics for software hosted on Fedora's development server and furthermore is written in Perl, so we cannot re-use it. Our plan is to adopt Damned Lies (DL), a tool designed to provide translation statistics for GNOME. DL is written in Python and supports multiple languages, modules and release tags and could be used to give Fedora a Web User Interface to translations.

We also plan to establish a system which will help the Fedora L10N community to maintain translations and make it possible for it to "act as upstream" for remotely-hosted PO-files. We plan on deploying a tool which will act as a client to various SCMs of upstream projects: it will handle the authentication, checkout and checkin of the file for the translator.
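One possible shape for such a client is sketched below, purely as an illustration. The class names `ScmClient` and `CvsClient`, the method names and the example repository root are all assumptions made for this sketch and do not describe an existing design. Each SCM backend only builds the command line for checkout and checkin, so the commands can be inspected without contacting a server; `run` executes them with the standard `subprocess` module.

```python
import subprocess
from abc import ABC, abstractmethod


class ScmClient(ABC):
    """Hypothetical interface for fetching and submitting upstream PO-files."""

    @abstractmethod
    def checkout_cmd(self, module):
        """Return the argv list that checks out `module`."""

    @abstractmethod
    def checkin_cmd(self, path, message):
        """Return the argv list that commits `path` with `message`."""

    def run(self, argv, cwd=None):
        # Raises CalledProcessError if the SCM command fails.
        subprocess.run(argv, cwd=cwd, check=True)


class CvsClient(ScmClient):
    """CVS backend; `root` is a CVSROOT string identifying the repository."""

    def __init__(self, root):
        self.root = root

    def checkout_cmd(self, module):
        return ["cvs", "-d", self.root, "checkout", module]

    def checkin_cmd(self, path, message):
        return ["cvs", "-d", self.root, "commit", "-m", message, path]


# Hypothetical repository root and module name, for illustration only.
client = CvsClient(":pserver:translator@cvs.example.org:/cvs/project")
print(client.checkout_cmd("project/po"))
print(client.checkin_cmd("po/el.po", "Update Greek translation"))
```

Keeping command construction separate from execution would let other SCMs be added as further subclasses without changing the code that calls `run`.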

Importance

The Fedora L10N community consists of more than 2000 contributors who provide, on average, 1100+ commits every month for more than 84 languages. Their job is not easy: they manage translations for Fedora software, Fedora Documentation, Fedora software hosted outside our main source control management (SCM) system, and Fedora-related software hosted entirely remotely.

Most l10n communities face the same problem. The Ubuntu project uses Rosetta, a web-based translation tool currently not available as free software. The limitation of this tool (besides not being free) is that translations are not hosted upstream but are instead maintained specifically for Ubuntu. Language maintainers bear the burden of merging translations back into upstream projects, a process known to be problematic and time-consuming.

Deliverables

Part 1: Web UI for translation statistics

  • Working instance of DL for Fedora modules

Make it publicly available at http://l10n.fedoraproject.org/

  • Configuration files listing Fedora modules, languages and contributors

Scripts to automate as much as possible the creation of the config files (might require hook-up with the Fedora Account System)

Part 2: PO-fetching tool

  • Python classes to checkout and checkin files from a remote SCM

First stage: CVS. Many reusable classes are already available.

  • Enable use of this tool through the Web UI

Proposed schedule

  • May: Study DL structure
  • 28 May: Request from Fedora Infrastructure Project for hosting
  • 14 June: First prototype of DL for Fedora
  • 28 June: Full working Web UI for translation statistics
  • 9 July (mid-term): Deliver part 1 and overview of existing Python projects for remote SCM access
  • 16 July: Basic config files finalized for fetching tool
  • 30 July: Working prototype with proof-of-concept functionality
  • 13 August: Working tool for fedorahosted.org repository
  • 27 August: Hook-up with Web UI

Why me

My name is Dimitris Glezos and I'm currently pursuing a PhD at the University of Manchester. My research interests include service interoperability, information integration, fuzzy semantics, and information accessibility and usability. Prior to my research degree, I graduated from the Computer Engineering and Informatics department of the University of Patras, Greece.

I'm deeply involved in the Fedora Project: I am the maintainer of the Greek translations and hold a seat on the Fedora Documentation Steering Committee. Currently, one of my interests is opening up the translation process to enable more projects to be localized; it is in this context that I have proposed this project for GSoC. I also participate in various efforts to enhance the way contributions are accepted to the project, especially in the web front. I'm currently the maintainer of the system's default browser homepage.

I've been involved in web development for many years. I led the establishment of a major web system for distance learning and communication serving more than 23,000 students and professors at the Hellenic Open University, coordinated the development of a communications portal for my undergraduate department, and maintained a company product that managed complex multilingual websites.

If you'd like a copy of my curriculum vitae, please don't hesitate to contact me.
