Archive:L10N/Tools/DamnedLiesPlans

= Damned Lies Plans =

We have been using Damned Lies in Fedora for quite some time now, and we also have Transifex for submitting translations. These systems are completely separate, although they share some of the same concepts, such as modules and branches. Damned Lies is also very GNOME-specific (intltool and gnome-doc-utils) and doesn't support much of the Fedora way of doing things. On this page we document the changes we are making to Damned Lies and Transifex.

[[TableOfContents(3)]]

= Pointers =

== Damned Lies ==
cvs co web; cd web/flpweb
 * Gnome instance of Damned Lies
 * Gnome bugzilla for Damned Lies
 * Upstream repository for damned lies
 * Fedora instance repository for damned lies
 * export CVSROOT=:ext:FEDORA_USERNAME@cvs.fedoraproject.org:/cvs/l10n

== Transifex ==
hg clone http://hg.fedorahosted.org/hg/transifex
 * https://fedorahosted.org/transifex/

= Proposed Refactoring and Changes =

== Resource Builders ==
update-stats.py is a script that is responsible for updating the statistics for each module in Damned Lies. It has the following main functions:
 1. Check out or update each branch of a module
 2. Generate POT files for each branch of a module
 3. Merge existing PO files with newly generated POT files
 4. Generate statistics for each PO file and update the statistics database
 5. Provide info, warning and error messages on the above processes
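
The statistics step (step 4) could be sketched roughly as follows. This is only an illustration, assuming the counts are read from the summary line that `msgfmt --statistics` prints; the function name `parse_msgfmt_statistics` is hypothetical and is not taken from update-stats.py.

```python
import re

# msgfmt --statistics prints a summary such as:
#   "10 translated messages, 2 fuzzy translations, 3 untranslated messages."
# Parts with a zero count (other than "translated") are simply left out.
_STATS_RE = re.compile(r"(\d+) (translated|fuzzy|untranslated)")


def parse_msgfmt_statistics(line):
    """Return the message counts found in one msgfmt summary line.

    Hypothetical helper: the name and the returned dict layout are
    assumptions made for this sketch.
    """
    counts = {"translated": 0, "fuzzy": 0, "untranslated": 0}
    for number, kind in _STATS_RE.findall(line):
        counts[kind] = int(number)
    # The total is the sum of the three categories reported by msgfmt.
    counts["total"] = sum(counts.values())
    return counts
```

A caller would run msgfmt on each merged PO file, pass the summary line to this helper, and store the result in the statistics database.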

update-stats supports two different 'build systems', namely intltool and gnome-doc-utils. In Fedora, we don't use either of these much, but rely on other ways of building PO files. We wish to introduce the concept of a generic Builder that is responsible for providing the PO/POT resources. We will refactor  and extract the functionality to create a   and a , and also create builders for Fedora. These will be independent of the underlying version control system (see section on repository access below).

Rather than dealing directly with PO(T) files, we also introduce the concept of a  and ; these are roughly like PO/POT files, but we foresee having to localize other resource types in the future, for example images relating to documentation. The type of a  also determines how statistics are generated.

A  is instantiated on a specific   of a. The purpose of a Builder is to:
 1. Identify and provide the s for a module.
 1. Identify and provide the s for each
 1. Provide diagnostics on the,   and   level (information, warning and error messages)
 1. Provide Builder/Resource specific functionality, for example, a publican builder can build the translated documentation and provide a link to this in the interface.

Builders are linked to the existing use of the  and   elements within a module, where specific elements can trigger certain builders:

 ... (triggers IntltoolBuilder) ... (triggers GnomeDocBuilder)  ...  (triggers PublicanBuilder)

BuilderBase

 * Abstract base class inherited by specific builders.

IntltoolBuilder

 * Provides the functionality of the existing Damned Lies  builder
 * Uses intltool to manage PO(T) files

GnomeDocBuilder

 * Provides the functionality of the existing Damned Lies  builder

PublicanBuilder

 * Builds Publican translations
 * Multiple resources: one POT per XML file

SimplePoBuilder

 * Uses POT provided in the repository
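
To make the builder hierarchy more concrete, here is a minimal sketch of how BuilderBase and SimplePoBuilder might relate. It is not the planned implementation: the method names `get_templates` and `log`, the `messages` list and the constructor argument are all assumptions made for illustration.

```python
from abc import ABC, abstractmethod
from pathlib import Path


class BuilderBase(ABC):
    """Abstract base class for builders (method names are assumptions)."""

    def __init__(self, sandbox_dir):
        self.sandbox_dir = Path(sandbox_dir)
        self.messages = []  # collected (level, text) diagnostics

    def log(self, level, text):
        """Record an information, warning or error message."""
        self.messages.append((level, text))

    @abstractmethod
    def get_templates(self):
        """Return template paths relative to the sandbox directory."""


class SimplePoBuilder(BuilderBase):
    """Uses the POT files that are already present in the checkout."""

    def get_templates(self):
        found = sorted(
            path.relative_to(self.sandbox_dir)
            for path in self.sandbox_dir.rglob("*.pot")
        )
        if not found:
            self.log("warning", "no POT file found")
        return found
```

An IntltoolBuilder or PublicanBuilder would subclass BuilderBase in the same way, but generate its templates instead of looking them up.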

TODO:
 * E.g. Publican creates a directory for each language. How should builders handle this when we want to add a language? Should we use the Repository access methods to accomplish this (create the file structure and commit)?

== Resources and Templates ==
In Damned Lies, each  or   can only be associated with one POT file. To support e.g. Publican documentation, we wish to introduce the concept of Resources, where each Builder has a set of Resources associated with it.

A  has the following fields associated with it:
 * name (e.g. common-entities, for display purposes)
 * type (e.g. POT, image, ...)
 * filename (common-entities.pot)
 * path relative to the module root (./POT/)
 * List of locale-specific s
 * Any builder-specific information

In some projects, the templates are not stored in the sandbox (e.g. for intltool), but generated dynamically.

A  has the following fields associated with it:
 * foreign key
 * locale (e.g. pt-BR)
 * Any builder-specific information

PoResourceStatistics
 * resource_id
 * timestamp
 * total_count
 * translated_count
 * untranslated_count
 * fuzzy_count
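
As an illustration, the PoResourceStatistics fields listed above could be modelled as follows. A plain dataclass is used here only as a stand-in for whatever storage layer is eventually chosen; the two helper methods are assumptions and not part of the field list.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class PoResourceStatistics:
    """One statistics snapshot for a resource (fields as listed above)."""

    resource_id: int
    timestamp: datetime
    total_count: int
    translated_count: int
    untranslated_count: int
    fuzzy_count: int

    def is_consistent(self):
        """Hypothetical check: the three categories should add up to the total."""
        parts = self.translated_count + self.untranslated_count + self.fuzzy_count
        return self.total_count == parts

    def percent_translated(self):
        """Hypothetical helper for display; returns 0.0 for an empty resource."""
        if self.total_count == 0:
            return 0.0
        return 100.0 * self.translated_count / self.total_count
```

Keeping one row per timestamp, rather than overwriting the latest counts, is what would later allow the statistics history graphs mentioned in the TODO.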

TODO:
 * History
 * Statistics History (graphs showing the evolution of a resource)
 * Last Translator

== Repository Access ==
Damned Lies uses a class  for accessing version control repositories such as Cvs, Svn and Git. Fedora has split this up into,  , etc. We wish to further clean up the VCS access functions and create a common way of accessing and manipulating various version control system sandboxes.

 * A Repository represents a version control repository such as Svn, Cvs and Git.
 * A Module represents a specific module in a repository.
 * A Stream represents a specific branch/tag/path within a repository.
 * A Sandbox represents a local checked-out sandbox relating to a Stream.

Proposed Design:


 * has many es linked by branch_id
 * Convenience methods for doing operations on each branch through
 * Simply removes the sandbox dir, e.g.
 * Removes any local modifications to version-controlled files
 * Restores any deleted files
 * Removes any files not version controlled
 * copy  and   from kdesdk to   directory
 * File-based locking to prevent two processes from modifying the repo at the same time
 * release lock on repository
 * Open questions:
 * Support for working on a specific TAG or revision?
 * Systems have different concepts of this...
 * or  to evaluate which resources have changed...
 * File/Resource-level operations?
 * Allow commit of a single file?
 * How to create a file/directory in a VCS-agnostic way?
 * class?
 * from, call get_resource(relative_path), and then use methods to work on a specific File/Resource?
 * Any operation on a repository should leave the sandbox in a  (or  ) state.
 * call clean at the end of each task
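
The file-based locking mentioned in the list above could look roughly like this. It is a sketch under stated assumptions: the names `repository_lock` and `RepositoryLocked` are hypothetical, and the lock file is placed next to the sandbox directory (not inside it) so that cleaning or removing the sandbox does not delete the lock.

```python
import os
from contextlib import contextmanager


class RepositoryLocked(Exception):
    """Raised when another process already holds the lock (hypothetical name)."""


@contextmanager
def repository_lock(sandbox_dir):
    """Hold an exclusive lock file for the duration of a task."""
    # Assumption: the lock lives beside the sandbox, e.g. "module-HEAD.lock".
    lock_path = os.path.normpath(sandbox_dir) + ".lock"
    try:
        # O_CREAT | O_EXCL makes creation fail if the file already exists.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        raise RepositoryLocked(lock_path) from None
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(str(os.getpid()))  # record the owner for debugging
        yield lock_path
    finally:
        os.remove(lock_path)  # release the lock on the repository
```

A task would wrap its update, merge and commit steps in `with repository_lock(path):`, and call clean before leaving the block so the sandbox is left in a known state.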

== Data model ==
Damned Lies has four key concepts:
 * Teams
 * People
 * Release Sets
 * Modules

The data for these concepts are all stored in XML files and parsed at run-time.

For Transifex, we aim to move this model to SQLObject, but maintain compatibility with the XML model through import/export tools.

Even when we move the model to SQLObject, it is possible to manage the data using XML, simply by re-importing the data when the time-stamp of the XML file has changed, or manually using an update-script.
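
The timestamp-based re-import could be sketched as below. This is only an illustration: the class name `XmlBackedModel`, the flat one-element-per-record XML layout and the in-memory `records` list (standing in for the SQLObject tables) are all assumptions.

```python
import os
import xml.etree.ElementTree as ET


class XmlBackedModel:
    """Re-imports records from an XML file when its timestamp changes."""

    def __init__(self, xml_path):
        self.xml_path = xml_path
        self._loaded_mtime = None  # modification time of the last import
        self.records = []          # stand-in for the database tables

    def refresh(self):
        """Re-import if the file changed; return True if an import happened."""
        mtime = os.path.getmtime(self.xml_path)
        if mtime == self._loaded_mtime:
            return False
        root = ET.parse(self.xml_path).getroot()
        # Assumption: each child element of the root is one record whose
        # attributes carry the data.
        self.records = [dict(child.attrib) for child in root]
        self._loaded_mtime = mtime
        return True
```

Calling `refresh()` at the start of each request or cron run would keep the stored data in step with the XML, while the manual update script would simply force the same import.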

= Moving functionality to Transifex =

== Refactoring needed in Transifex ==

 * Module, merge with DL data model
 * Use common vcs access API