Archive:L10N/Tools/DamnedLiesPlans

From FedoraProject

It has been requested that this page be deleted.
Outdated content (we moved entirely to Transifex)


Damned Lies Plans

We have been using Damned Lies in Fedora for quite a while now, and we also have Transifex for submitting translations. These systems are completely separate, although they share some of the same concepts, such as modules and branches. Damned Lies is also very GNOME-specific (intltool and gnome-doc-utils) and doesn't support much of the Fedora way of doing things. On this page we document the changes we are making to Damned Lies and Transifex.


Pointers

Damned Lies

cvs co web
cd web/flpweb

Transifex

hg clone http://hg.fedorahosted.org/hg/transifex

Proposed Refactoring and Changes

Resource Builders

update-stats.py is the script responsible for updating the statistics for each module in Damned Lies. It has the following main functions:

  1. Check out or update each branch of a module
  2. Generate POT files for each branch of a module
  3. Merge existing PO files with newly generated POT files
  4. Generate statistics for each PO file and update the statistics database
  5. Provide info, warning and error messages on the above processes

update-stats.py supports two different 'build systems', namely intltool and gnome-doc-utils. In Fedora, we don't use either of these much, but rely on other ways of building PO files. We wish to introduce the concept of a generic Builder that is responsible for providing the PO/POT resources. We will refactor update-stats.py and extract the existing functionality into an IntltoolBuilder and a GnomeDocBuilder, and also create builders for Fedora (PublicanBuilder, SimplePoBuilder). These will be independent of the underlying version control system (see the section on repository access below).

Rather than dealing directly with PO(T) files, we also introduce the concepts of a Resource and a ResourceTemplate. These are roughly like PO and POT files respectively, but we foresee having to localize other resource types in the future, for example images relating to documentation. The type of a Resource also determines how statistics are generated.

A Builder is instantiated on a specific Branch of a Module. The purpose of a Builder is to:

  1. Identify and provide the ResourceTemplates for a module.
  2. Identify and provide the Resources for each ResourceTemplate.
  3. Provide diagnostics on the Module, ResourceTemplate and Resource level (information, warning and error messages).
  4. Provide Builder/Resource-specific functionality. For example, a Publican builder can build the translated documentation and provide a link to it in the interface.
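The responsibilities above could be sketched roughly as follows. This is only an illustration, not the planned implementation: the method names (get_templates, get_resources, report), the constructor arguments, and the assumption that PO files sit next to their POT file and are named after the locale are all hypothetical.

```python
import os
from abc import ABC, abstractmethod


class BuilderBase(ABC):
    """Hypothetical base class, instantiated on one branch of one module."""

    def __init__(self, module_name, branch_id, sandbox_dir):
        self.module_name = module_name
        self.branch_id = branch_id
        self.sandbox_dir = sandbox_dir
        self.messages = []  # collected (level, text) diagnostics

    def report(self, level, text):
        """Record an information, warning or error message."""
        self.messages.append((level, text))

    @abstractmethod
    def get_templates(self):
        """Return template paths relative to the sandbox root."""

    @abstractmethod
    def get_resources(self, template):
        """Return a {locale: path} mapping for one template."""


class SimplePoBuilder(BuilderBase):
    """Uses POT and PO files that are already stored in the repository."""

    def get_templates(self):
        found = []
        for root, _dirs, files in os.walk(self.sandbox_dir):
            for name in files:
                if name.endswith(".pot"):
                    full = os.path.join(root, name)
                    found.append(os.path.relpath(full, self.sandbox_dir))
        if not found:
            self.report("warning", "no POT file found in %s" % self.sandbox_dir)
        return sorted(found)

    def get_resources(self, template):
        # Assumption: PO files live next to the POT file as <locale>.po.
        po_dir = os.path.dirname(os.path.join(self.sandbox_dir, template))
        resources = {}
        for name in sorted(os.listdir(po_dir)):
            if name.endswith(".po"):
                full = os.path.join(po_dir, name)
                resources[name[:-3]] = os.path.relpath(full, self.sandbox_dir)
        return resources
```

A builder for intltool or Publican would implement the same two methods but generate its templates instead of looking them up.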

Builders are linked to the existing use of the <domain> and <document> elements within a module, where specific elements can trigger certain builders:

<branch id='trunk'>
<domain> ... </domain> (triggers IntltoolBuilder)
<document> ... </document> (triggers GnomeDocBuilder)
<publican-document> ... </publican-document> (triggers PublicanBuilder)
</branch>
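As a rough sketch of how elements could trigger builders, the mapping above can be held in a lookup table keyed on the element name. The enclosing module element, the function name and the decision to return builder names rather than instances are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Element name -> builder name, as listed in the example above.
BUILDER_FOR_ELEMENT = {
    "domain": "IntltoolBuilder",
    "document": "GnomeDocBuilder",
    "publican-document": "PublicanBuilder",
}


def builders_for_module(module_xml):
    """Return (branch_id, builder_name) pairs for a module definition."""
    pairs = []
    root = ET.fromstring(module_xml)
    for branch in root.iter("branch"):
        for element in branch:
            builder = BUILDER_FOR_ELEMENT.get(element.tag)
            if builder is not None:  # unknown elements are ignored
                pairs.append((branch.get("id"), builder))
    return pairs


# Hypothetical module definition used only to exercise the lookup.
EXAMPLE = """
<module id="example">
  <branch id="trunk">
    <domain/>
    <publican-document/>
  </branch>
</module>
"""

if __name__ == "__main__":
    for branch_id, builder in builders_for_module(EXAMPLE):
        print(branch_id, builder)
```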

BuilderBase

  • Abstract base class inherited by the specific builders.

IntltoolBuilder

  • Provides the functionality of the existing Damned Lies <domain> builder
  • Uses intltool to manage PO(T) files

GnomeDocBuilder

  • Provides the functionality of the existing Damned Lies <document> builder

PublicanBuilder

  • Builds Publican translations
  • Multiple resources: one POT per XML file

SimplePoBuilder

  • Uses POT provided in the repository

TODO:

  • Publican, for example, creates a directory for each language. How should builders handle this when we want to add a language? Should we use the Repository access methods to accomplish this (create the file structure and commit)?

Resources and Templates

In Damned Lies, each <domain> or <document> can only be associated with one POT file. To support e.g. Publican documentation, we wish to introduce the concept of Resources, where each Builder has a set of Resources associated with it.

A ResourceTemplate has the following fields associated with it:

  • name (e.g. common-entities, for display purposes)
  • type (e.g. POT, image, ...)
  • filename (common-entities.pot)
  • path relative to the module root (./POT/)
  • List of locale-specific Resources
  • Any builder-specific information

In some projects, the templates are not stored in the sandbox (e.g. for intltool), but generated dynamically.

A Resource has the following fields associated with it:

  • ResourceTemplate foreign key
  • locale (e.g. pt-BR)
  • Any builder-specific information
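The two field lists above might translate into plain Python classes along these lines. This is a sketch for discussion only: the add_resource helper and the use of dictionaries for builder-specific information are assumptions, and the "foreign key" is modelled as a simple object reference rather than a database column.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ResourceTemplate:
    name: str        # e.g. "common-entities", for display purposes
    type: str        # e.g. "POT", "image"
    filename: str    # e.g. "common-entities.pot"
    path: str        # relative to the module root, e.g. "./POT/"
    resources: List["Resource"] = field(default_factory=list)
    builder_info: Dict[str, str] = field(default_factory=dict)

    def add_resource(self, locale, **builder_info):
        """Create a locale-specific Resource linked to this template."""
        resource = Resource(template=self, locale=locale,
                            builder_info=dict(builder_info))
        self.resources.append(resource)
        return resource


@dataclass
class Resource:
    # Back-reference standing in for the ResourceTemplate foreign key.
    # Excluded from repr and comparison to avoid recursing through the cycle.
    template: ResourceTemplate = field(repr=False, compare=False)
    locale: str = ""  # e.g. "pt-BR"
    builder_info: Dict[str, str] = field(default_factory=dict)


if __name__ == "__main__":
    template = ResourceTemplate(name="common-entities", type="POT",
                                filename="common-entities.pot", path="./POT/")
    template.add_resource("pt-BR")
    print(template)
```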

PoResourceStatistics

  • resource_id
  • timestamp
  • total_count
  • translated_count
  • untranslated_count
  • fuzzy_count

TODO:

  • History
  • Statistics History (graphs showing the evolution of a resource)
  • Last Translator

Repository Access

Damned Lies uses a class ScmModule for accessing version control repositories such as CVS, SVN and Git. Fedora has split this up into CvsModule, SvnModule, etc. We wish to further clean up the VCS access functions and create a common way of accessing and manipulating sandboxes of the various version control systems.

  • A Repository represents a version control repository such as SVN, CVS or Git.
  • A Module represents a specific module in a repository.
  • A Stream represents a specific branch/tag/path within a repository.
  • A Sandbox represents a local checked-out sandbox relating to a Stream.

Proposed Design:

  • ScmModule
      • ScmSandbox get_sandbox(branch_id)
      • has many ScmSandboxes linked by branch_id
      • void clear(), clean(), update() ...
          • Convenience methods for doing operations on each branch through ScmSandbox
  • ScmSandbox
      • __init__(self, scmmodule, branch_id)
      • void checkout()
      • void update()
      • void clear()
          • Simply removes the sandbox dir, e.g. rm -Rf /vcs/git/anaconda.tip
      • void clean()
          • Removes any local modifications to version-controlled files
          • Restores any deleted files
          • Removes any files not version controlled
          • Copy cvs-clean and svn-clean from kdesdk to the ./bin/ directory
      • void lock()
          • File-based locking to prevent two processes from modifying the repo at the same time
      • void release_lock()
          • Releases the lock on the repository
  • Open questions:
      • Support for working on a specific TAG or revision?
          • Systems have different concepts of this...
      • commit(logMessage)?
      • diff(relative_path) or status() to evaluate which resources have changed...
      • File/Resource-level operations?
          • Allow commit of a single file?
          • How to create a file/directory in a VCS-agnostic way?
          • track_resource(relative_path)
          • ScmResource class?
              • From ScmSandbox, call get_resource(relative_path), and then use methods to work on a specific File/Resource?
  • Any operation on a repository should leave the sandbox in a clean() (or clear()) state.
      • Call clean() at the end of each task
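Part of the proposed design can be sketched without touching any particular version control system. The example below covers only get_sandbox(), clear(), lock() and release_lock(); the directory layout, the ".lock" file name, the base_dir argument and the SandboxLockedError exception are assumptions for illustration, and checkout(), update() and clean() are left out because they depend on the VCS.

```python
import os
import shutil


class SandboxLockedError(RuntimeError):
    """Raised when another process already holds the sandbox lock."""


class ScmModule:
    def __init__(self, name, base_dir):
        self.name = name
        self.base_dir = base_dir  # e.g. "/vcs/git" (hypothetical layout)
        self._sandboxes = {}      # branch_id -> ScmSandbox

    def get_sandbox(self, branch_id):
        """Return the sandbox for a branch, creating the object on first use."""
        if branch_id not in self._sandboxes:
            self._sandboxes[branch_id] = ScmSandbox(self, branch_id)
        return self._sandboxes[branch_id]


class ScmSandbox:
    def __init__(self, scmmodule, branch_id):
        self.scmmodule = scmmodule
        self.branch_id = branch_id
        # Assumed naming scheme: <base_dir>/<module>.<branch>
        self.path = os.path.join(scmmodule.base_dir,
                                 "%s.%s" % (scmmodule.name, branch_id))
        self.lock_path = self.path + ".lock"

    def lock(self):
        """Create the lock file atomically; fail if it already exists."""
        try:
            fd = os.open(self.lock_path,
                         os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        except FileExistsError:
            raise SandboxLockedError(self.lock_path)
        with os.fdopen(fd, "w") as handle:
            handle.write(str(os.getpid()))  # record the owner for debugging

    def release_lock(self):
        """Remove the lock file; do nothing if it is already gone."""
        try:
            os.remove(self.lock_path)
        except FileNotFoundError:
            pass

    def clear(self):
        """Remove the whole sandbox directory."""
        shutil.rmtree(self.path, ignore_errors=True)
```

Creating the lock file with O_CREAT and O_EXCL makes the existence check and the creation a single step, so two processes cannot both believe they hold the lock. A stale lock left behind by a crashed process would still need manual cleanup in this sketch.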

Data model

Damned Lies has four key concepts:

  • Teams
  • People
  • Release Sets
  • Modules

The data for these concepts are all stored in XML files and parsed at run-time.

For Transifex, we aim to move this model to SQLObject, but maintain compatibility with the XML model through import/export tools.

Even when we move the model to SQLObject, it is possible to manage the data using XML, simply by re-importing the data when the time-stamp of the XML file has changed, or manually using an update-script.
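The time-stamp check could look something like the sketch below. The class name and the import_func callback are hypothetical; the callback stands in for whatever routine would load the XML data into the database, which is not shown here.

```python
import os


class XmlImportTracker:
    """Re-run an import whenever an XML file's modification time changes."""

    def __init__(self, import_func):
        self.import_func = import_func  # hypothetical callable taking a path
        self.last_mtime = {}            # path -> mtime seen at last import

    def refresh(self, xml_path):
        """Import xml_path if it changed; return True if an import ran."""
        mtime = os.path.getmtime(xml_path)
        if self.last_mtime.get(xml_path) == mtime:
            return False
        self.import_func(xml_path)
        # Remember the time-stamp only after a successful import.
        self.last_mtime[xml_path] = mtime
        return True
```

In this sketch the remembered time-stamps live in memory only; a long-running service or an update script would need to persist them between runs.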

Moving functionality to Transifex

Refactoring needed in Transifex

  • Module, merge with DL data model
  • Use common vcs access API