From Fedora Project Wiki


Revision as of 08:18, 2 June 2016 by Jaaf64 (talk | contribs)

This page is a work in progress.
This page describes how to set up and use a Python tool that detects typographic faults in translation files.

Description of the tool

User Interface

This tool is a Python script available at https://github.com/jaaf/po_purifier. It scans a directory for .po files. For each file, it checks the translated messages against typographic rules that reside in a configuration file named typorules.py. Each time a typographic rule is not satisfied, the program stops and asks the user what to do. Figure 1 below shows what this looks like:

Figure 1: Typographic Fault Detected
  • The message to the user, shown here in English, normally appears in the user's language, provided that the program has been localized. It has two parts:
    • The first part tells the user that a typo rule has been infringed and that they must decide whether or not to apply a change (it is part of the program and has to be localized).
    • The second part is the typo rule itself (it belongs to the typorules.py file).
  • In this case, the French typo rule requires a narrow no-break space between a value and its unit, and the location of the fault is shown with a green highlight.
  • To help the user, the message is shown twice: first with the various spaces colorized according to their type, then with the typo fault highlighted.
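
The kind of check illustrated in Figure 1 can be sketched as follows. This is a minimal sketch and not the code of po_typo_purifier.py: the unit list, the function names, the regular expression, and the ANSI color codes are all assumptions made for illustration, and the actual rules live in typorules.py, whose format is not shown on this page.

```python
import re

NNBSP = "\u202f"  # NARROW NO-BREAK SPACE, required by the French rule

# Hypothetical unit list, for illustration only.
UNITS = ("Go", "Mo", "ko", "km", "kg", "ms", "%")
_UNIT_ALT = "|".join(re.escape(u) for u in UNITS)

# A digit, then a wrong separator (nothing, a plain space, or a
# no-break space U+00A0), then a unit not followed by a word character.
VALUE_UNIT = re.compile(r"(\d)([ \u00a0]?)(" + _UNIT_ALT + r")(?!\w)")

GREEN_BG = "\033[42m"  # ANSI escape: green background
RESET = "\033[0m"      # ANSI escape: reset attributes


def find_faults(msgstr):
    """Return the (start, end) span of every value/unit fault."""
    return [m.span() for m in VALUE_UNIT.finditer(msgstr)]


def highlight(msgstr):
    """Wrap every fault in a green background for terminal display."""
    return VALUE_UNIT.sub(lambda m: GREEN_BG + m.group(0) + RESET, msgstr)


def fix(msgstr):
    """Put a narrow no-break space between each value and its unit."""
    return VALUE_UNIT.sub(lambda m: m.group(1) + NNBSP + m.group(3), msgstr)


if __name__ == "__main__":
    sample = "Taille : 10 Go"
    print(highlight(sample))
    print(repr(fix(sample)))
```

A message that already uses the narrow no-break space yields no fault, so the program would move on without asking the user anything.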

Figure 2 below shows what happens after the user has accepted the change.

Figure 2: Typographic Correction Accepted
  • The message Change accepted is displayed.
  • The corrected message is displayed in blue.
  • The program then informs the user that it has left some messages unchanged because no typo faults were detected in them. It continues to do so until the next fault is detected.

Figure 3 below shows a case where the user could use the c (for prior change) option. Here, a hyphen has been used in place of a semi-em dash (en dash). In French, the spacing rules for the hyphen and the semi-em dash differ: a hyphen takes no space between the preceding and the following word, while a semi-em dash takes a space on both sides. Replacing the hyphen with a semi-em dash appears to be the best solution here.

Figure 3: The user should replace the hyphen with a semi-em dash
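
The replacement suggested for Figure 3 can be approximated in a few lines. This is only a sketch under stated assumptions: it treats any hyphen with whitespace on both sides as a misused dash, the function name is hypothetical, and the tool itself leaves this decision to the user through the c option rather than applying it automatically.

```python
import re

EN_DASH = "\u2013"  # EN DASH, the "semi-em dash" of the text above

# A hyphen with whitespace on both sides (plain, no-break, or narrow
# no-break space). Hyphens inside words such as "peut-être" do not match.
SPACED_HYPHEN = re.compile(r"(?<=\s)-(?=\s)")


def replace_spaced_hyphens(msgstr):
    """Replace spaced hyphens with en dashes, keeping the spaces as they are."""
    return SPACED_HYPHEN.sub(EN_DASH, msgstr)


if __name__ == "__main__":
    print(replace_spaced_hyphens("Fedora - un système libre, peut-être"))
```

The surrounding spaces are kept untouched on purpose, since which kind of space the French rule expects around the dash is a matter for the rules in typorules.py.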

The following figures show how this process unfolds.

Figure 4
Figure 5
Figure 6
Figure 7

Getting the tool

If you don't plan to improve the program, just go to the GitHub page and use the green Clone or download button to download the po_purifier-master.zip file. Once it is unzipped, you should have a po_purifier-master folder. Inside this folder is a po_purifier folder that contains the program po_typo_purifier.py. Alongside this po_purifier folder is a fr folder, which contains a set of .po files provided for testing purposes. The best approach is to replace this folder with the folder that contains your translation files, naming it after your country code.
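
To check that your translation folder is in place, you can list the .po files it contains. The sketch below is not part of po_purifier: the function name is hypothetical, and searching subfolders recursively is an assumption, since this page does not say whether the tool does so.

```python
from pathlib import Path


def find_po_files(root):
    """Return every .po file under root, sorted by path (recursive search)."""
    return sorted(Path(root).rglob("*.po"))


if __name__ == "__main__":
    # "fr" is the test folder shipped with the tool; use your own folder name.
    for po_file in find_po_files("fr"):
        print(po_file)
```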