From Fedora Project Wiki
Revision as of 13:16, 8 May 2016

Updated for

This page has been updated for Fedora 23.

Context

While reviewing translations on Zanata, the reviewer may find that the translator has made the same mistake repeatedly. This can happen for various reasons, such as:

  • The translator overlooking details that are not easily visible while editing (e.g. double spaces)
  • The translator not knowing a grammar or punctuation rule, which leads to the same error being repeated (e.g. in French, the double punctuation marks – : ; ! ? – should be preceded by a narrow no-break space, unlike in English)
  • A plain translation error for a recurring word
  • Any other rule unknown to the translator

In such a situation, Zanata's search and replace functionality is not of great help. To search and replace repeated mistakes, the reviewer has to pull the translated files from Zanata, fix them with standard OS tools and, finally, push the modified files back to Zanata.

The present page explains the different stages to accomplish this efficiently.

Note: This page assumes the reviewer is using Fedora as OS – see the "Updated for" section above.

Setting up the Pull and Push Tool

Installing the Zanata Command Line Client

The necessary command line tool to pull and push from Zanata is the Zanata command line client (CLI) available in the zanata-client package. Please install it with:

su -c 'dnf -y install zanata-client' 

Configuring the Zanata Client

User Configuration

To allow the user to be authenticated by the Zanata server, zanata-cli presents the user's credentials it finds in the ~/.config/zanata.ini file to the server. Thus, the first thing to do is to create this ~/.config/zanata.ini file and add the user's credentials to it. Fortunately, the Zanata server provides a very convenient way of doing so.

To create your configuration file:

  1. Use your favorite text editor to create the ~/.config/zanata.ini file.
  2. Log in to the Zanata server and navigate to the user's Settings page – click on the user's avatar in the top-right corner and, in the drop-down menu that opens, choose Settings.
  3. Click the <> Client link.
  4. Ensure an API key is displayed. If not, click on the Generate API Key button to create one.
  5. Copy the content of the Configuration[zanata.ini] text-box.
  6. Paste the copied lines into the ~/.config/zanata.ini file and save it.
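
Since the file holds an API key, it may be worth restricting who can read it. The following is a minimal sketch, assuming a standard Fedora shell with GNU coreutils. The chmod step is a suggestion of this sketch, not something the Zanata client is known to require.

```shell
# Create the configuration directory and an empty zanata.ini if they do not exist yet.
mkdir -p "$HOME/.config"
touch "$HOME/.config/zanata.ini"

# Assumption: the API key should be readable and writable by its owner only.
chmod 600 "$HOME/.config/zanata.ini"
```

The lines copied from the Zanata Settings page can then be pasted into the file with any editor.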

Project Version Configuration

Note: These steps must be repeated for each project-version before using any zanata-cli commands for the project-version.

Project configuration stores information specific to a project-version in the ~/<project dir path>/zanata.xml file – where <project dir path> should be replaced by the name of the project version's local directory. It shortens the zanata-cli commands by providing default values for their options.

Downloading the Configuration File

The ~/<project dir path>/zanata.xml file can be customized after a base for it has been downloaded from the Zanata server. To download this base configuration, log in to the server and do the following:

  1. Navigate to the project version page – the one that presents a list of languages for the project version.
  2. Click on the v … link just below the user's avatar in the top-right part of the page.
  3. Click on the Download Configuration File link and save the file into your ~/<project dir path> folder.

The downloaded file should look like the following:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<config xmlns="http://zanata.org/namespace/config/">
  <url>https://fedora.zanata.org/</url>
  <project>fedora-installation-guide</project>
  <project-version>f23</project-version>
  <project-type>podir</project-type>

</config>

Customizing the Configuration File

Locale

When the Zanata CLI is run without any locale indication, the po and pot files of every language recognized by Zanata are downloaded. To apply a locale indication to every command, the zanata.xml file of the project version can be customized. To limit the download to your locale, add the following lines inside the <config> XML element of your zanata.xml file:

<locales>
  <locale><YOURLOCALE></locale>
</locales>

where <YOURLOCALE> has to be replaced with your language code (e.g. fr).
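
For reference, a customized file might look like the one written by the sketch below. It assumes the fedora-installation-guide f23 example shown earlier on this page and the fr locale, and it writes to a scratch directory rather than to a real project directory.

```shell
# Write a complete example zanata.xml into a scratch directory.
# The project, version, and "fr" locale are the examples used on this page.
project_dir="$(mktemp -d)"
cat > "$project_dir/zanata.xml" <<'EOF'
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<config xmlns="http://zanata.org/namespace/config/">
  <url>https://fedora.zanata.org/</url>
  <project>fedora-installation-guide</project>
  <project-version>f23</project-version>
  <project-type>podir</project-type>
  <locales>
    <locale>fr</locale>
  </locales>
</config>
EOF

# Quick sanity check: list the locale entries that have both an opening and a closing tag.
grep '<locale>.*</locale>' "$project_dir/zanata.xml"
```

The grep line is only a rough check for the closing tags; it does not validate the XML.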

Performing Mass Search and Replace in the Translated Files

Note: All the commands given from here on assume your current directory is <LOCALE_CODE>/ under the project version directory, where <LOCALE_CODE> has to be replaced with your language code (e.g. fr).
Warning: All the commands below modify files in place and are therefore rather dangerous. Make sure you are in the right directory and that the command contains no typo before running it.
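
One way to limit the risk is to keep an untouched copy of the locale directory and compare it with the result afterwards. The sketch below illustrates the idea in a scratch directory, assuming GNU sed and diff. The file name sample.po and the misspelled word are hypothetical.

```shell
# Demo in a scratch directory; "fr" stands in for your <LOCALE_CODE> directory.
work="$(mktemp -d)"
mkdir "$work/fr"
printf 'msgstr "Bonjuor le monde"\n' > "$work/fr/sample.po"

# 1. Keep an untouched copy next to the locale directory.
cp -r "$work/fr" "$work/fr.orig"

# 2. Run a replacement (here: fix a recurring misspelling).
sed -i 's/Bonjuor/Bonjour/g' "$work/fr/sample.po"

# 3. Review what changed; diff exits with status 1 when the trees differ.
diff -ru "$work/fr.orig" "$work/fr" || true
```

If the diff shows unwanted changes, the copy can simply be restored over the modified directory.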

General Form of the Commands and Explanations

The general form of the search and replace commands is as follows:

grep -l '<STRING_TO_SEARCH_FOR>' ./* | xargs sed -i 's/<STRING_TO_SEARCH_FOR>/<SUBSTITUTE_STRING>/g'
  • '<STRING_TO_SEARCH_FOR>' is the pattern for grep
  • the -l argument to grep stands for --files-with-matches and makes the grep command print only the names of files containing matches
  • the ./* argument to grep limits the search to the files in the current directory
  • the xargs command passes the file names printed by grep as arguments to the sed command
  • the sed command performs the replacement
  • the g option to the sed command tells it to replace every match on each line, not only the first one
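
Before changing anything in place, the same pipeline can be used to preview the effect. The following sketch, assuming GNU grep, sed and xargs, works on two hypothetical sample files in a scratch directory and replaces the word fichier with document.

```shell
# Demo in a scratch directory with two sample files (hypothetical content).
work="$(mktemp -d)"
printf 'msgstr "Le fichier est ouvert"\n' > "$work/a.po"
printf 'msgstr "Aucun changement"\n' > "$work/b.po"

(
  cd "$work" || exit 1

  # Step 1: list the matching lines with file names and line numbers, changing nothing.
  grep -n 'fichier' ./*

  # Step 2: print what sed would produce for the matching lines (no -i, so no file is touched).
  grep -l 'fichier' ./* | xargs sed -n 's/fichier/document/gp'

  # Step 3: apply the change in place once the preview looks right.
  grep -l 'fichier' ./* | xargs sed -i 's/fichier/document/g'
)
```

With GNU xargs, adding the -r option prevents sed from being run at all when grep finds no matching file.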

Searching Multiple Spaces and Replacing with a Unique Space

grep -l '  ' ./* | xargs sed -i 's/  \+/ /g'
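
A small demonstration on a hypothetical sample file, assuming GNU grep and sed. The pattern used here requires at least two consecutive spaces, so files that contain only single spaces are not listed.

```shell
# Demo in a scratch directory; sample.po is a hypothetical file.
work="$(mktemp -d)"
printf 'msgstr "Un   texte  avec   trop"\n' > "$work/sample.po"

# Collapse every run of two or more spaces into a single space.
( cd "$work" && grep -l '  ' ./* | xargs sed -i 's/  \+/ /g' )

# Show the result.
cat "$work/sample.po"
```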

Ensuring a No-Break Space Is Inserted Before Double Punctuation Marks (French Language)

In French, the double punctuation marks (? ! ; :) require a narrow no-break space between them and the preceding word. To ensure this rule is always applied, use the following command:

grep -l '[^[:space:]][?!;:]' ./* | xargs sed -i 's/\([a-zA-Z0-9 ]\)\([?!;:]\)/\1 \2/g'

Explanation:

  • the first term of the substitute command in sed consists of 2 groups: ([a-zA-Z0-9 ]), which matches any alphanumeric character or a space (invisible here), and ([?!;:]), which matches one of the 4 double punctuation marks. The parentheses must be escaped.
  • the second term is group 1, a narrow no-break space (invisible here), and group 2
Note: If your keyboard doesn't let you type the narrow no-break space directly, press Ctrl + Shift + u. This should display a u in your editor; then type the Unicode code point of the narrow no-break space, which is 202F, and release all the keys.
Note: Very few fonts provide the narrow no-break space (U+202F), so it may be replaced by the regular no-break space (U+00A0).
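
Because the narrow no-break space is invisible in a command line, it can help to build it explicitly from its UTF-8 bytes. The sketch below is a variant of the substitution described above, not the original command. It assumes GNU grep and sed and UTF-8 encoded files, restricts the first group to alphanumeric characters, and uses a hypothetical sample file.

```shell
# UTF-8 bytes of U+202F NARROW NO-BREAK SPACE, written in octal so the character is explicit.
nnbsp="$(printf '\342\200\257')"

work="$(mktemp -d)"
printf 'msgstr "Vraiment? Oui!"\n' > "$work/sample.po"

# Group 1 is an alphanumeric character, group 2 a double punctuation mark;
# the replacement is group 1, U+202F, group 2.
( cd "$work" && grep -l '[a-zA-Z0-9][?!;:]' ./* \
    | xargs sed -i 's/\([a-zA-Z0-9]\)\([?!;:]\)/\1'"$nnbsp"'\2/g' )

# Dump the bytes to confirm that 342 200 257 now precedes each punctuation mark.
od -c "$work/sample.po"
```

Like the command above, this sketch rewrites every matching line. That includes msgid lines and colons inside URLs such as http://, so previewing the matches with grep -n before applying the change is advisable.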