Dealing with Publican po files

Latest revision as of 15:47, 21 November 2009

Publican produces one pot file per xml file, rather than the single pot file per document that the old docs tools produced. Merging those pots with msgcat (msgcat -o output.pot *.pot) is no particular problem, but when l10n comes back with the results, you are faced with converting one po file per language back into one po file per language per xml file.

It turns out this is not terribly difficult to deal with. First, create a file in the po directory, LINGUAS, that lists the languages you want to process. Each entry must match the name of a po file, so de stands for po/de.po. Then:

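#!/bin/bash
#
# Turn all single PO into separate PO for Publican
# * Relies on po/LINGUAS to be correct!
# * Make sure you're in the top of the module with the 'po/' folder.
#
# LANGCODE is used rather than LANG so the locale variable is left alone.
for LANGCODE in $(cat po/LINGUAS); do
    # Create the per-language output directory if it does not exist yet
    mkdir -p "${LANGCODE}"
    for POTFILE in pot/*.pot; do
        msgmerge "po/${LANGCODE}.po" "${POTFILE}" | msgattrib --no-obsolete \
                 > "${LANGCODE}/$(basename "${POTFILE}" .pot).po"
    done
done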
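
If po/LINGUAS does not exist yet, it can probably be generated from the po files that came back from l10n. The following is only a sketch under that assumption: the function name make_linguas is made up for this example and is not part of Paul's script.

```shell
#!/bin/bash
# Hypothetical helper (not from the original page): write one language code per
# line to <po_dir>/LINGUAS, derived from the *.po file names in that directory.
make_linguas() {
    local po_dir="${1:-po}"
    local po_file
    for po_file in "${po_dir}"/*.po; do
        # Skip the literal pattern when the directory holds no .po files
        [ -e "${po_file}" ] || continue
        basename "${po_file}" .po
    done > "${po_dir}/LINGUAS"
}
```

Run make_linguas po from the top of the module, then check the resulting list by hand before starting the split.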

Finally, rename the folders to the correct ISO-compliant names (for example, de becomes de-DE).

The above was Paul's suggestion. However, on reflection, since we need to do a lot of renaming as well, I wonder if we shouldn't do away with po/LINGUAS altogether and do something like:

#!/bin/bash
#
# Turn all single PO into separate PO for Publican
# * Make sure you're in the top of the module with the 'po/' folder.
#
# YOU MUST ADJUST THE FOLLOWING TWO ARRAYS TO REFLECT THE LANGUAGES YOU
# HAVE, AND DON'T FORGET THE COUNT
#
TFXLAN=(as    bn_IN ca    cs    da    de    el    es    fi    fr    gu    he    hi    hr    hu    id    it    ja    kn    ko    ml    mr    ms    nb    nl    or    pa    pl    pt_BR pt    ru    sk    sr_Latn    sr    sv    ta    te    uk    zh_CN zh_TW)
PUBLAN=(as-IN bn-IN ca-ES cs-CZ da-DK de-DE el-GR es-ES fi-FI fr-FR gu-IN he-IL hi-IN hr-HR hu-HU id-ID it-IT ja-JP kn-IN ko-KR ml-IN mr-IN ms-MY nb-NO nl-NL or-IN pa-IN pl-PL pt-BR pt-PT ru-RU sk-SK sr-Latn-RS sr-RS sv-SE ta-IN te-IN uk-UA zh-CN zh-TW)
for NUM in {0..39} ; do
    mv "po/${TFXLAN[${NUM}]}.po" "po/${PUBLAN[${NUM}]}.po"
    # -p keeps a re-run from failing when the directory already exists
    mkdir -p "${PUBLAN[${NUM}]}"
    for POTFILE in pot/*.pot; do
        msgmerge "po/${PUBLAN[${NUM}]}.po" "${POTFILE}" | msgattrib --no-obsolete \
                 > "${PUBLAN[${NUM}]}/$(basename "${POTFILE}" .pot).po"
    done
done

This way we wouldn't forget to do all that renaming, and the renaming would be guaranteed consistent with the languages.
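
One possible variation, offered only as a sketch, keeps each old code and its Publican name together in a single list, so the two arrays cannot drift apart and there is no count to maintain. The names LANGMAP and list_renames are invented for this example, and the list is shortened to three of the languages above.

```shell
#!/bin/bash
# Sketch only: pair each translation-system code with its Publican name in one
# list, so there is no separate count to maintain. Shortened to three entries.
LANGMAP="de:de-DE ja:ja-JP zh_CN:zh-CN"

# Hypothetical helper: print "old new" for every pair in the list given as $1.
list_renames() {
    local pair
    # $1 is deliberately unquoted so the list splits on whitespace
    for pair in $1; do
        # Text before the first ':' is the old code, text after it the new name
        echo "${pair%%:*} ${pair#*:}"
    done
}

list_renames "${LANGMAP}"
```

In a real script the loop body would run the mv, mkdir and msgmerge steps with those two values instead of printing them.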

At this point, you probably want to copy the text in PUBLAN between ( and ) and paste it into your Makefile after OTHER_LANGS = .
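
Rather than copying by hand, the line could also be printed by the shell. This is a minimal sketch; the function name other_langs_line is hypothetical, and the three languages stand in for the full list.

```shell
#!/bin/bash
# Hypothetical helper: print a Makefile assignment listing every argument.
other_langs_line() {
    echo "OTHER_LANGS = $*"
}

# Shortened example; in the script above you would pass "${PUBLAN[@]}" instead.
other_langs_line de-DE ja-JP zh-CN
```

The printed line can then be pasted into the Makefile as it stands.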

The other question still to be answered is whether PUBLAN should always contain a location code. Publican is perfectly happy without it (well, almost happy) and some on fedora-docs-list have made a case for leaving it off in some cases.
