From Fedora Project Wiki

Revision as of 06:40, 28 April 2010 by Nkumar (talk | contribs)

Author: Naveen Kumar

Internationalization (i18n) refers to an application's or package's support for multiple languages. This support comes from generalizing the application or package so that it can be localized into different languages.

Localization (l10n) refers to the process of adapting, translating or customising that application or package for a particular locale.

A locale is a set of information corresponding to a given language and country. A software application (or operating system) uses locale information to exhibit localised behaviour, such as displaying the application's text in the local language or following other locale conventions: localized date and currency formats, color conventions, etc.

This tutorial covers i18n and l10n only with respect to text.

The Gettext framework is one approach to text i18n. It is a collection of tools used to internationalize and localize an application or package. Apart from internationalization, these tools assist in translating the strings on menus, message boxes or icons into the language that the user is interested in.

For detailed information on text internationalization, refer to the Gettext manual.

We assume that you use Emacs. To open a terminal inside it, do the following:

  • Run emacs
  • Type ALT+x
  • Type ansi-term in the lower window
  • Press the Return key twice


Development Environment

To internationalize an application we need a set of development tools. This is a one-time setup, done by running the following commands from a system administration (root) account:

yum install  @development-tools
yum groupinstall  <langname>-support

The <langname> above refers to the name of your language. For Hindi I would write something like:

yum groupinstall  hindi-support

Hello World

Let us write our first Hello World program:

#include<stdio.h>

int main()
{
    printf("Hello World\n");
    return 0;
}

The output generated by this program is entirely in English. To make it localizable in different languages, we need to generalize/internationalize it so that, when a user selects a particular locale, the application switches its strings/output to the language described by that locale. For example, if I select the locale hi_IN.UTF-8 (hi->Hindi; IN->India; encoding->UTF-8), the output/strings of this application should be displayed in Hindi.


Internationalizing Hello World

To generalize/internationalize this "Hello World" program, set the locale at startup, bind a message domain, and wrap each user-visible string in a gettext call. Save the following as helloworld.c:

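#include<libintl.h>
#include<locale.h>
#include<stdio.h>

/* Mark translatable strings with _() so that xgettext can find them. */
#define _(String) gettext (String)

int main()
{
    setlocale(LC_ALL,"");                              /* use the locale from the environment */
    bindtextdomain("helloworld","/usr/share/locale");  /* where the .mo catalogs are installed */
    textdomain("helloworld");                          /* default message domain for gettext() */
    printf(_("Hello World\n"));
    return 0;
}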

Create a new directory named po/:

mkdir po/


Extract the translatable strings into a POT file (helloworld.pot) using the following command:

xgettext -d helloworld -o po/helloworld.pot -k_ -s helloworld.c

A new file helloworld.pot will be created inside directory po/

PO(T) files

helloworld.pot

# SOME DESCRIPTIVE TITLE.
# Copyright (C) YEAR THE PACKAGE'S COPYRIGHT HOLDER
# This file is distributed under the same license as the PACKAGE package.
# FIRST AUTHOR <EMAIL@ADDRESS>, YEAR.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: PACKAGE VERSION\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2010-04-27 17:42+0530\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language-Team: LANGUAGE <LL@li.org>\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=CHARSET\n"
"Content-Transfer-Encoding: 8bit\n"

#: helloworld.c:13
#, c-format
msgid "Hello World\n"
msgstr ""

Change into po/ and create a directory named after your language. The name should be the two-letter ISO 639-1 code for your language, or the three-letter ISO 639-2 code if no two-letter code exists. Use http://www.loc.gov/standards/iso639-2/php/code_list.php for reference. A directory with the same name should also exist under /usr/share/locale. For Hindi I would do this:

mkdir hi/
cp helloworld.pot hi/helloworld.po

Open hi/helloworld.po in an editor of your choice and translate it in the following manner:

# Hello World Localization.
# Copyright (C) 2010 Naveen Kumar
# This file is distributed under the same license as the PACKAGE package.
# Naveen Kumar <nkumar@redhat.com>, 2010.
#
msgid ""
msgstr ""
"Project-Id-Version: helloworld 1.0\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2010-04-27 18:31+0530\n"
"PO-Revision-Date: 2010-04-27 18:53+0530\n"
"Last-Translator: Naveen Kumar <nkumar@redhat.com>\n"
"Language-Team: Hindi <LL@li.org>\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=UTF-8\n"
"Content-Transfer-Encoding: 8bit\n"

#: helloworld.c:13
#, c-format
msgid "Hello World\n"
msgstr "नमस्कार दुनिया\n"

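Compile the translated PO file into a binary MO file with msgfmt; the MO file is what gettext reads at run time. Then, as root, copy the MO file to the LC_MESSAGES directory for your language under /usr/share/locale. For Hindi I would do something like this:

msgfmt -o helloworld.mo hi/helloworld.po
cp helloworld.mo /usr/share/locale/hi/LC_MESSAGES/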

Back in the directory that contains helloworld.c, compile your C file:

gcc -o helloworld helloworld.c

Select the Hindi locale and run the program:

LANG=hi_IN
./helloworld

You should see the message (Hello World) appear in your local language:

[nkumar@localhost]$ LANG=hi_IN
[nkumar@localhost]$ ./helloworld 
नमस्कार दुनिया
