From Fedora Project Wiki


Revision as of 13:30, 28 July 2008 by Vga (talk | contribs)

A page of the Fonts Special Interest Group

Background information
See the Wikipedia page on the Romanian alphabet.
Tip of the page
A very useful program is fontmatrix: it shows detailed font information and offers one-click font installs. Yum it!

Fonts

Below we summarize some common problems and suggested ways to fix them; the impatient can jump to the status matrix.

Types of problems encountered

Missing glyphs in Unicode-encoded fonts

These can be easily fixed by referring to the Unicode standard. The glyphs relevant for Romanian are summarized here. Alternate forms (e.g. small caps) should work too, and kerning pairs should be added for the accented glyphs as well. Kerning can generally be copied from the non-accented glyphs because Romanian diacritics are centered on the optical axis of the glyph.
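As a rough illustration of the coverage check described above, here is a minimal Python sketch. The code point list is taken from the column headers of the status matrix on this page; the `covered` set is a made-up example and is not read from any real font file.

```python
import unicodedata

# Code points relevant for Romanian, following the status matrix columns.
ROMANIAN_CODE_POINTS = [
    0x0102, 0x0103,  # A with breve
    0x00C2, 0x00E2,  # A with circumflex
    0x00CE, 0x00EE,  # I with circumflex
    0x0218, 0x0219,  # S with comma below
    0x021A, 0x021B,  # T with comma below
    0x015E, 0x015F,  # S with cedilla (legacy texts)
    0x0162, 0x0163,  # T with cedilla (legacy texts)
    0x201E, 0x201D,  # low and high double quotation marks
    0x00AB, 0x00BB,  # guillemets
]

def missing_romanian(covered):
    """Return (code point, Unicode name) pairs that are absent from `covered`."""
    return [(cp, unicodedata.name(chr(cp)))
            for cp in ROMANIAN_CODE_POINTS if cp not in covered]

# Hypothetical font that covers everything except the comma-below T pair.
covered = set(ROMANIAN_CODE_POINTS) - {0x021A, 0x021B}
for cp, name in missing_romanian(covered):
    print(f"U+{cp:04X} {name}")
```

In practice the `covered` set would come from the font's Unicode map, for instance as reported by a font inspection tool.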

The Unicode map for Type 1 fonts needs to alias U+021A/B to U+0162/3

PostScript Type 1 (PS1) fonts don't have a native Unicode map. Contrary to popular belief, Type 1 fonts can store more than 256 glyphs in a pfb file, but these can only be addressed by AGL (Adobe Glyph List) name. At most 256 glyphs can be accessed by a numeric index, for which various encoding schemes exist. A PS1 font can even specify its own 8-bit encoding scheme in the afm file; this is common practice for PS1 fonts targeting Central and Eastern Europe. The 8-bit encoding scheme is, however, irrelevant for Unicode applications. Unicode-enabled libraries, like freetype, define their own mapping from Unicode to AGL names, normally using the list published by Adobe.

Adobe once decided that "t with cedilla" is not used in any language, so the AGL name "Tcommaaccent", which is a glyph of T with a comma below, is actually mapped by Adobe to the Unicode code point U+0162, which is supposed to represent a t with cedilla. New OpenType fonts from Adobe also contain a glyph with the AGL name "uni021A", which is visually identical to "Tcommaaccent". As you'd expect, "uni021A" is mapped to U+021A. Unfortunately, old PS1 fonts do not have a "uni021A" in their pfb. Thus, using the Adobe-provided AGL to Unicode mapping for PS1 fonts, the code point U+021A remains unmapped. Fontconfig will therefore choose to borrow the glyph from another font, even though the glyph is present in the pfb. This problem is illustrated by the following OpenOffice screenshot. Practically all PostScript Type 1 fonts that ship with Fedora suffer from this problem.
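The effect can be sketched in a few lines of Python. This is only an illustration of name-based mapping, not the code of freetype or fontconfig: `AGL_EXCERPT` holds just the two entries discussed here, `build_unicode_map` is a hypothetical helper, and "uni021B" is assumed as the lowercase counterpart of "uni021A".

```python
import re

# Tiny excerpt of the Adobe glyph list: comma-below glyph names that
# Adobe maps to the cedilla code points.
AGL_EXCERPT = {
    "Tcommaaccent": 0x0162,
    "tcommaaccent": 0x0163,
}

# Names of the form uniXXXX map directly to code point XXXX.
UNI_NAME = re.compile(r"uni([0-9A-F]{4})")

def build_unicode_map(glyph_names):
    """Derive a code point -> glyph name map from glyph names alone."""
    cmap = {}
    for name in glyph_names:
        match = UNI_NAME.fullmatch(name)
        if match:
            cmap[int(match.group(1), 16)] = name
        elif name in AGL_EXCERPT:
            cmap[AGL_EXCERPT[name]] = name
    return cmap

# An old PS1 font only has the AGL-named glyphs.
old_ps1 = build_unicode_map(["Tcommaaccent", "tcommaaccent"])
# A newer OpenType font also carries the uniXXXX duplicates.
new_otf = build_unicode_map(["Tcommaaccent", "tcommaaccent", "uni021A", "uni021B"])

print(0x021A in old_ps1)  # False: U+021A stays unmapped
print(0x021A in new_otf)  # True
```

The old font has a perfectly good comma-below T, but nothing in the name-based map connects it to U+021A.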

Microsoft's Uniscribe automatically handles this issue by remapping U+021A/B to U+0162/3 when the former glyphs are missing. Unfortunately, the Pango/fontconfig/freetype stack doesn't do this, so most new Romanian documents cannot be displayed properly with Type 1 fonts.

Proposed solution: adopt the Uniscribe approach; editing old PS1 fonts is a pointless exercise. With a corrected external mapping, all PS1 fonts, even the commercial ones, would instantly become usable in Fedora.
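A minimal sketch of such a fallback lookup, assuming the font's Unicode map is available as a plain dictionary from code point to glyph name. The names `FALLBACK` and `lookup_glyph` are hypothetical and do not correspond to any existing library API.

```python
# Comma-below code points and the cedilla code points to try when the
# former are unmapped.
FALLBACK = {0x021A: 0x0162, 0x021B: 0x0163}

def lookup_glyph(cmap, code_point):
    """Return the glyph name for code_point, trying the alias if unmapped."""
    if code_point in cmap:
        return cmap[code_point]
    alias = FALLBACK.get(code_point)
    if alias is not None:
        return cmap.get(alias)
    return None

# Unicode map of a typical old PS1 font, as derived from the Adobe list.
ps1_cmap = {0x0162: "Tcommaaccent", 0x0163: "tcommaaccent"}
print(lookup_glyph(ps1_cmap, 0x021B))  # tcommaaccent
```

Because the alias is consulted only when the requested code point is unmapped, fonts that already provide proper comma-below glyphs are unaffected.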

Missing glyph localization in OpenType fonts (GSUB/latn/ROM/{locl,ccmp})

The Adobe industry standard seems to be that activating ROM/locl should map "s with cedilla" to "s with comma". Since in Adobe fonts U+0162/3 is by default mapped to "t with comma", activating this optional mapping for s renders old, pre-Unicode 3.0 Romanian texts with comma below both s and t. Fonts that have a proper "t with cedilla" glyph should also remap it to the comma-below variant.

The OpenType SIL fonts take a slightly different approach, but work with Pango nonetheless. First, they have proper cedilla variants for both s and t. Second, they don't have a ROM/locl feature, but a ROM/ccmp feature which remaps both cedilla variants to comma-below counterparts. I'm not sure this approach is entirely correct because the OpenType spec on ccmp says that ccmp should not be language sensitive. YMMV, I'm no expert on this.
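The intended result of these features can be mimicked at the character level in Python. This is only an analogy: real locl and ccmp lookups live in the font's GSUB table and substitute glyphs, not characters, and the function name and the `language` parameter below are made up for the illustration.

```python
# Cedilla code points and their comma-below counterparts.
CEDILLA_TO_COMMA = str.maketrans({
    "\u015E": "\u0218",  # capital S: cedilla -> comma below
    "\u015F": "\u0219",  # small s
    "\u0162": "\u021A",  # capital T
    "\u0163": "\u021B",  # small t
})

def romanian_localize(text, language="ro"):
    """Substitute comma-below forms, but only for text tagged as Romanian."""
    if language != "ro":
        return text
    return text.translate(CEDILLA_TO_COMMA)

legacy = "\u015Ftiin\u0163\u0103"            # old text typed with cedillas
print(romanian_localize(legacy))             # shown with comma-below forms
print(romanian_localize("Te\u015Fekk\u00FCr", "tr"))  # Turkish keeps its cedilla
```

The Turkish example shows why language sensitivity matters here: the same s with cedilla must stay untouched outside Romanian text.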

Font status matrix

For the sake of keeping the table compact, the table entries are abbreviated: y=yes, n=no, s=soon, a=approximating glyph (see the discussion above for the Adobe standard). You can click on the font name to be taken further down the page for details (if available). Fonts not listed here have very poor support, i.e. most glyphs are missing and usually they do not target Latin scripts. If you think a font should support Romanian, please add it to the table.

Support for Romanian in Fedora fonts
Font name type ă î â ș ț ş ţ „” «» locl Package Notes
DejaVu TTF y y y y y y y y y y y dejavu-fonts locl OK in 2.26 except ExtraLight
Liberation TTF y y y y y y y y y y n liberation-fonts Ș & Ț OK in 1.04 (F9 updates)
Latin Modern CFF y y y y y y y y y y y texlive-texmf-fonts not in F9 fontconfig database
TeX Gyre CFF y y y y y y y y y y y texlive-texmf-fonts idem, and some legal trouble
Linux Libertine TTF y y y y y y y y y y s linux-libertine-fonts Small caps Ș & Ț fixed in 3.0
Charis SIL TTF y y y y y y y y y y y charis-fonts Uses ccmp for locl
Doulos SIL TTF y y y y y y y y y y y doulos-fonts Uses ccmp for locl
Gentium SIL TTF y y y y y y y y y y n gentium-fonts -
Antykwa Torunska CFF y y y y n y a y y y y texlive-texmf-fonts not in F9 fontconfig database
STIX TTF y y y n n y y y y y n stix-fonts -
MathML TTF n n n n n n n n n n n mathml-fonts -
URW PostScript™ PS1 y y y y n y a y y y - urw-fonts freetype bug #23940
Terminus RAS y y y y y y y y y y - terminus-font-* For console and X11
X.Org raster RAS y y y y y y y y y y - xorg-x11-fonts-* Except Charter and Courier 10

Key for font type:

  • TTF = old TrueType or TrueType-flavored OpenType (which is backwards compatible with TrueType) in a .ttf file
  • CFF = OpenType CFF (PostScript Type 2) in an .otf file
  • PS1 = PostScript Type 1 in .afm/.pfb file pairs.
  • RAS = Some raster format, e.g. PCF.

Font status details:

DejaVu

Version 2.26 of the DejaVu fonts is not yet packaged for Fedora 9. You can get it from Rawhide by typing yum --enablerepo=rawhide update dejavu-fonts.

The discussion on the locl bug #455981 may be of interest to other font designers.

Liberation

The comma-below glyphs were added in version 1.04 of the fonts, which is included in F9 updates as of July 27. Older versions lacked both comma-below code points and followed the Adobe convention, having a t with comma at U+0162/3. In 1.04 a proper "t cedilla" is provided.

TeX Gyre

These are OpenType CFF conversions of the URW PostScript fonts. Only Bonum, Pagella, Schola and Termes are installed in Fedora 9.

Linux Libertine

  • The (higher quality) CFF versions of the fonts are not packaged. Bug #455995.
  • Small caps versions are missing for S and T with comma below. Bug #2022566. Fixed upstream in version 3.0.
  • Feature request #2022572 for locl. A patch is available on that page. A patched version of the CFF regular font is here.

X.Org raster fonts

The raster fonts that support iso10646 (Unicode) encoding are all OK for Romanian, except Bitstream Charter and Bitstream Courier 10 Pitch, which lack all accented glyphs for Romanian. The matching Type 1 fonts from X.Org, Bitstream Charter and Courier 10, are broken in the same way. Here is the complete list of fonts usable for Romanian:

  • clean
  • clearlyu
  • courier (not 10 pitch!)
  • fixed
  • helvetica
  • lucida
  • lucidabright
  • lucidatypewriter
  • new century schoolbook
  • times
  • utopia

MathML

These are conversions of Computer Modern fonts (from TeX in OT-1 encoding) poorly pretending to have a Unicode TTF map. Hopefully they'll get replaced by the Unicode CM font effort.

Console Fonts

The default Linux console fonts also lack the U+219-B range, but few care about the console fonts these days...

Library and application support for OpenType ROM/locl

Most of this section applies to locl support in general, but tests have only been carried out for Romanian. In the table "can disable" means that the user can turn off locl without changing the language.

Support for ROM/locl in Fedora software
Software | Can use | Default | Can disable | Notes
Pango | yes | on | no | See here for usage tips.
XeTeX | yes | on | yes | See fontspec docs p. 25-27 for usage tips.
OpenOffice.org | no | - | - | No OpenType CFF support either.

Some additional observations:

  • Pango does not allow an application to set the OT features language independently from UI language; not for Romanian anyway. Currently, the only way to run, say, gedit with ROM/locl enabled but the default English UI is: LANG=ro_RO.UTF-8 LC_MESSAGES=C gedit. See Pango bug #442786.
  • OpenOffice.org doesn't use Pango but its own ICU renderer. It looks like a CTL (Complex Text Layout) module has to be written in order to enable use of OpenType font features for a specific script. It's not clear if the CTL can be per language, but a Latin CTL could detect Romanian.
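For launching applications from a script, the Pango workaround above amounts to adjusting two environment variables. A small Python sketch, where `romanian_locl_env` is a hypothetical helper name:

```python
import os

def romanian_locl_env(base=None):
    """Copy an environment, selecting a Romanian locale with untranslated UI."""
    env = dict(os.environ if base is None else base)
    env["LANG"] = "ro_RO.UTF-8"   # Pango picks up Romanian from here
    env["LC_MESSAGES"] = "C"      # keep the UI messages in default English
    return env

env = romanian_locl_env({"PATH": "/usr/bin"})
print(env["LANG"], env["LC_MESSAGES"])

# To start an editor this way (requires the subprocess module and gedit):
#   subprocess.run(["gedit"], env=romanian_locl_env())
```

Note that an LC_ALL value already present in the environment would override both settings.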



Fonts in Fedora
The Fonts SIG takes loving care of Fedora fonts. Please join this special interest group if you are interested in creating, improving, packaging, or just suggesting a font. Any help will be appreciated.