Below we summarize some common problems and suggested ways to fix them. The impatient can jump to the status matrix.
Types of problems encountered
Missing glyphs in Unicode-encoded fonts
These can be easily fixed by referring to the Unicode standard. The glyphs relevant for Romanian are summarized here. Alternate forms (e.g. small caps) should work too, and kerning pairs should be added for the accented glyphs as well. Kerning can generally be copied from the non-accented glyphs because Romanian diacritics are centered on the optical axis of the glyph.
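As a rough illustration of the check described above, here is a minimal Python sketch that reports which Romanian letters a font lacks. The list of code points is our own enumeration of the Romanian-specific letters, and `example_coverage` is a made-up coverage set; in practice the set would come from whatever tool you use to read the font's Unicode map.

```python
import unicodedata

# Code points of the Romanian-specific letters (upper and lower case).
ROMANIAN_CODE_POINTS = (
    0x0102, 0x0103,  # A with breve
    0x00C2, 0x00E2,  # A with circumflex
    0x00CE, 0x00EE,  # I with circumflex
    0x0218, 0x0219,  # S with comma below
    0x021A, 0x021B,  # T with comma below
)


def missing_romanian(covered):
    """Return the Romanian code points absent from `covered` (a set of ints)."""
    return [cp for cp in ROMANIAN_CODE_POINTS if cp not in covered]


def describe(code_points):
    """Format code points as 'U+XXXX NAME' strings for a report."""
    return [f"U+{cp:04X} {unicodedata.name(chr(cp))}" for cp in code_points]


# Hypothetical coverage: a font that stops at the end of Latin Extended-A.
example_coverage = set(range(0x0020, 0x0180))

for line in describe(missing_romanian(example_coverage)):
    print(line)
```

For this made-up coverage the report lists only the four comma-below letters, which is the typical failure mode discussed in the rest of this page.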
The Unicode map for Type 1 fonts needs to alias U+021A/B to U+0162/3
PostScript Type 1 (PS1) fonts don't have a native Unicode map. Contrary to popular belief, Type 1 fonts can store more than 256 glyphs in a pfb file, but these can only be addressed by AGL name. At most 256 glyphs can be accessed by a numeric index, for which various encoding schemes exist. A PS1 font can even specify its own 8-bit encoding scheme in the afm file; this is common practice for PS1 fonts targeting Central and Eastern Europe. The 8-bit encoding scheme is, however, irrelevant for Unicode applications. Unicode-enabled libraries, like freetype, define their own mapping from Unicode to AGL names, normally using the list published by Adobe.
Adobe once decided that "t with cedilla" is not used in any language, so the AGL name "Tcommaaccent", which is a glyph of T with a comma below, is actually mapped by Adobe to the Unicode code point U+0162, which is supposed to represent a t with cedilla. New OpenType fonts from Adobe also contain a glyph with the AGL name "uni021A", which is visually identical to "Tcommaaccent". As you'd expect, "uni021A" is mapped to U+021A. Unfortunately, old PS1 fonts do not have a "uni021A" glyph in their pfb. Thus, using the Adobe-provided AGL to Unicode mapping for PS1 fonts, the code point U+021A remains unmapped. Fontconfig will therefore choose to borrow the glyph from another font, even though a suitable glyph is present in the pfb. This problem is illustrated by the following OpenOffice screenshot. Practically all PostScript Type 1 fonts that ship with Fedora suffer from this problem.
Microsoft's Uniscribe automatically handles this issue by remapping U+021A/B to U+0162/3 when the former glyphs are missing. Unfortunately, the Pango/fontconfig/freetype stack doesn't do this, so most new Romanian documents cannot be displayed properly with Type 1 fonts.
Proposed solution: adopt the Uniscribe solution; editing old PS1 fonts is a pointless exercise. With a corrected external mapping, all PS1 fonts, even the commercial ones, would instantly become usable in Fedora.
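The following Python sketch shows how a Unicode map derived purely from glyph names ends up with a hole at U+021A. It is an illustration only, not freetype's code: `AGL_EXCERPT` contains just the two entries discussed here (the full list is published by Adobe), and the glyph name lists are invented stand-ins for an old PS1 font and a newer OpenType font.

```python
import re

# Two entries from Adobe's glyph list, as described in the text.
# Illustrative excerpt only; consult the published list for real use.
AGL_EXCERPT = {
    "Tcommaaccent": 0x0162,
    "tcommaaccent": 0x0163,
}

# Glyph names of the form "uniXXXX" carry their code point directly.
UNI_NAME = re.compile(r"uni([0-9A-F]{4})")


def code_point_for(glyph_name):
    """Map a glyph name to a code point, or None if the name is unknown."""
    if glyph_name in AGL_EXCERPT:
        return AGL_EXCERPT[glyph_name]
    match = UNI_NAME.fullmatch(glyph_name)
    return int(match.group(1), 16) if match else None


def unicode_map(glyph_names):
    """Build a code point -> glyph name map from a font's glyph names."""
    cmap = {}
    for name in glyph_names:
        cp = code_point_for(name)
        if cp is not None:
            cmap.setdefault(cp, name)
    return cmap


# Hypothetical glyph inventories.
old_ps1 = unicode_map(["Tcommaaccent", "tcommaaccent"])
new_otf = unicode_map(["Tcommaaccent", "tcommaaccent", "uni021A", "uni021B"])

print(0x021A in old_ps1)  # False: the comma-below T is unreachable
print(0x021A in new_otf)  # True: reached through "uni021A"
```

The old font does contain a T with comma below, but it is only reachable through U+0162, so a request for U+021A finds nothing.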
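A minimal sketch of the proposed fallback, in Python for readability. This is not Uniscribe's or freetype's actual implementation; `FALLBACK_ALIASES`, `lookup_glyph` and the sample map are hypothetical names used only to show the lookup order: try the requested code point first, and only use the cedilla slot when the comma-below one is unmapped.

```python
# Comma-below code point -> cedilla code point to try when it is unmapped.
FALLBACK_ALIASES = {
    0x021A: 0x0162,  # capital T
    0x021B: 0x0163,  # small t
}


def lookup_glyph(cmap, code_point):
    """Return the glyph name for code_point, using the alias as a fallback.

    `cmap` maps code points to glyph names. A font that maps the
    comma-below code point itself is never affected by the alias.
    """
    if code_point in cmap:
        return cmap[code_point]
    alias = FALLBACK_ALIASES.get(code_point)
    if alias is not None:
        return cmap.get(alias)
    return None


# Hypothetical Unicode map of an old PS1 font.
ps1_cmap = {0x0162: "Tcommaaccent", 0x0163: "tcommaaccent"}

print(lookup_glyph(ps1_cmap, 0x021B))  # tcommaaccent
```

Because the alias is consulted only on a miss, fonts that already provide distinct cedilla and comma-below glyphs keep rendering each code point with its own glyph.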
Missing glyph localization in OpenType fonts
The Adobe industry standard seems to be that activating ROM/locl should map "s with cedilla" to "s with comma". Since U+0162/3 is mapped to "t with comma" by default in Adobe fonts, activating this optional mapping for s renders old, pre-Unicode 3.0 Romanian texts with a comma below both s and t. Fonts that have a proper "t with cedilla" glyph should also remap it to the comma-below variant.
The OpenType SIL fonts take a slightly different approach, but work with Pango nonetheless. First, they have proper cedilla variants for both s and t. Second, they don't have a ROM/locl feature, but a ROM/ccmp feature which remaps both cedilla variants to comma-below counterparts. I'm not sure this approach is entirely correct because the OpenType spec on ccmp says that ccmp should not be language sensitive. YMMV, I'm no expert on this.
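To make the intended effect concrete, here is a small Python simulation. It works on characters, whereas a real shaper applies the font's substitution to glyphs, so treat it purely as a model of the outcome. The function name and table are hypothetical; the table covers both s and t, which corresponds to a font that has proper cedilla glyphs for both letters.

```python
# Cedilla code point -> comma-below code point.
CEDILLA_TO_COMMA = {
    0x015E: 0x0218,  # capital S
    0x015F: 0x0219,  # small s
    0x0162: 0x021A,  # capital T
    0x0163: 0x021B,  # small t
}


def apply_romanian_locl(text, language):
    """Model the substitution: active only for the Romanian language tag."""
    if language != "ROM":
        return text
    return text.translate(CEDILLA_TO_COMMA)


old_text = "\u015ftiin\u0163\u0103"  # word typed with cedilla code points

print(apply_romanian_locl(old_text, "ROM") == "\u0219tiin\u021b\u0103")  # True
print(apply_romanian_locl(old_text, "TRK") == old_text)                  # True
```

The second call shows why the substitution must depend on the language: text in other languages that legitimately use the cedilla forms is left alone.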
Font status matrix
For the sake of keeping the table compact, the table entries are abbreviated: y=yes, n=no, s=soon, a=approximating glyph (see the discussion above for the Adobe standard). You can click on the font name to be taken further down the page for details (if available). Fonts not listed here have very poor support, i.e. most glyphs are missing and usually they do not target Latin scripts. If you think a font should support Romanian, please add it to the table.
|Liberation||TTF||y||y||y||y||y||y||y||y||y||y||n||liberation-fonts||Ș & Ț OK in 1.04 (F9 updates)|
|Latin Modern||CFF||y||y||y||y||y||y||y||y||y||y||y||texlive-texmf-fonts||not in F9 fontconfig database|
|TeX Gyre||CFF||y||y||y||y||y||y||y||y||y||y||y||texlive-texmf-fonts||idem, and some legal trouble|
|Charis SIL||TTF||y||y||y||y||y||y||y||y||y||y||y||charis-fonts||Uses |
|Doulos SIL||TTF||y||y||y||y||y||y||y||y||y||y||y||doulos-fonts||Uses |
|Antykwa Toruńska||CFF||y||y||y||y||n||y||a||y||y||y||y||texlive-texmf-fonts||not in F9 fontconfig database|
|URW PostScript™||PS1||y||y||y||y||s||y||a||y||y||y||-||urw-fonts||ț fixed in freetype CVS #23940|
|Terminus||RAS||y||y||y||y||y||y||y||y||y||y||-||terminus-font-*||For console and X11|
|X.Org raster||RAS||y||y||y||y||y||y||y||y||y||y||-||xorg-x11-fonts-*||Except Charter and Courier 10|
Key for font type:
- TTF = old TrueType or TrueType-flavored OpenType (which is backwards compatible with TrueType) in a .ttf file.
- CFF = OpenType CFF (PostScript Type 2) in an .otf file.
- PS1 = PostScript Type 1 in .afm/.pfb file pairs.
- RAS = Some raster format, e.g. PCF.
Font status details:
Version 2.26 of the DejaVu fonts is not yet packaged for Fedora 9. You can get it from Rawhide by typing:
yum --enablerepo=rawhide update dejavu-fonts
The discussion on the locl bug #455981 may be of interest to other font designers.
The comma-below glyphs were added in version 1.04 of the fonts, which is included in F9 updates as of July 27. Older versions lacked both comma-below code points and followed the Adobe convention of having a t with comma at U+0162/3. In 1.04 a proper "t cedilla" is provided.
These are OpenType CFF conversions of the URW PostScript fonts. Only Bonum, Pagella, Schola and Termes are installed in Fedora 9.
- The (higher quality) CFF versions of the fonts are not packaged. Bug #455995.
- Small caps versions are missing for S and T with comma below. Bug #2022566. Fixed upstream in version 3.0.
- Feature request #2022572 for locl. A patch is available on that page. A patched version of the CFF regular font is here.
X.Org raster fonts
The raster fonts that support iso10646 (Unicode) encoding are all OK for Romanian, except Bitstream Charter and Bitstream Courier 10 Pitch, which lack all accented glyphs for Romanian. The matching Type 1 fonts from X.Org, Bitstream Charter and Courier 10, are broken in the same way. Here is the complete list of usable fonts for Romanian:
- courier (not 10 pitch!)
- new century schoolbook
These are conversions of Computer Modern fonts (from TeX in OT1 encoding) poorly pretending to have a Unicode TTF map. Hopefully they'll get replaced by the Unicode CM font effort.
The default Linux console fonts also lack the U+0218-021B range, but few care about the console fonts these days...
Library and application support for OpenType
Most of this section applies to locl support in general, but tests have only been carried out for Romanian. In the table "can disable" means that the user can turn off locl without changing the language.
|Software||can use||default||can disable||Notes|
|Pango||yes||on||no||See here for usage tips.|
|XeTeX||yes||on||yes||See fontspec docs p. 25-27 for usage tips.|
|OpenOffice.org||no||-||-||No OpenType CFF support either. #458476|
Some additional observations:
- Pango does not allow an application to set the OT features language independently from the UI language; not for Romanian anyway. Currently, the only way to run, say, gedit with ROM/locl enabled but with the default English UI is:
LANG=ro_RO.UTF-8 LC_MESSAGES=C gedit
See Pango bug #442786.
- OpenOffice.org doesn't use Pango but its own ICU renderer. It looks like a CTL (Complex Text Layout) module has to be written in order to enable use of OpenType font features for a specific script. It's not clear if the CTL can be per language, but a Latin CTL could detect Romanian.
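The environment workaround from the first observation above can be wrapped in a small launcher helper. This is a sketch under assumptions: `locl_environment` is a hypothetical helper name, and it removes `LC_ALL` because that variable, when set, takes precedence over both `LANG` and `LC_MESSAGES` and would defeat the trick.

```python
import os


def locl_environment(base_env, layout_locale="ro_RO.UTF-8", ui_locale="C"):
    """Return a copy of base_env with a Romanian locale but untranslated UI."""
    env = dict(base_env)
    env.pop("LC_ALL", None)          # would override the two settings below
    env["LANG"] = layout_locale      # locale from which the language is taken
    env["LC_MESSAGES"] = ui_locale   # keep the messages untranslated
    return env


env = locl_environment(os.environ)
print(env["LANG"], env["LC_MESSAGES"])

# To launch an application with it (not executed here):
#   import subprocess
#   subprocess.run(["gedit"], env=env)
```

The helper copies the environment rather than modifying `os.environ`, so only the launched application sees the changed locale.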