From Fedora Project Wiki

Revision as of 09:51, 14 October 2009

Linguistic Components

1. Language Support Matrix (glibc upwards)

Language Code Language hunspell hyphen mythes notes
aa Afar afarfriends.org hosted ALSEC report.
af Afrikaans hunspell-af hyphen-af
am Amharic hunspell-am
an Aragonese www.iea.es, see Spain: Lexicography In Iberian Languages
ar Arabic hunspell-ar experimental thesaurus
as Assamese hunspell-as hyphen-as xobdo is another potential source, possibly even for a thesaurus, but apparently this isn't an option at the moment.
ast Asturian crubadan corpus building www.academiadelallingua.com, see Spain: Lexicography In Iberian Languages. Asturian translation team
az Azeri (Latin) hunspell-az
be Belarusian hunspell-be hyphen-be
ber Amazigh (Tifinagh) hunspell-ber
ber Amazigh (Latin)
bg Bulgarian hunspell-bg hyphen-bg mythes-bg
bn Bengali hunspell-bn hyphen-bn
bo Tibetan bo.openoffice.org. Latest language support update.
br Breton hunspell-br
bs Bosnian hunspell-bs hyphen-bs
byn Blin Blin Orthography: A History and an Assessment
ca Catalan hunspell-ca hyphen-ca mythes-ca
crh Crimean Tatar A corpus translation team
cs Czech hunspell-cs hyphen-cs mythes-cs
csb Kashubian hunspell-csb
cy Welsh hunspell-cy hyphen-cy
da Danish hunspell-da hyphen-da mythes-da
de German hunspell-de hyphen-de mythes-de
dv Dhivehi wordlist English-Dhivehi dictionary

dz Dzongkha crubadan corpus building Some requests for help/info.
el Greek hunspell-el hyphen-el mythes-el
en English hunspell-en hyphen-en mythes-en
es Spanish hunspell-es hyphen-es mythes-es
et Estonian hunspell-et hyphen-et
eu Basque hunspell-eu hyphen-eu
fa Farsi hunspell-fa hyphen-fa
fi Finnish The Finnish community has a parallel Voikko solution, with an enchant backend, an OpenOffice.org extension, and a Firefox extension.
fil Filipino hunspell-tl Filipino is effectively an official Tagalog-based language
fo Faeroese hunspell-fo hyphen-fo
fr French hunspell-fr hyphen-fr mythes-fr
fur Friulian hunspell-fur
fy Frisian hunspell-fy
ga Irish hunspell-ga hyphen-ga mythes-ga
gd Scots Gaelic hunspell-gd
gez Ge'ez Ge'ez Frontier Foundation
gl Galician hunspell-gl hyphen-gl
gu Gujarati hunspell-gu hyphen-gu
gv Manx hunspell-gv
ha Hausa crubadan possible wordlist www.dictionary.kasahorow.com
he Hebrew hunspell-he info on hyphenation
hi Hindi hunspell-hi hyphen-hi Hindi Wordnet is likely convertible; it claims to use a format similar to that of English WordNet, which is the basis of mythes-en
hne Chhattisgarhi corpus building
hr Croatian hunspell-hr hyphen-hr This hasn't been updated in a number of years. On a purely orthographical basis, I wonder if dict-sr would provide a better option.
hsb Upper Sorbian hunspell-hsb hyphen-hsb
ht Haitian Creole corpus building
hu Hungarian hunspell-hu hyphen-hu mythes-hu
hy Armenian hunspell-hy
id Indonesian hunspell-id hyphen-id
ig Igbo crubadan corpus building www.dictionary.kasahorow.com
ik Inupiaq Broken download link to MSWord dictionary Iñupiaq parser project
is Icelandic hunspell-is hyphen-is
it Italian hunspell-it hyphen-it mythes-it
iu Inuktitut www.livingdictionary.com
ja Japanese
ka Georgian Crubadan is aware of 29023 words ka.openoffice.org Some info on spellchecking the language.
kk Kazakh hunspell-kk
kl Kalaallisut Greenlandic parser project. MSWord checker.
km Khmer hunspell-km
kn Kannada hunspell-kn hyphen-kn
ko Korean hunspell-ko
ks Kashmiri online dictionary
ku Kurdish (Latin) hunspell-ku hyphen-ku
ku Kurdish (Arabic) some info
kw Cornish crubadan corpus building
ky Kyrgyz hunspell-ky OOo localization beginnings. Orthography news
lg Luganda crubadan corpus building A general translation effort. An online dictionary
li Limburgish crubadan corpus building
lo Lao Lao OOo localization
lt Lithuanian hunspell-lt hyphen-lt
lv Latvian hunspell-lv hyphen-lv mythes-lv
mai Maithili maithiliacademy.org
mg Malagasy hunspell-mg
mi Maori hunspell-mi hyphen-mi mythes-mi
mk Macedonian hunspell-mk convertible
ml Malayalam hunspell-ml hyphen-ml
mn Mongolian hunspell-mn hyphen-mn
mr Marathi hunspell-mr hyphen-mr
ms Malay hunspell-ms no content, but a project announcement for Malaysian thesaurus etc.
mt Maltese hunspell-mt
nan Min Nan online dictionary? Debian wiki notes

nb Bokmaal hunspell-nb hyphen-nb mythes-nb
nds Lowlands Saxon hunspell-nds
ne Nepali hunspell-ne available
nl Dutch hunspell-nl hyphen-nl mythes-nl
nn Nynorsk hunspell-nn hyphen-nn mythes-nn
nr Ndebele (Southern) hunspell-nr
nso Sotho (Northern) hunspell-nso
oc Occitan hunspell-oc
om Oromo rhbz#522482 Oromo details
or Oriya hunspell-or hyphen-or
pa Punjabi hunspell-pa hyphen-pa
pap Papiamentu/Papiamento Papiamentu work in progress The supported glibc locale is pap_AN. Spelling rules differ between the Papiamentu and Papiamento groupings. Papiamentu: Curaçao and Bonaire, current members of the Netherlands Antilles, territory code AN. Papiamento: Aruba (former member of the Netherlands Antilles), territory code AW; crubadan Papiamento corpus building.
pl Polish hunspell-pl hyphen-pl mythes-pl
pt Portuguese hunspell-pt hyphen-pt mythes-pt
ro Romanian hunspell-ro hyphen-ro mythes-ro
ru Russian hunspell-ru hyphen-ru mythes-ru
rw Kinyarwanda hunspell-rw
sa Sanskrit An apparent effort to create a Sanskrit hunspell dictionary hyphen-sa
sc Sardinian hunspell-sc
sd Sindhi online dictionary
se Sammi, Northern hunspell-se watch this space
shs Secwepemctsin hunspell-shs Secwepemctsín word bank work in progress. Note that it's trivial to create a simple wordlist-based hunspell dictionary, e.g. with wordlist2hunspell.
si Sinhala A very small wordlist
sid Sidamo Some info
sk Slovak hunspell-sk hyphen-sk mythes-sk
sl Slovenian hunspell-sl hyphen-sl mythes-sl
so Somali hunspell-so
sq Albanian hunspell-sq
sr Serbian hunspell-sr hyphen-sr
ss Swati hunspell-ss
st Sotho (Southern) hunspell-st
sv Swedish hunspell-sv hyphen-sv mythes-sv
ta Tamil hunspell-ta hyphen-ta
te Telugu hunspell-te hyphen-te
tg Tajik An apparent effort to create a Tajik hunspell dictionary
th Thai hunspell-th
ti Tigrigna hunspell-ti
tig Tigre crubadan corpus building
tk Turkmen hunspell-tk
tl Tagalog hunspell-tl
tn Tswana hunspell-tn
tr Turkish available, but, as with Finnish and Voikko, the typical solution for Turkish has been the Zemberek library, with an enchant backend, an OpenOffice.org extension, and a Firefox extension
ts Tsonga hunspell-ts
tt Tatar available, but it is difficult to see where this came from originally and exactly what license it is under (GPLv2+?). Perhaps it is an original work of ALT Linux and that actually is the canonical upstream?
ug Uyghur www.uyghurdictionary.org
uk Ukrainian hunspell-uk hyphen-uk mythes-uk
ur Urdu hunspell-ur
uz Uzbek hunspell-uz
ve Venda hunspell-ve
vi Vietnamese hunspell-vi
wa Walloon hunspell-wa
wo Wolof www.alfanet.anafa.org makes Wolof localizations of Firefox and AbiWord. www.dictionary.kasahorow.com
xh Xhosa hunspell-xh
yi Yiddish available spell-checker
yo Yoruba An apparent effort to create a Yoruba hunspell dictionary www.dictionary.kasahorow.com
zh Chinese Would these (convertible) TeX rules be universally meaningful for Chinese text?
zu Zulu hunspell-zu hyphen-zu
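
The Secwepemctsin row above notes that a simple wordlist-based hunspell dictionary is trivial to create. As a rough illustration of why, here is a minimal Python sketch. It is not the wordlist2hunspell tool itself; the function name `write_wordlist_dictionary` and the placeholder language code are hypothetical. It assumes a plain list of words with no affix rules: the .dic file is a word count followed by one word per line, and the .aff file only declares the encoding.

```python
from pathlib import Path


def write_wordlist_dictionary(words, out_dir, lang_code):
    """Write minimal <lang_code>.dic and <lang_code>.aff files from a word list.

    Hypothetical helper for illustration only; it does not handle affix
    flags, so words containing "/" would need escaping before use.
    """
    # Drop blank entries and duplicates; sorting keeps the output stable.
    unique_words = sorted({w.strip() for w in words if w.strip()})

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # The .dic file starts with the number of entries, then one word per line.
    dic_path = out_dir / f"{lang_code}.dic"
    dic_path.write_text(
        "\n".join([str(len(unique_words)), *unique_words]) + "\n",
        encoding="utf-8",
    )

    # With no affix rules, the .aff file only needs to declare the encoding.
    aff_path = out_dir / f"{lang_code}.aff"
    aff_path.write_text("SET UTF-8\n", encoding="utf-8")

    return dic_path, aff_path
```

Calling `write_wordlist_dictionary(open("words.txt", encoding="utf-8"), "out", "xx")` would produce `out/xx.dic` and `out/xx.aff`, where `words.txt` and `xx` are placeholders. A dictionary built this way only recognizes the exact forms in the list, so it is a starting point and not a substitute for real affix rules.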


2. Language Support Matrix (extra OOo recognized not in glibc)

Language Code Language hunspell hyphen mythes notes
ak Akan hunspell-ak www.dictionary.kasahorow.com
az Azeri (Cyrillic) transliteration table
bm Bambara Online Dictionary
brx Bodo xobdo is a potential source, but apparently this isn't an option at the moment. Another Online Dictionary
cop Coptic hunspell-cop experimental convertible TeX rules
cv Chuvash hunspell-cv
dgo Dogri Central Institute for Indian Languages
dsb Lower Sorbian rhbz#528889
ee Ewe online dictionary
eo Esperanto hunspell-eo needs more love to be convertible
fj Fijian hunspell-fj
grc Ancient Greek rhbz#528893 rhbz#528904
gsc Gascon Non-Commercial BY-NC-ND license
gug Guarani crubadan corpus building
hil Hiligaynon hunspell-hil
ia Interlingua hunspell-ia hyphen-ia
kok Konkani [http://www.savemylanguage.org/ online dictionary]
la Latin hunspell-la hyphen-la
lb Luxembourgish available, but the EUPL v1.0 is on our licence list as unacceptable for Fedora.
ln Lingala hunspell-ln
mos Mossi hunspell-mos info. dictionary effort (hunspell has no problem with utf-8 .dic files FWIW)
mni Manipuri some info
my Burmese online dictionary
ny Nyanja hunspell-ny
qu Quechua Ecuador available
quh Quechua South Bolivia hunspell-quh
qul Quechua North Bolivia current effort
rm Raeto-Romance/Romansh Things are a bit messy, as there's a group of R[h]aeto-Romance languages, but SIL maps the ISO 639-1 rm to ISO 639-3 roh, and Ethnologue documents the Swiss Official Orthography for roh as Rumantsch Grischun, so that's the probable best fit for this. Dicziunari Rumantsch Grischun
sat Santali English<->Santali dictionaries online dictionary
sg Sango www.dictionary.kasahorow.com
sjd Sammi, Kildin Northern Sammi
sma Sammi, Southern Northern Sammi
smj Sammi, Lule hunspell-smj watch this space Northern Sammi
smn Sammi, Inari Northern Sammi
sms Sammi, Skolt Northern Sammi
sw Swahili hunspell-sw
tet Tetum hunspell-tet
tpi Tok Pisin crubadan corpus building

3. Obsolete/Useless codes (glibc)

Language Code Language notes
iw Hebrew Obsoleted by he
no Norwegian Effectively obsoleted by nb
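
A tool that looks up linguistic packages by language code could fold these obsolete codes into their replacements first. The following Python sketch assumes only the two mappings listed in the table above; the names `OBSOLETE_CODES` and `normalize_language_code` are hypothetical, and treating no as nb follows the "effectively obsoleted" note, not a formal equivalence.

```python
# Replacements taken from the table above; every other code passes through.
OBSOLETE_CODES = {
    "iw": "he",  # Hebrew: obsoleted by he
    "no": "nb",  # Norwegian: effectively obsoleted by nb
}


def normalize_language_code(code):
    """Return the current code for an obsolete one, or the code unchanged."""
    code = code.strip().lower()
    return OBSOLETE_CODES.get(code, code)
```

For example, `normalize_language_code("iw")` gives `"he"`, while a current code such as `"cy"` is returned as-is.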