From Fedora Project Wiki

Linguistic Components

1. Language Support Matrix (glibc upwards)

Language Code Language hunspell hyphen mythes notes
aa Afar afarfriends.org hosted ALSEC report.
af Afrikaans hunspell-af hyphen-af
am Amharic hunspell-am
an Aragonese www.iea.es, see Spain: Lexicography In Iberian Languages
ar Arabic hunspell-ar experimental thesaurus
as Assamese hunspell-as hyphen-as xobdo is another potential source, possibly even for a thesaurus, but apparently this isn't an option at the moment.
ast Asturian crubadan corpus building www.academiadelallingua.com, see Spain: Lexicography In Iberian Languages. Asturian translation team
az Azeri (Latin) hunspell-az
be Belarusian hunspell-be hyphen-be
ber Amazigh (Tifinagh) hunspell-ber
ber Amazigh (Latin)
bg Bulgarian hunspell-bg hyphen-bg mythes-bg
bn Bengali hunspell-bn hyphen-bn
bo Tibetan bo.openoffice.org. Latest language support update.
br Breton hunspell-br
bs Bosnian hunspell-bs hyphen-bs
byn Blin Blin Orthography: A History and an Assessment
ca Catalan hunspell-ca hyphen-ca mythes-ca
crh Crimean Tatar A corpus translation team
cs Czech hunspell-cs hyphen-cs mythes-cs
csb Kashubian hunspell-csb
cy Welsh hunspell-cy hyphen-cy
da Danish hunspell-da hyphen-da mythes-da
de German hunspell-de hyphen-de mythes-de
dv Dhivehi wordlist English-Dhivehi dictionary
dz Dzongkha crubadan corpus building Some requests for help/info.
el Greek hunspell-el hyphen-el mythes-el
en English hunspell-en hyphen-en mythes-en
es Spanish hunspell-es hyphen-es mythes-es
et Estonian hunspell-et hyphen-et
eu Basque hunspell-eu hyphen-eu
fa Farsi hunspell-fa hyphen-fa
fi Finnish The Finnish community has a parallel Voikko solution, with an enchant backend, an OpenOffice.org extension, and a Firefox extension.
fil Filipino hunspell-tl Filipino is effectively an official Tagalog-based language
fo Faeroese hunspell-fo hyphen-fo
fr French hunspell-fr hyphen-fr mythes-fr
fur Friulian hunspell-fur
fy Frisian hunspell-fy
ga Irish hunspell-ga hyphen-ga mythes-ga
gd Scots Gaelic hunspell-gd
gez Ge'ez Ge'ez Frontier Foundation
gl Galician hunspell-gl hyphen-gl
gu Gujarati hunspell-gu hyphen-gu
gv Manx hunspell-gv
ha Hausa available but no License mentioned
he Hebrew hunspell-he info on hyphenation
hi Hindi hunspell-hi hyphen-hi Hindi Wordnet is likely convertible, claims to have similar format as English Wordnet, which is the basis of mythes-en
hne Chhattisgarhi corpus building
hr Croatian hunspell-hr hyphen-hr This hasn't been updated in a number of years; on a purely orthographic basis, dict-sr might provide a better option.
hsb Upper Sorbian hunspell-hsb hyphen-hsb
ht Haitian Creole hunspell-ht
hu Hungarian hunspell-hu hyphen-hu mythes-hu
hy Armenian hunspell-hy
id Indonesian hunspell-id hyphen-id
ig Igbo crubadan corpus building www.dictionary.kasahorow.com
ik Inupiaq Broken download link to MSWord dictionary Iñupiaq parser project
is Icelandic hunspell-is hyphen-is
it Italian hunspell-it hyphen-it mythes-it
iu Inuktitut www.livingdictionary.com
ja Japanese
ka Georgian Crubadan is aware of 29023 words ka.openoffice.org Some info on spellchecking the language.
kk Kazakh hunspell-kk
kl Kalaallisut Greenlandic parser project. MSWord checker.
km Khmer hunspell-km
kn Kannada hunspell-kn hyphen-kn
ko Korean hunspell-ko
ks Kashmiri online dictionary
ku Kurdish (Latin) hunspell-ku hyphen-ku
ku Kurdish (Arabic) some info
kw Cornish crubadan corpus building
ky Kyrgyz hunspell-ky OOo localization beginnings. Orthography news
lg Luganda crubadan corpus building A general translation effort. An online dictionary
li Limburgish crubadan corpus building
lo Lao Lao OOo localization
lt Lithuanian hunspell-lt hyphen-lt
lv Latvian hunspell-lv hyphen-lv mythes-lv
mai Maithili hunspell-mai maithiliacademy.org
mg Malagasy hunspell-mg
mi Maori hunspell-mi hyphen-mi mythes-mi
mk Macedonian hunspell-mk convertible
ml Malayalam hunspell-ml hyphen-ml
mn Mongolian hunspell-mn hyphen-mn
mr Marathi hunspell-mr hyphen-mr
ms Malay hunspell-ms no content, but a project announcement for Malaysian thesaurus etc.
mt Maltese hunspell-mt
nan Min Nan online dictionary? Debian wiki notes
nb Bokmaal hunspell-nb hyphen-nb mythes-nb
nds Lowlands Saxon hunspell-nds
ne Nepali hunspell-ne rhbz#574047
nl Dutch hunspell-nl hyphen-nl mythes-nl
nn Nynorsk hunspell-nn hyphen-nn mythes-nn
nr Ndebele (Southern) hunspell-nr
nso Sotho (Northern) hunspell-nso
oc Occitan hunspell-oc
om Oromo hunspell-om Oromo details
or Oriya hunspell-or hyphen-or
pa Punjabi hunspell-pa hyphen-pa
pap Papiamentu/Papiamento Papiamentu work in progress The supported glibc locale is pap_AN. Spelling rules differ between the Papiamentu and Papiamento groupings. Papiamentu: Curaçao and Bonaire, current members of the Netherlands Antilles, territory code AN. Papiamento: Aruba (former member of the Netherlands Antilles), territory code AW, crubadan Papiamento corpus building.
pl Polish hunspell-pl hyphen-pl mythes-pl
ps Pashto possible contact
pt Portuguese hunspell-pt hyphen-pt mythes-pt
ro Romanian hunspell-ro hyphen-ro mythes-ro
ru Russian hunspell-ru hyphen-ru mythes-ru
rw Kinyarwanda hunspell-rw
sa Sanskrit An apparent effort to create a Sanskrit hunspell dictionary hyphen-sa
sc Sardinian hunspell-sc
sd Sindhi online dictionary
se Sammi, Northern hunspell-se watch this space
shs Secwepemctsin hunspell-shs Secwepemctsín word bank work in progress. Note that it's trivial to create a simple wordlist-based hunspell dict, e.g. with wordlist2hunspell
si Sinhala hunspell-si Another very small wordlist
sid Sidamo Some info
sk Slovak hunspell-sk hyphen-sk mythes-sk
sl Slovenian hunspell-sl hyphen-sl mythes-sl
so Somali hunspell-so
sq Albanian hunspell-sq
sr Serbian hunspell-sr hyphen-sr
ss Swati hunspell-ss
st Sotho (Southern) hunspell-st
sv Swedish hunspell-sv hyphen-sv mythes-sv
ta Tamil hunspell-ta hyphen-ta
te Telugu hunspell-te hyphen-te
tg Tajik An apparent effort to create a Tajik hunspell dictionary
th Thai hunspell-th
ti Tigrigna hunspell-ti
tig Tigre crubadan corpus building
tk Turkmen hunspell-tk rhbz#574053
tl Tagalog hunspell-tl
tn Tswana hunspell-tn
tr Turkish available, but, as with Finnish and Voikko, the typical solution for Turkish has been the Zemberek library, with an enchant backend, an OpenOffice.org extension, and a Firefox extension
ts Tsonga hunspell-ts
tt Tatar available, but it is difficult to see where this came from originally and what license it is exactly, GPLv2+ (?). Perhaps it is an original work of ALT Linux, and that actually is the canonical upstream?
ug Uyghur www.uyghurdictionary.org www.uighur.jp
uk Ukrainian hunspell-uk hyphen-uk mythes-uk
ur Urdu hunspell-ur
uz Uzbek hunspell-uz
ve Venda hunspell-ve
vi Vietnamese hunspell-vi
wa Walloon hunspell-wa
wo Wolof www.alfanet.anafa.org make Wolof localizations of Firefox and Abiword. www.dictionary.kasahorow.com
xh Xhosa hunspell-xh
yi Yiddish available spell-checker
yo Yoruba Some apparent efforts (and older info) to create a Yoruba hunspell dictionary www.dictionary.kasahorow.com
zh Chinese Would these (convertible) TeX rules be universally meaningful for Chinese text?
zu Zulu hunspell-zu hyphen-zu
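
The Secwepemctsin row above notes that a simple wordlist-based hunspell dictionary is trivial to create. A minimal sketch of that idea follows. It is not the wordlist2hunspell tool itself; the function name wordlist_to_hunspell and the output file names are hypothetical. It relies only on the basic hunspell file layout: a .dic file that starts with an entry count followed by one word per line, and an .aff file that declares the encoding.

```python
from pathlib import Path


def wordlist_to_hunspell(words, out_dir, name):
    """Write <name>.dic and <name>.aff for a plain list of words.

    Hypothetical helper for illustration; no affix rules are generated,
    so every inflected form must appear in the wordlist itself.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    # Drop empty lines and duplicates, and sort for stable output.
    # Caveat: a "/" inside a word would be read by hunspell as the start
    # of affix flags, so such entries are skipped in this sketch.
    entries = sorted({w.strip() for w in words if w.strip() and "/" not in w})

    # The first line of a .dic file is the number of entries.
    dic_path = out / f"{name}.dic"
    dic_path.write_text(
        "\n".join([str(len(entries)), *entries]) + "\n", encoding="utf-8"
    )

    # Minimal .aff file: only the character encoding is declared.
    aff_path = out / f"{name}.aff"
    aff_path.write_text("SET UTF-8\n", encoding="utf-8")

    return dic_path, aff_path
```

Writing both files as UTF-8 keeps the sketch usable for languages with non-ASCII orthographies; a real dictionary would normally add affix rules to the .aff file to keep the wordlist small.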


2. Language Support Matrix (extra OOo recognized not in glibc)

Language Code Language hunspell hyphen mythes notes
ak Akan hunspell-ak www.dictionary.kasahorow.com
az Azeri (Cyrillic) transliteration table
bm Bambara Online Dictionary
brx Bodo xobdo is a potential source, but apparently this isn't an option at the moment. Another Online Dictionary
cop Coptic hunspell-cop experimental convertible TeX rules
cv Chuvash hunspell-cv
dgo Dogri Central Institute for Indian Languages
dsb Lower Sorbian hunspell-dsb
ee Ewe available but no License mentioned online dictionary
eo Esperanto hunspell-eo needs more love to be convertible
fj Fijian hunspell-fj
grc Ancient Greek hunspell-grc hyphen-grc
gsc Gascon Non-Commercial BY-NC-ND license
gug Guarani crubadan corpus building
hil Hiligaynon hunspell-hil
ia Interlingua hunspell-ia hyphen-ia
kok Konkani online dictionary (http://www.savemylanguage.org/)
la Latin hunspell-la hyphen-la
lb Luxembourgish hunspell-lb mythes-lb
ln Lingala hunspell-ln
mos Mossi hunspell-mos info. dictionary effort (hunspell has no problem with utf-8 .dic files FWIW)
mni Manipuri some info
my Burmese online dictionary
ny Nyanja hunspell-ny
qu Quechua Ecuador hunspell-qu
quh Quechua South Bolivia hunspell-quh
qul Quechua North Bolivia current effort
rm Raeto-Romance/Romansh Things are a bit messy, as there's a group of R[h]aeto-Romance languages, but SIL maps the ISO 639-1 rm to ISO 639-3 roh, and Ethnologue documents the Swiss Official Orthography for roh as Rumantsch Grischun, so that's the probable best fit for this. Dicziunari Rumantsch Grischun
sat Santali English<->Santali dictionaries online dictionary
sg Sango www.dictionary.kasahorow.com
sjd Sammi, Kildin Northern Sammi
sma Sammi, Southern Northern Sammi
smj Sammi, Lule hunspell-smj watch this space Northern Sammi
smn Sammi, Inari Northern Sammi
sms Sammi, Skolt Northern Sammi
sw Swahili hunspell-sw
tet Tetum hunspell-tet
tpi Tok Pisin crubadan corpus building

3. Obsolete/Useless codes (glibc)

Language Code Language notes
iw Hebrew Obsoleted by he
no Norwegian Effectively obsoleted by nb
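
As a small illustration of how the obsolete codes above might be handled by a script that looks up dictionary packages, the sketch below maps each obsolete code to its replacement before any lookup. The names OBSOLETE_CODES and normalize_language_code are hypothetical; the two mappings are taken directly from the table.

```python
# Obsolete glibc language codes and the codes that replace them,
# as listed in the table above.
OBSOLETE_CODES = {
    "iw": "he",  # Hebrew: obsoleted by he
    "no": "nb",  # Norwegian: effectively obsoleted by nb
}


def normalize_language_code(code):
    """Return the current code for an obsolete one; other codes pass through."""
    code = code.strip().lower()
    return OBSOLETE_CODES.get(code, code)
```

Codes that are not in the mapping are returned unchanged apart from trimming and lowercasing, so the function is safe to apply to every code in the matrices.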