From Fedora Project Wiki

Revision as of 13:34, 2 April 2011

Linguistic Components

1. Language Support Matrix (glibc upwards)

Language Code Language hunspell hyphen mythes notes
aa Afar afarfriends.org hosted ALSEC report.
af Afrikaans hunspell-af hyphen-af
am Amharic hunspell-am
an Aragonese www.iea.es, see Spain: Lexicography In Iberian Languages
ar Arabic hunspell-ar experimental thesaurus
as Assamese hunspell-as hyphen-as xobdo is another potential source, possibly even for a thesaurus, but this isn't an option apparently at the moment.
ast Asturian hunspell-ast dictionary announcement
az Azeri (Latin) hunspell-az
be Belarusian hunspell-be hyphen-be
ber Amazigh (Tifinagh) hunspell-ber
ber Amazigh (Latin)
bg Bulgarian hunspell-bg hyphen-bg mythes-bg
bn Bengali hunspell-bn hyphen-bn
bo Tibetan bo.openoffice.org. Latest language support update.
br Breton hunspell-br
bs Bosnian hunspell-bs hyphen-bs
byn Blin Blin Orthography: A History and an Assessment
ca Catalan hunspell-ca hyphen-ca mythes-ca
crh Crimean Tatar A corpus translation team
cs Czech hunspell-cs hyphen-cs mythes-cs
csb Kashubian hunspell-csb
cv Chuvash hunspell-cv
cy Welsh hunspell-cy hyphen-cy
da Danish hunspell-da hyphen-da mythes-da
de German hunspell-de hyphen-de mythes-de
dv Dhivehi wordlist English-Dhivehi dictionary
dz Dzongkha crubadan corpus building Some requests for help/info.
el Greek hunspell-el hyphen-el mythes-el
en English hunspell-en hyphen-en mythes-en
es Spanish hunspell-es hyphen-es mythes-es
et Estonian hunspell-et hyphen-et
eu Basque hunspell-eu hyphen-eu
fa Farsi hunspell-fa hyphen-fa
fi Finnish The Finnish community has a parallel Voikko solution, with an enchant backend, an OpenOffice.org extension, and a Firefox extension.
fil Filipino hunspell-tl Filipino is effectively an official Tagalog-based language
fo Faeroese hunspell-fo hyphen-fo
fr French hunspell-fr hyphen-fr mythes-fr
fur Friulian hunspell-fur
fy Frisian hunspell-fy
ga Irish hunspell-ga hyphen-ga mythes-ga
gd Scots Gaelic hunspell-gd
gez Ge'ez Ge'ez Frontier Foundation
gl Galician hunspell-gl hyphen-gl
gu Gujarati hunspell-gu hyphen-gu
gv Manx hunspell-gv
ha Hausa available, but no license is mentioned. In private communication: "We will specify licenses for the next release of the spell checkers. In the meantime, assume both Hausa and Eʋegbe have the GNU GPLv3 license as well."
he Hebrew hunspell-he info on hyphenation
hi Hindi hunspell-hi hyphen-hi Hindi Wordnet is likely convertible, claims to have similar format as English Wordnet, which is the basis of mythes-en
hne Chhattisgarhi corpus building
hr Croatian hunspell-hr hyphen-hr This hasn't been updated in a number of years; on a purely orthographical basis, I wonder whether dict-sr would provide a better option.
hsb Upper Sorbian hunspell-hsb hyphen-hsb
ht Haitian Creole hunspell-ht
hu Hungarian hunspell-hu hyphen-hu mythes-hu
hy Armenian hunspell-hy
id Indonesian hunspell-id hyphen-id
ig Igbo crubadan corpus building www.dictionary.kasahorow.com
ik Inupiaq Broken download link to MSWord dictionary Iñupiaq parser project
is Icelandic hunspell-is hyphen-is
it Italian hunspell-it hyphen-it mythes-it
iu Inuktitut www.livingdictionary.com
ja Japanese
ka Georgian Crubadan is aware of 29023 words ka.openoffice.org Some info on spellchecking the language.
kk Kazakh hunspell-kk
kl Kalaallisut Greenlandic parser project. MSWord checker.
km Khmer hunspell-km
kn Kannada hunspell-kn hyphen-kn
ko Korean hunspell-ko
kok Konkani [http://www.savemylanguage.org/ online dictionary]
ks Kashmiri online dictionary
ku Kurdish (Latin) hunspell-ku hyphen-ku
ku Kurdish (Arabic) some info
kw Cornish crubadan corpus building Fedora Cornish Language Translation Project
ky Kyrgyz hunspell-ky OOo localization beginnings. Orthography news
lg Luganda crubadan corpus building A general translation effort. An online dictionary
li Limburgish crubadan corpus building
lo Lao Lao OOo localization
lt Lithuanian hunspell-lt hyphen-lt
lv Latvian hunspell-lv hyphen-lv mythes-lv
mai Maithili hunspell-mai maithiliacademy.org
mg Malagasy hunspell-mg mg is equivalent to mlg, which is a macrolanguage; see plt for "Standard Malagasy".
mi Maori hunspell-mi hyphen-mi mythes-mi
mk Macedonian hunspell-mk convertible
ml Malayalam hunspell-ml hyphen-ml
mn Mongolian hunspell-mn hyphen-mn
mr Marathi hunspell-mr hyphen-mr
ms Malay hunspell-ms no content, but a project announcement for Malaysian thesaurus etc.
mt Maltese hunspell-mt
my Burmese online dictionary
nan Min Nan online dictionary? Debian wiki notes
nb Bokmål hunspell-nb hyphen-nb mythes-nb
nds Lowlands Saxon hunspell-nds
ne Nepali hunspell-ne mythes-ne
nl Dutch hunspell-nl hyphen-nl mythes-nl
nn Nynorsk hunspell-nn hyphen-nn mythes-nn
nr Ndebele (Southern) hunspell-nr
nso Sotho (Northern) hunspell-nso
oc Occitan hunspell-oc
om Oromo hunspell-om Oromo details
or Oriya hunspell-or hyphen-or
pa Punjabi hunspell-pa hyphen-pa
pap Papiamentu/Papiamento Papiamentu work in progress The supported glibc locale is pap_AN. Spelling rules differ between the Papiamentu and Papiamento groupings. Papiamentu: Curaçao and Bonaire, current members of the Netherlands Antilles, territory code AN. Papiamento: Aruba (former member of the Netherlands Antilles), territory code AW; crubadan Papiamento corpus building.
pl Polish hunspell-pl hyphen-pl mythes-pl
ps Pashto possible contact
pt Portuguese hunspell-pt hyphen-pt mythes-pt
ro Romanian hunspell-ro hyphen-ro mythes-ro
ru Russian hunspell-ru hyphen-ru mythes-ru
rw Kinyarwanda hunspell-rw
sa Sanskrit An apparent effort to create a Sanskrit hunspell dictionary hyphen-sa
sc Sardinian hunspell-sc intended dictionaries [https://launchpad.net/ditzionariusardu launchpad page]
sd Sindhi available
se Sami, Northern hunspell-se watch this space
shs Secwepemctsin hunspell-shs Secwepemctsín word bank work in progress. Note that it's trivial to create a simple wordlist-based hunspell dictionary, e.g. with wordlist2hunspell.
si Sinhala hunspell-si Another very small wordlist
sid Sidamo Some info
sk Slovak hunspell-sk hyphen-sk mythes-sk
sl Slovenian hunspell-sl hyphen-sl mythes-sl
so Somali hunspell-so
sq Albanian hunspell-sq
sr Serbian hunspell-sr hyphen-sr
ss Swati hunspell-ss
st Sotho (Southern) hunspell-st
sv Swedish hunspell-sv hyphen-sv mythes-sv
ta Tamil hunspell-ta hyphen-ta
te Telugu hunspell-te hyphen-te
tg Tajik An apparent effort to create a Tajik hunspell dictionary
th Thai hunspell-th
ti Tigrigna hunspell-ti
tig Tigre crubadan corpus building
tk Turkmen hunspell-tk hyphen-tk
tl Tagalog hunspell-tl
tn Tswana hunspell-tn
tr Turkish available, but as with Finnish and Voikko, the typical solution for Turkish has been the Zemberek library, with an enchant backend, an OpenOffice.org extension, and a Firefox extension.
ts Tsonga hunspell-ts
tt Tatar available, but it is difficult to see where this came from originally and exactly what license it is under, GPLv2+ (?). Perhaps it is an original work of ALT Linux, and that is actually the canonical upstream?
ug Uyghur www.uyghurdictionary.org www.uighur.jp
uk Ukrainian hunspell-uk hyphen-uk mythes-uk
ur Urdu hunspell-ur
uz Uzbek hunspell-uz
ve Venda hunspell-ve
vi Vietnamese hunspell-vi
wa Walloon hunspell-wa
wo Wolof www.alfanet.anafa.org make Wolof localizations of Firefox and Abiword. www.dictionary.kasahorow.com
xh Xhosa hunspell-xh
yi Yiddish hunspell-yi
yo Yoruba Some apparent efforts (older info) to create a Yoruba hunspell dictionary www.dictionary.kasahorow.com
zh Chinese Would these (convertible) TeX rules be universally meaningful for Chinese text?
zu Zulu hunspell-zu hyphen-zu
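
The Secwepemctsin row above notes that a simple wordlist-based hunspell dictionary is trivial to create. As a rough illustration of what that involves, the sketch below writes a .dic file (a word count followed by one word per line) and a minimal .aff file that only declares the encoding. It is not the wordlist2hunspell tool itself; the function name build_simple_dictionary and the language code "xx" are hypothetical, chosen only for this example.

```python
import tempfile
from pathlib import Path


def build_simple_dictionary(words, out_dir, lang_code):
    """Write <lang_code>.dic and <lang_code>.aff for a plain word list.

    Hypothetical helper for illustration; no affix rules are generated,
    and words containing "/" (the flag separator in .dic files) are not
    handled specially.
    """
    out_dir = Path(out_dir)
    # Drop blank entries and duplicates, then sort for a stable file.
    unique = sorted({w.strip() for w in words if w.strip()})

    # A .dic file starts with the word count, then lists one word per line.
    dic_path = out_dir / f"{lang_code}.dic"
    dic_path.write_text("\n".join([str(len(unique)), *unique]) + "\n",
                        encoding="utf-8")

    # A minimal .aff file: just the character encoding of the .dic file.
    aff_path = out_dir / f"{lang_code}.aff"
    aff_path.write_text("SET UTF-8\n", encoding="utf-8")
    return dic_path, aff_path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        dic, aff = build_simple_dictionary(["beta", "alpha", "beta", ""], tmp, "xx")
        print(dic.read_text(encoding="utf-8"), end="")
```

A dictionary built this way only recognizes the exact forms in the word list, so it suits languages where a corpus exists but no affix analysis has been done yet.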


2. Language Support Matrix (extra OOo recognized not in glibc)

Language Code Language hunspell hyphen mythes notes
ak Akan hunspell-ak www.dictionary.kasahorow.com
az Azeri (Cyrillic) transliteration table
bm Bambara Online Dictionary
buc Bushi
brx Bodo xobdo is a potential source, but this isn't an option apparently at the moment. Another Online Dictionary
cop Coptic hunspell-cop experimental convertible TeX rules
dgo Dogri Central Institute for Indian Languages
dsb Lower Sorbian hunspell-dsb
ee Ewe available, but no license is mentioned. In private communication: "We will specify licenses for the next release of the spell checkers. In the meantime, assume both Hausa and Eʋegbe have the GNU GPLv3 license as well." online dictionary
eo Esperanto hunspell-eo needs more love to be convertible
fj Fijian hunspell-fj
grc Ancient Greek hunspell-grc hyphen-grc
gsc Gascon Non-Commercial BY-NC-ND license
gug Guarani crubadan corpus building
hil Hiligaynon hunspell-hil
ia Interlingua hunspell-ia hyphen-ia
ki Gikuyu available
ksf Bafia work in progress empty dictionary page
la Latin hunspell-la hyphen-la
lb Luxembourgish hunspell-lb mythes-lb
ln Lingala hunspell-ln
ltg Latgalian Latgalian resources
mos Mossi hunspell-mos info. dictionary effort (hunspell has no problem with utf-8 .dic files FWIW)
mni Manipuri some info
ny Nyanja hunspell-ny
plt Malagasy, Plateau hunspell-mg Standard Malagasy
qu Quechua Ecuador hunspell-qu
quh Quechua South Bolivia hunspell-quh
qul Quechua North Bolivia current effort
rm Raeto-Romance/Romansh Things are a bit messy, as there is a group of R[h]aeto-Romance languages, but SIL maps the ISO 639-1 code rm to the ISO 639-3 code roh, and Ethnologue documents the Swiss official orthography for roh as Rumantsch Grischun, so that is the probable best fit. Dicziunari Rumantsch Grischun
rue Rusyn
sat Santali English<->Santali dictionaries online dictionary
sdc Sardinian, Sassarese intended dictionaries [https://launchpad.net/ditzionariusardu launchpad page]
sdn Sardinian, Gallurese intended dictionaries [https://launchpad.net/ditzionariusardu launchpad page]
sg Sango www.dictionary.kasahorow.com
sjd Sami, Kildin Northern Sami
sma Sami, Southern Northern Sami
smj Sami, Lule hunspell-smj watch this space Northern Sami
smn Sami, Inari Northern Sami
sms Sami, Skolt Northern Sami
src Sardinian, Logudorese intended dictionaries [https://launchpad.net/ditzionariusardu launchpad page]
sro Sardinian, Campidanese intended dictionaries [https://launchpad.net/ditzionariusardu launchpad page]
sw Swahili hunspell-sw
swb Maore swb information
tet Tetum hunspell-tet
tpi Tok Pisin available
ty Tahitian crubadan corpus building

3. Obsolete/Useless codes (glibc)

Language Code Language notes
iw Hebrew Obsoleted by he
no Norwegian Effectively obsoleted by nb
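
A tool that looks up linguistic packages for a locale would probably want to fold these obsolete codes into their replacements first. The sketch below is one possible way to do that; the function names are hypothetical, and the hunspell-/hyphen-/mythes- naming pattern is only the common case seen in the matrices above (fil, for example, uses hunspell-tl), so the result is a list of candidates rather than a guarantee that the packages exist.

```python
# Replacements taken from the obsolete codes table above.
OBSOLETE_CODES = {"iw": "he", "no": "nb"}


def normalize_language_code(locale_name):
    """Return the current language code for a glibc-style locale name."""
    # Strip codeset, modifier and territory: "no_NO.UTF-8" -> "no".
    lang = locale_name.split(".")[0].split("@")[0].split("_")[0].lower()
    return OBSOLETE_CODES.get(lang, lang)


def candidate_packages(locale_name):
    """Guess package names from the common <component>-<code> pattern."""
    code = normalize_language_code(locale_name)
    return [f"{component}-{code}" for component in ("hunspell", "hyphen", "mythes")]


if __name__ == "__main__":
    for name in ("iw_IL.UTF-8", "no_NO", "sr_RS@latin"):
        print(name, "->", candidate_packages(name))
```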