Fix the dictionary proliferation problem

Summary

Fix the proliferation of dictionaries in the OS.

Owners

Current status

Usage cases/rationale

We have separate dictionaries for each language for OpenOffice.org, Firefox, Thunderbird, and aspell (which GNOME and KDE use). This is dumb: the same word lists are shipped several times over.

Benefit to Fedora

We get code reuse, a smaller distribution, and a decreased memory footprint.

Scope

Requires changing the OpenOffice.org, Thunderbird, Firefox, and dictionary packages.

Test Plan

Test spell checking in all apps.
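
Besides checking each application by hand, one could also look for leftover duplicate dictionaries on disk. The following is only a minimal sketch: it relies on hunspell dictionaries being shipped as `.dic`/`.aff` file pairs, and the function names and the directories passed in are hypothetical, not part of any package mentioned on this page.

```python
import os
from collections import defaultdict


def find_dictionary_copies(roots):
    """Map each .dic file name to the directories that contain a copy.

    `roots` is a caller-supplied list of directories to scan (hypothetical;
    this page does not name the actual install locations).
    """
    copies = defaultdict(list)
    for root in roots:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                if name.endswith(".dic"):
                    copies[name].append(dirpath)
    return copies


def duplicated(copies):
    """Keep only the dictionaries that exist in more than one directory."""
    return {name: sorted(dirs) for name, dirs in copies.items() if len(dirs) > 1}
```

An empty result from `duplicated(find_dictionary_copies([...]))` would suggest that the scanned applications no longer carry private copies of the same dictionary.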

Dependencies

None.

Details

  1. Split out hunspell from OpenOffice.org - rhbz#214764 complete

  2. Make OpenOffice.org use it - rhbz#214764 complete

  3. Split out the dictionaries into separate packages - rhbz#218769 (English) complete

  4. Make OpenOffice.org use system dictionaries - complete

  5. Make gedit/xchat use it, i.e. enchant. By default enchant already generally prefers hunspell over aspell; it just needs to be told where the dictionaries are - complete

  6. Make evolution use it, i.e. gnome-spell. gnome-spell can be patched to use enchant to achieve this - rhbz#426347 complete

  7. Make tomboy/pidgin use it, i.e. gtkspell. Same story as gnome-spell - rhbz#245888 complete

  8. Make Firefox (and other Gecko apps) use it - rhbz#218762 complete, upstream state is now "resolved"

  9. Make KDE use enchant and/or hunspell - complete - KDE 4 already defaults to enchant in Sonnet. (For K3Spell, see "legacy KSpell" below.) The aspell backend was dropped entirely in Rawhide. For kdelibs3:

    • The legacy KSpell uses command-line spellcheckers. KevinKofler wrote a patch to support hunspell, and kde-settings in Rawhide was changed to make it the default.

    • The newer KSpell2 API is plugin-based and uses libraries. It is what KDE 4's Sonnet is based on. KevinKofler backported Sonnet's enchant backend. The aspell and ispell backends were dropped in Rawhide.

    See the fedora-devel-list message.

  10. Remove copy of hunspell from enchant - rhbz#426402 complete

  11. Remove copy of hunspell from xulrunner - complete

  12. Split enchant to have a separate enchant-aspell rpm to enable optionally removing the aspell support - rhbz#426402 complete

  13. Prefer hunspell over aspell as the default for install in comps. See the table below for mismatches in language support - rhbz#439037 complete

  14. Repackage/replace the aspell dictionaries with hunspell dictionaries - 80% (see the table below for language support)

Optional

  1. Write an aspell compatibility layer so aspell apps can use the same dictionaries - no volunteer, so deferred. Is this necessary at all? All major desktop apps now work out of the box.

  2. Make vim use hunspell - rhbz#219777, patch available. Not necessary if vim continues not to use any spell-checking, but preferable to introducing vim's built-in spellchecker, which uses yet another format that hunspell dictionaries would have to be converted to.

Dictionaries

  1. Language Support Matrix

Language      aspell       hunspell
------------  -----------  -----------
Afrikaans     aspell-af    hunspell-af
Arabic        aspell-ar    hunspell-ar
Bengali       aspell-bn    hunspell-bn
Bokmaal       aspell-no    hunspell-nb
Breton        aspell-br    -
Bulgarian     aspell-bg    hunspell-bg
Catalan       aspell-ca    hunspell-ca
Croatian      aspell-hr    hunspell-hr
Czech         aspell-cs    hunspell-cs
Danish        aspell-da    hunspell-da
Dutch         aspell-nl    hunspell-nl
English       aspell-en    hunspell-en
Estonian      -            hunspell-ee
Faeroese      aspell-fo    hunspell-fo
French        aspell-fr    hunspell-fr
Galician      -            hunspell-gl
German        aspell-de    hunspell-de
Greek         aspell-el    hunspell-el
Gujarati      aspell-gu    -
Hebrew        aspell-he    hunspell-he
Hindi         aspell-hi    hunspell-hi
Hungarian     -            hunspell-hu
Icelandic     aspell-is    available
Indonesian    aspell-id    available
Irish         aspell-ga    hunspell-ga
Italian       aspell-it    hunspell-it
Lithuanian    -            hunspell-lt
Malay         -            hunspell-ms
Malayalam     rhbz#403911  available
Marathi       aspell-mr    hunspell-mr
Nynorsk       aspell-no    hunspell-nn
Oriya         aspell-or    hunspell-or
Polish        aspell-pl    hunspell-pl
Portuguese    aspell-pt    hunspell-pt
Punjabi       aspell-pa    hunspell-pa
Russian       aspell-ru    hunspell-ru
Scots Gaelic  aspell-gd    hunspell-gd
Serbian       aspell-sr    available
Slovak        aspell-sk    hunspell-sk
Slovenian     aspell-sl    hunspell-sl
Spanish       aspell-es    hunspell-es
Swedish       aspell-sv    hunspell-sv
Tamil         aspell-ta    hunspell-ta
Telugu        aspell-te    -
Thai          -            hunspell-th
Welsh         aspell-cy    hunspell-cy
Zulu          -            hunspell-zu
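
The mismatch in language support can be read off the matrix mechanically. The sketch below is illustrative only: it transcribes a sample of rows from the matrix, and it assumes that an entry counts as a shipped package only when it starts with `aspell-` or `hunspell-` (so entries such as "available" or a bug number are treated as not yet packaged). The function names are made up for this example.

```python
# Sample rows transcribed from the matrix: (language, aspell entry, hunspell entry).
# None marks an empty cell. This is a subset, not the full table.
MATRIX = [
    ("Afrikaans", "aspell-af", "hunspell-af"),
    ("Breton", "aspell-br", None),
    ("Estonian", None, "hunspell-ee"),
    ("Gujarati", "aspell-gu", None),
    ("Icelandic", "aspell-is", "available"),
    ("Malayalam", "rhbz#403911", "available"),
    ("Telugu", "aspell-te", None),
    ("Zulu", None, "hunspell-zu"),
]


def is_packaged(entry, prefix):
    """Assumption: only entries named like '<prefix>xx' are real packages."""
    return entry is not None and entry.startswith(prefix)


def aspell_only(matrix):
    """Languages with an aspell package but no hunspell package yet."""
    return [
        lang
        for lang, aspell, hunspell in matrix
        if is_packaged(aspell, "aspell-") and not is_packaged(hunspell, "hunspell-")
    ]


def hunspell_only(matrix):
    """Languages with a hunspell package but no aspell package."""
    return [
        lang
        for lang, aspell, hunspell in matrix
        if is_packaged(hunspell, "hunspell-") and not is_packaged(aspell, "aspell-")
    ]


if __name__ == "__main__":
    print("aspell only:", aspell_only(MATRIX))
    print("hunspell only:", hunspell_only(MATRIX))
```

The languages in the "aspell only" list are the ones that would lose spell-checking if aspell were dropped from the default install before their hunspell dictionaries are packaged.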

User experience

Should not affect user experience.

Contingency plan

Continue to ship older dictionaries.

Documentation

http://hunspell.sourceforge.net/

Release Notes

There is a new default spell-checking back-end, hunspell, for both the GNOME and KDE desktops, as well as applications such as OpenOffice.org, Firefox, and other XULRunner-based applications. It comes with a single set of shared, multilingual dictionaries that every application uses, which gives consistent suggestions for misspelled words and uses less disk space by eliminating duplicate dictionaries.

Comments

Note that JDS is going down this route as well.

A somewhat related issue.

Will help with adding Indic hunspell dictionaries to Fedora - paragn.

At least php5 and bluefish still link to aspell - kmaraas. (It's not practical for me to port everything, just the core default installed components and the default spell-checking solutions for the main desktop environments and applications - caolanm)


CategoryAcceptedFedora9