The handwriting recognizer writrecogn lets users input the Chinese characters used in Chinese, Japanese, and Korean with a mouse or tablet.
- Name: DingyiChen
- Targeted release: N/A
- Last updated: 2008-03-25
- Percentage of completion: 70%
- 0.1.8 Released
- [TODO] Need to connect to SCIM
Handwriting Recognizer recognizes handwritten Chinese characters for Chinese, Japanese, and Korean (CJK) and will interface with input methods such as SCIM. Unlike implementations that require building a huge set of character recognition rules, we recognize the radicals of a Chinese character, i.e. its root components, and then use a character-structure-based input method to search for the character. This saves us from writing recognition rules for tens of thousands of CJK characters. We expect this to provide better recognition accuracy than current open source handwriting recognition libraries such as tomoe.
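The two-stage idea described above (recognize radicals first, then look the character up by its structure) can be illustrated with a minimal Python sketch. This is not the writrecogn implementation: the `STRUCTURE_TABLE` data, the `build_index` and `lookup` helpers, and the assumption that the radical recognizer returns a ranked list of guesses per radical are all hypothetical and exist only for illustration.

```python
from itertools import product

# Hypothetical structure table: each character maps to the ordered radicals
# that compose it. Real data would come from the recognizer's own database.
STRUCTURE_TABLE = {
    "好": ("女", "子"),
    "明": ("日", "月"),
    "林": ("木", "木"),
    "休": ("亻", "木"),
}


def build_index(table):
    """Invert the table: radical sequence -> list of characters."""
    index = {}
    for char, radicals in table.items():
        index.setdefault(radicals, []).append(char)
    return index


def lookup(index, radical_candidates):
    """Return characters matching any combination of radical guesses.

    radical_candidates holds one entry per written radical; each entry is a
    list of guesses from the (assumed) radical recognizer, best guess first.
    """
    results = []
    for combo in product(*radical_candidates):
        for char in index.get(combo, []):
            if char not in results:
                results.append(char)
    return results


INDEX = build_index(STRUCTURE_TABLE)

# The recognizer is unsure whether the first radical is 日 or 木,
# and whether the second is 月 or 木.
print(lookup(INDEX, [["日", "木"], ["月", "木"]]))
```

The point of the sketch is that only the radicals need recognition rules; the character set itself is plain lookup data, so adding a character means adding a table entry rather than a new rule.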
The main program, which provides the GUI, is WritRecogn; there is also a command-line character data maintenance program
Other features include:
- Stroke editor: users can input new characters for the recognizer to learn.
Benefit to Fedora
- Enable users who have little knowledge of CJK input methods to write Chinese characters.
- Suitable for keyboardless handheld devices.
- Technique can be extended to OCR.
Currently recognizes Chinese characters as used in Traditional and Simplified Chinese hanzi, Japanese kanji, and Korean hanja.
- test that it is possible to input Chinese characters smoothly
- make sure that window focus and input to other applications works correctly
- profiling to check performance
- User can input Chinese characters by handwriting.
The writrecogn package needs to be reviewed, accepted, and built in Fedora.
- Need integration with SCIM.
None needed: it is a new package. Tomoe and conventional input methods for Chinese characters will still be available.
writrecogn is a new handwritten input system for Chinese characters written by Ding Chen.
- 2007-05-07: Development started
- Some milestones:
- GObjectize RawCharacter, RawStroke, CharacterMatcher, StrokeRecognizer, StrokeNoiseReducer
- SQLite backend
- Can import from the SCIM Tomoe XML database.
- 2008-01: registered project on SourceForge as writrecogn
- 2008-01-21: initial release of version 0.1 on SourceForge
- Public release as a beta version once the recognizer recognizes level 0 radicals (strokes).
- Gather and merge the community-contributed stroke data and recognition hypotheses.
- Release the revised stroke data and recognition hypotheses, receive feedback and comments, and return to the previous step (gathering and merging) if necessary.
- Ver 0.2
- SQLitize the Raw Character List (the list that holds all characters).
- Implementation of the Relative Radical Bounding Box.
- Ver 0.3
- A brief document about the Relative Radical Bounding Box and the "Radical Textbook".
A Radical Textbook is a collection of characters and their corresponding sub-radical combinations, each represented as a set of relative radical bounding boxes.
- Radical Textbook importer (TUI)
- Radical Textbook editor (GUI)
- Ver 0.4
- Convert the stroke sequence to the Radical Textbook, so users do not need to know the exact sequences.
- Character Matcher that can handle the Relative Radical Bounding Box.
- Ver 0.5 (Alpha)
- Link to SCIM
- Fuzzy Character Matcher (error tolerance)
- Pack as RPM
- Ver 0.6
- Incremental learning machine (apply incremental SVM or other algorithm)
- English character recognition.
- Number recognition.
- Commonly used symbol recognition.
- Ver 0.7 (Beta)
- Make Traditional Chinese, Japanese, and Korean Radical Textbooks.
- CJK synonyms (characters that share the same meaning).
- CJK I/O switch. For example, a user might input Simplified Chinese but wish to output Traditional Chinese, and vice versa.
- Ver 0.8
- Evaluation framework
- Plugin framework (such as Stroke Noise Reducer, Stroke Recognizer, Character Matcher)
- Ver 0.9
- Research paper about this project.
- Help documentation.
- Ver 1.0 (Official release)
- Improve stroke editor/trainer interface.
- Double writing canvas
- Hot keys.
- Ver 1.1
- Transparent canvas
- Frameless window
- UniHan support: Show the character information from UniHan
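The Relative Radical Bounding Box and Radical Textbook planned for Ver 0.2 and Ver 0.3 above can be pictured with a small Python sketch. The page does not specify the actual data layout, so everything here is an assumption for illustration: the `Box` class, the `to_relative` helper, the choice of normalizing coordinates to the range 0 to 1 within the character's own box, and the sample `TEXTBOOK` entry.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned box; (left, top) and (right, bottom) corners."""
    left: float
    top: float
    right: float
    bottom: float


def to_relative(radical: Box, character: Box) -> Box:
    """Express a radical's box relative to the whole character's box.

    Assumed convention: 0.0 is the character's left/top edge and
    1.0 is its right/bottom edge.
    """
    width = character.right - character.left
    height = character.bottom - character.top
    if width <= 0 or height <= 0:
        raise ValueError("character box must have positive size")
    return Box(
        (radical.left - character.left) / width,
        (radical.top - character.top) / height,
        (radical.right - character.left) / width,
        (radical.bottom - character.top) / height,
    )


# Hypothetical textbook entry: 明 is 日 in the left half, 月 in the right half.
TEXTBOOK = {
    "明": [
        ("日", Box(0.0, 0.0, 0.5, 1.0)),
        ("月", Box(0.5, 0.0, 1.0, 1.0)),
    ],
}

# A character drawn on the canvas at pixels (100, 50)-(200, 150),
# with one radical occupying its left half.
drawn_character = Box(100, 50, 200, 150)
drawn_radical = Box(100, 50, 150, 150)
print(to_relative(drawn_radical, drawn_character))
```

Storing boxes relative to the character rather than in canvas pixels means the same textbook entry matches a character regardless of how large or where on the canvas it was written.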
Help is always welcome. Several things need to be discussed and done:
- Algorithm plugins (such as Stroke Noise Reducer, StrokeRecognizer, CharacterMatcher)
- UI
- Radical Textbooks
- Help documentation (user and developer)
- Feature ideas
Please join the writRecogn project on SourceForge; your efforts will be appreciated.
Please put your feature request here.