
Spelling and Restoration of Diacritics

There are various algorithms for spelling correction and restoration of diacritics, and they may require different data. Most of them perform a simple dictionary lookup and, if the word is not found, come up with a list of similar words. These algorithms need only a simple list of words as their lexicon. Thus, a word in the language accepted by the automaton is a word of the natural language without additional annotations.
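The lookup-then-suggest behaviour described above can be illustrated with a minimal sketch. It is not the method used by the software referenced on this page: a plain Python set stands in for the automaton, similarity is assumed to mean Levenshtein distance, and the lexicon contents and the names LEXICON, suggest and restore_diacritics are hypothetical, chosen only for illustration.

```python
import unicodedata

# Hypothetical toy lexicon: a plain list of word forms, no annotations.
# A Python set stands in for the finite-state automaton here.
LEXICON = {"kot", "kat", "kąt", "lód", "lud", "żółty"}

# Letters such as "ł" have no Unicode decomposition, so map them explicitly.
NON_DECOMPOSABLE = str.maketrans("łŁ", "lL")


def edit_distance(a, b):
    """Levenshtein distance, computed one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def strip_diacritics(word):
    """Remove combining marks after canonical decomposition (NFD)."""
    decomposed = unicodedata.normalize("NFD", word.translate(NON_DECOMPOSABLE))
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def suggest(word, lexicon=LEXICON, max_dist=1):
    """Return the word itself if it is in the lexicon; otherwise
    return all lexicon words within max_dist edits, sorted."""
    if word in lexicon:
        return [word]
    return sorted(w for w in lexicon if edit_distance(word, w) <= max_dist)


def restore_diacritics(word, lexicon=LEXICON):
    """Return all lexicon words equal to word once diacritics are ignored."""
    key = strip_diacritics(word)
    return sorted(w for w in lexicon if strip_diacritics(w) == key)


if __name__ == "__main__":
    print(suggest("kst"))
    print(restore_diacritics("zolty"))
    print(restore_diacritics("kat"))
```

Scanning the whole lexicon for every query is acceptable only for a toy list; it is used here to keep the sketch short. Note that restore_diacritics("kat") yields more than one candidate, which is the kind of ambiguity that cannot be resolved from a bare word list.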

More advanced algorithms check the word's context. This is done with a form of syntactic analysis, which can be full or shallow. Syntactic analysis requires morphological data, such as that described in the next section.

Jan Daciuk
Wed Jun 3 14:37:17 CEST 1998

Software at http://www.pg.gda.pl/~jandac/fsa.html