Publication details

On Disambiguation in Czech Corpora

Authors

POPELÍNSKÝ Lubomír, PAVELEK Tomáš, PTÁČNÍK Tomáš

Year of publication 2000
MU Faculty or unit

Faculty of Informatics

Description Lemma disambiguation means finding the basic word form, typically the nominative singular for nouns or the infinitive for verbs. We developed a multistrategy method for lemma disambiguation of unannotated text. The method is based on a combination of inductive logic programming and instance-based learning. We present results for the most important subtasks of lemma disambiguation for the Czech language. Although no expert knowledge of Czech grammar was used, the accuracy reaches 90%, with a fraction of words remaining ambiguous. We also present first results of tag disambiguation.
