Publication details

On Disambiguation in Czech Corpora

Authors

POPELÍNSKÝ Lubomír, PAVELEK Tomáš, PTÁČNÍK Tomáš

Year of publication 2000
Type Research report
MU Faculty or unit

Faculty of Informatics

Description Lemma disambiguation means finding the basic word form, typically the nominative singular for nouns or the infinitive for verbs. We developed a multistrategy method for lemma disambiguation of unannotated text. The method combines inductive logic programming and instance-based learning. We present results for the most important subtasks of lemma disambiguation for the Czech language. Although no expert knowledge of Czech grammar was used, the accuracy reaches 90%, with a fraction of words remaining ambiguous. We also present first results on tag disambiguation.
