An Update of the Manually Annotated Amharic Corpus


RYCHLÝ Pavel LEMMA Gezahegn Tsegaye

Year of publication 2018
Type Article in Proceedings
Conference Proceedings of the Twelfth Workshop on Recent Advances in Slavonic Natural Languages Processing, RASLAN 2018
Faculty of Informatics

Keywords text corpus; Amharic corpus; part-of-speech tagging
Description The paper describes an update of the manually annotated Amharic corpus WIC 2.0. It lists the problems of the previous version of the corpus and shows that even small changes in the corpus annotation could lead to a higher quality of trained part-of-speech taggers.
