Publication details

Evaluating Natural Language Processing Tasks with Low Inter-Annotator Agreement: The Case of Corpus Applications


KOVÁŘ Vojtěch

Type Article in proceedings
Conference Tenth Workshop on Recent Advances in Slavonic Natural Language Processing, RASLAN 2016
MU Faculty or unit

Faculty of Informatics

Field Informatics
Keywords NLP; inter-annotator agreement; low inter-annotator agreement; evaluation; application; application-based evaluation; word sketch; thesaurus; terminology
Description In "Low inter-annotator agreement = an ill-defined problem?", we argued that tasks with low inter-annotator agreement are very common in natural language processing (NLP) and deserve appropriate attention. We also outlined a preliminary solution for their evaluation. In "On evaluation of natural language processing tasks: Is gold standard evaluation methodology a good solution?", we argued for extrinsic, application-based evaluation of NLP tasks and against the gold standard methodology, which is currently almost the only one actually used in the NLP field. This paper brings a synthesis of the two: for three practical tasks that normally have such low inter-annotator agreement that they are considered almost inaccessible to any scientific evaluation, we introduce an application-based evaluation scenario which illustrates that it is not only possible to evaluate them in a scientific way, but that this type of evaluation is much more telling than the gold standard approach.
Related projects: