Czech Question Answering with Extended SQAD v3.0 Benchmark Dataset


SABOL Radoslav, MEDVEĎ Marek, HORÁK Aleš

Year of publication 2019
Type Article in Proceedings
Conference Proceedings of the Thirteenth Workshop on Recent Advances in Slavonic Natural Languages Processing, RASLAN 2019
Faculty of Informatics

Keywords question answering; QA benchmark dataset; SQAD; Czech
Description In this paper, we introduce a new version of the Simple Question Answering Database (SQAD). The main contribution of the new version is an increase in size to a total of 13,473 records. Besides the database enlargement, the new version incorporates new restrictions that specify the format of the expected answer for a given question. These restrictions are connected with automatic database consistency checks, in which new sub-processes safeguard the correctness and consistency of the database. We also introduce a new on-line annotation tool, which offered a unified environment for extending the SQAD data in a crowdsourcing experiment.
