Publication information

Assessing the Quality of Spatio-textual Datasets in the Absence of Ground Truth



Type Article in proceedings
Conference Proceedings of the 21st European Conference on Advances in Databases and Information Systems
MU Faculty or unit

Faculty of Informatics; Institute of Computer Science

WWW Springer, CORE B conference, SCOPUS, WoS, DBLP
Field Informatics
Keywords spatio-textual data; data quality; relative quality
Description The increasing availability of enriched geospatial data has opened up a new domain and enables the development of more sophisticated location-based services and applications. However, this development has also given rise to various data quality problems as it is very hard to verify the data for all real-world entities contained in a dataset. In this paper, we propose ARCI, a relative quality indicator which exploits the vast availability of spatio-textual datasets, to indicate how confident a user can be in the correctness of a given dataset. ARCI operates in the absence of ground truth and aims at computing the relative quality of an input dataset by cross-referencing its entries among various similar datasets. We also present an algorithm for computing ARCI and we evaluate its performance in a preliminary experimental evaluation using real-world datasets.
