Multi-modal Similarity Retrieval with a Shared Distributed Data Store



Year of publication 2015
Type Article in Proceedings
Conference Scalable Information Systems: 5th International Conference, INFOSCALE 2014, Seoul, South Korea, September 25-26, 2014, Revised Selected Papers
Faculty of Informatics

Field Informatics
Keywords similarity search; multi-modal search; Big Data; scalability
Description We propose a generic system architecture for large-scale similarity search in various types of digital data. The architecture combines contemporary, highly scalable distributed data stores with recent efficient similarity indexes and with other types of search indexes. The system is designed to provide several types of queries: distance-based similarity queries, term-based queries, attribute queries, and advanced queries that combine several search aspects (modalities). The first part of this work is devoted to the generic architecture and to a description of PPP-Codes, a similarity index that is suitable for our system. In the second part, we describe a specific instance of this architecture that manages a collection of 106 million images and provides content-based visual search, keyword search, attribute-based access, and their combinations.
