The Art of Mathematics Retrieval (invited talk at Informatics Colloquium FI MU, 8.11.2011)
The design and architecture of MIaS (Math Indexer and Searcher), a system for mathematics retrieval, are presented, and the design decisions are discussed. We argue for an approach based on Presentation MathML that uses the similarity of math subformulae. The system was implemented as a math-aware search engine based on the state-of-the-art system Apache Lucene and is used in the European Digital Mathematics Library (EuDML). Scalability was tested on more than 400,000 arXiv documents containing 158 million mathematical formulae. Almost three billion MathML subformulae were indexed using a Solr-compatible Lucene.
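To make the idea of subformula indexing concrete, the following is a minimal sketch, not the MIaS implementation. It assumes that every element of a Presentation MathML tree is treated as one indexable subformula and that its nesting depth is recorded for possible use in weighting. The helper names `canonical` and `subformulae` are hypothetical, and the serialization is a simplification chosen for illustration.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def canonical(element):
    """Serialize an element as a compact, namespace-free string (illustrative only)."""
    name = local_name(element.tag)
    text = (element.text or "").strip()
    children = "".join(canonical(child) for child in element)
    return f"<{name}>{text}{children}</{name}>"


def subformulae(element, depth=0):
    """Yield (canonical string, depth) for the element and each of its descendants."""
    yield canonical(element), depth
    for child in element:
        yield from subformulae(child, depth + 1)


# Presentation MathML for a^2 + b.
formula = ET.fromstring(
    f'<math xmlns="{MATHML_NS}">'
    "<mrow><msup><mi>a</mi><mn>2</mn></msup><mo>+</mo><mi>b</mi></mrow>"
    "</math>"
)

# Each pair is a candidate index term together with its nesting depth.
terms = list(subformulae(formula))
for term, depth in terms:
    print(depth, term)
```

Under this assumption, a query for a^2 would match the document formula through the shared `msup` subformula even though the full formulae differ. How such matches should be weighted is a separate design decision that the sketch does not address.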