Publication details

Rapid prototyping of a web categorization tool

Authors

NAVRÁTIL Jaromír, POPELÍNSKÝ Lubomír

Year of publication 2014
Type Article in proceedings
Conference IDEAS '14 Proceedings of the 18th International Database Engineering & Applications Symposium
MU Faculty or unit

Faculty of Informatics

Citation
Web http://dl.acm.org/citation.cfm?id=2628216
DOI http://dx.doi.org/10.1145/2628194.2628216
Field Informatics
Keywords web mining; categorization of web pages; machine learning; landmarking
Description This paper introduces a new method for fast prototyping of a web page categorization tool based on Random Forests. The contribution is threefold. First, we describe a fast feature extraction method. Second, we introduce a system that lets a user run experiments manually and inspect the results through a visual analytics module. Third, we address how to run experiments efficiently, using an approach partially inspired by landmarking, which limits the number of experiments that must be run. The method has been used to build a new commercial web categorization system that significantly outperforms the system previously in use.
