Publication details

Rapid prototyping of a web categorization tool

Authors

NAVRÁTIL Jaromír, POPELÍNSKÝ Lubomír

Year of publication 2014
Type Article in Proceedings
Conference IDEAS '14: Proceedings of the 18th International Database Engineering & Applications Symposium
MU Faculty or unit

Faculty of Informatics

Citation
Web http://dl.acm.org/citation.cfm?id=2628216
Doi http://dx.doi.org/10.1145/2628194.2628216
Field Informatics
Keywords web mining;categorization of web pages;machine learning;landmarking
Description This paper introduces a new method for fast prototyping of a web page categorization tool based on Random Forests. The contribution is threefold. First, we describe a fast feature extraction method. Second, we introduce a system that lets a user run experiments manually and inspect the results through a visual analytics module. Third, we address how to perform experiments efficiently, using an approach partially inspired by landmarking, which limits the number of experiments that need to be run. The method has been used to build a new commercial web categorization system that significantly outperforms the system previously in use.
