Publication information
Rapid prototyping of a web categorization tool
| Authors | |
|---|---|
| Year of publication | 2014 |
| Type | Article in proceedings |
| Conference | IDEAS '14 Proceedings of the 18th International Database Engineering & Applications Symposium |
| MU Faculty or unit | |
| Citation | |
| www | http://dl.acm.org/citation.cfm?id=2628216 |
| DOI | https://doi.org/10.1145/2628194.2628216 |
| Field | Informatics |
| Keywords | web mining;categorization of web pages;machine learning;landmarking |
| Description | This paper introduces a new method for fast prototyping of a web page categorization tool based on Random Forests. The contribution is threefold. First, we describe a fast feature extraction method. Next, we introduce a system that enables a user to perform experiments manually and visualize the results via a visual analytics module. Finally, we address how to perform experiments efficiently; this part is partially inspired by landmarking, which allows limiting the number of experiments. The method has been used to build a new commercial system for web categorization that significantly outperforms the system already in use. |