Publication details

Frequency of Low-Frequency Words in Text Corpora

Year of publication 2010
Type Article in Proceedings
Conference Proceedings of Recent Advances in Slavonic Natural Language Processing, RASLAN 2010
MU Faculty or unit Faculty of Informatics
Field Linguistics
Keywords Computational linguistics; Language model; Low-frequency; Text analysis; Text corpora
Description Low-frequency words, especially words occurring only once in a text corpus, are very popular in text analysis. Many lexicographers also draw attention to such words. This paper presents a detailed statistical analysis of low-frequency words. The results provide important information for many practical applications, including lexicography and language modeling.