Publication details

Dictionary Express: First Phases Rapid dictionary-making method for European, Asian and other languages

Authors

KOVAŘÍK František, BLAHUŠ Marek, CUKR Michal, JAKUBÍČEK Miloš, KOVÁŘ Vojtěch

Year of publication 2024
Type Article in Periodical
Magazine / Source AsiaLex 2024 Proceedings: Asian Lexicography - Merging cutting-edge and established approaches
Web https://asialex2024.org/conference-program/
Keywords corpus annotation, semi-automatic lexicography, Dictionary Express, dictionary drafting, post-editing lexicography
Description Dictionary Express (DE) is a new methodology that combines automatic lexicographic tools with manual checking (annotation) of words, their forms, usage, etc. The main goal of the project is to make dictionary making faster and less demanding by splitting the process into simple tasks, as opposed to traditional dictionaries, which are compiled entry by entry. This means the non-automatic work can be done by a small team of native speakers who are not professional linguists, supervised by a smaller team of developers and lexicographers. The data is drawn from large web corpora of current language usage, which helps keep the dictionary accurate and up to date with current language trends. Several "rapid dictionaries" have already been created using this method. The time needed to complete a DE project depends on the quality of the corpus tagging and on the weekly workload. A DE project for Czech is now under way; apart from creating a new Czech dictionary, it focuses on analysing the rapid dictionary-making process and its input and output data. In this paper, we present the main annotation tasks of the DE methodology, the data preparation, and some interesting phenomena that occurred during the first phases of the Czech Dictionary Express.
