Publication information

Towards New Czechoslovak Hyphenation Patterns

Authors

SOJKA Petr, SOJKA Ondřej

Year of publication 2020
Type Article in a scholarly periodical
Journal / Source Zpravodaj CSTUG
MU Faculty or unit

Faculty of Informatics

Citation
Web journal landing page, DOI
DOI https://doi.org/10.5300/2020-3-4/118
Keywords hyphenation; pattern generation; Czechoslovak hyphenation patterns; word list database; patgen; multilingual typesetting; Unicode; TeX; syllable segmentation; syllabification; Czech; Slovak; compression
Description Space- and time-effective segmentation and hyphenation of natural languages lie at the core of every document preparation system, web browser, and mobile rendering system. Recently, the unreasonable effectiveness of pattern generation has been shown: hyphenation patterns can solve the dictionary problem for a single language without compromise. In this article, we show how we applied the marvelous effectiveness of patgen to generate new Czechoslovak hyphenation patterns that cover two languages. We show that developing more universal hyphenation patterns is feasible and allows for significant quality improvements and space savings. We evaluate the new approach and the new Czechoslovak hyphenation patterns.
