European Union Language Resources in Sketch Engine
|Year of publication
|Article in Proceedings
|Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016)
|MU Faculty or unit
|JRC-Acquis; DCEP; DGT-TM; Europarl; EUR-Lex; Sketch Engine; parallel corpus; word sketch; parallel concordance
|Several parallel corpora built from European Union language resources are presented here. They were processed by state-of-the-art tools and made available for researchers in the Sketch Engine corpus management system. A completely new resource is introduced: EUR-Lex corpus, being one of the largest parallel corpus available at the moment, containing 840 million tokens of English and having the largest language pair (English-French) with more than 25 million aligned segments (paragraphs).