Project details


Pattern Recognition-based Statistically Enhanced MT (PRESEMT)

Project Identification:248307
Project Period:1/2010 - 12/2012
Investor:European Union
Programme / Project Type:7th Specific RTD Programme - Cooperation
MU Faculty/Unit:
Faculty of Informatics
MU Investigator:Assoc. Prof. PhDr. Karel Pala, CSc.
Project Team Member:Assoc. Prof. RNDr. Aleš Horák, Ph.D.
Assoc. Prof. Mgr. Pavel Rychlý, Ph.D.
Cooperating Organization:
Institute for Language and Speech Processing
Responsible Person:George Tambouratzis
Gesellschaft zur Förderung angewandter Informatik
Lexical Computing Ltd.
National Technical University of Athens
Norwegian University of Science and Technology

This proposal describes PRESEMT, a flexible and adaptable machine translation (MT) system based on a language-independent method whose principles ensure easy portability to new language pairs. The method attempts to overcome well-known problems of other MT approaches, such as the need to compile bilingual corpora or to create new rules for each language pair. PRESEMT will address the issue of effectively managing multilingual content and is expected to suggest a language-independent, machine-learning-based methodology. Its key aspects are syntactic phrase-based modelling; pattern recognition approaches (such as extended clustering or neural networks) or game theory techniques for developing a language-independent analysis; and evolutionary algorithms for system optimisation. The system is intended to be hybrid, combining linguistic processing with the positive aspects of corpus-based approaches such as statistical MT (SMT) and example-based MT (EBMT).