Publication details

EDS-MEMBED: Multi-sense embeddings based on enhanced distributional semantic structures via a graph walk

Authors

AYETIRAN Eniafe Festus, SOJKA Petr, NOVOTNÝ Vít

Year of publication 2020
Type Article in a scholarly periodical
Journal / Source Knowledge-Based Systems
MU Faculty / Unit

Faculty of Informatics

Keywords Multi-sense embeddings; Graph walk; Language generation; Distributional semantics; Distributional structures; Word sense disambiguation; Knowledge-based systems; Word similarity; Semantic applications
Description Many language applications require word semantics as a core part of their processing pipeline, either for precise meaning inference or for semantic similarity. Multi-sense embeddings (M-SE) can be exploited to meet this requirement. M-SE seek to represent each word by its distinct senses in order to resolve the conflation of meanings of words as used in different contexts. Previous works usually approach this task by training a model on a large corpus and often ignore the effect and usefulness of the semantic relations offered by lexical resources. However, even with large data, coverage of all possible word senses is still an issue. Also, a considerable share of contextual semantic knowledge is never learned because a huge number of possible distributional semantic structures are never explored. In this paper, we leverage the rich semantic structures in WordNet using a graph-theoretic walk technique over word senses to enhance the quality of multi-sense embeddings. This algorithm composes enriched texts from the original texts. Furthermore, we derive new distributional similarity measures for M-SE from prior ones and adapt these measures to word sense disambiguation (WSD). We report evaluation results on 11 benchmark datasets involving WSD and word similarity tasks and show that, despite the small training data, our method for enhancing distributional semantic structures improves embedding quality. It achieves state-of-the-art performance on some of the datasets.
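The idea of composing enriched texts by walking over word senses can be pictured with a minimal sketch. It is not the authors' algorithm: the sense graph below is a hand-made toy rather than WordNet data, and the names `SENSE_GRAPH`, `random_walk`, and `compose_sentences` are hypothetical. The sketch assumes a uniform random walk, in which each walk yields a sequence of sense labels that could serve as one artificial training sentence.

```python
import random

# Hypothetical toy sense graph (NOT real WordNet data): each key is a sense
# label and each value lists the senses it is semantically related to.
SENSE_GRAPH = {
    "bank.n.01": ["slope.n.01", "river.n.01"],
    "bank.n.02": ["financial_institution.n.01", "deposit.n.01"],
    "slope.n.01": ["bank.n.01"],
    "river.n.01": ["bank.n.01", "water.n.01"],
    "water.n.01": ["river.n.01"],
    "financial_institution.n.01": ["bank.n.02", "money.n.01"],
    "deposit.n.01": ["bank.n.02", "money.n.01"],
    "money.n.01": ["financial_institution.n.01", "deposit.n.01"],
}


def random_walk(graph, start, length, rng):
    """Return a uniform random walk of at most `length` senses from `start`."""
    walk = [start]
    while len(walk) < length:
        neighbours = graph.get(walk[-1], [])
        if not neighbours:  # dead end: stop the walk early
            break
        walk.append(rng.choice(neighbours))
    return walk


def compose_sentences(graph, start_senses, walks_per_sense, length, seed=0):
    """Generate several walks per start sense; each walk is one 'sentence'."""
    rng = random.Random(seed)  # seeded so that runs are reproducible
    return [
        random_walk(graph, sense, length, rng)
        for sense in start_senses
        for _ in range(walks_per_sense)
    ]


corpus = compose_sentences(
    SENSE_GRAPH, ["bank.n.01", "bank.n.02"], walks_per_sense=2, length=5
)
for sentence in corpus:
    print(" ".join(sentence))
```

In this toy graph the two senses of "bank" sit in disconnected components, so walks that start from the river sense never reach finance-related senses. That separation is the kind of sense-specific context a multi-sense embedding model could then be trained on.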