Morphological Word Embeddings
Computation and Language
2019-07-05 v1
Abstract
Linguistic similarity is multi-faceted. For instance, two words may be similar with respect to semantics, syntax, or morphology, inter alia. Continuous word embeddings have been shown to capture most of these shades of similarity to some degree. This work considers guiding word embeddings with morphologically annotated data, a form of semi-supervised learning, so that the vectors encode a word's morphology, i.e., words that are close in the embedding space share morphological features. We extend the log-bilinear model to this end and show, using German as a case study, that the learned embeddings do capture morphology in this way.
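The idea of extending a log-bilinear model with morphological supervision can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a standard log-bilinear language model (a predicted vector formed from position-specific transforms of the context embeddings) and adds a second softmax over morphological tags that shares the same predicted vector. The class name `MorphLBL`, the tag-embedding matrix `S`, the separate word and tag softmaxes, and the choice to add the tag term only when an annotation is available are all assumptions made for illustration.

```python
import numpy as np


def softmax(z):
    """Numerically stable softmax over a 1-D score vector."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()


class MorphLBL:
    """Toy log-bilinear model with an extra morphological-tag output.

    Illustrative only: parameter names and the factorised word/tag
    softmaxes are assumptions, not taken from the paper.
    """

    def __init__(self, vocab_size, num_tags, dim, context_size, seed=0):
        rng = np.random.default_rng(seed)
        # Input (context) and output word embeddings.
        self.R = rng.normal(scale=0.1, size=(vocab_size, dim))
        self.Q = rng.normal(scale=0.1, size=(vocab_size, dim))
        # Hypothetical embeddings for morphological tags.
        self.S = rng.normal(scale=0.1, size=(num_tags, dim))
        # One transformation matrix per context position.
        self.C = rng.normal(scale=0.1, size=(context_size, dim, dim))
        self.b_word = np.zeros(vocab_size)
        self.b_tag = np.zeros(num_tags)

    def predict_vector(self, context_ids):
        """Combine the context embeddings into one predicted vector."""
        return sum(self.C[i] @ self.R[w] for i, w in enumerate(context_ids))

    def loss(self, context_ids, word_id, tag_id=None):
        """Negative log-likelihood of the next word, plus the tag if annotated."""
        p = self.predict_vector(context_ids)
        word_probs = softmax(self.Q @ p + self.b_word)
        nll = -np.log(word_probs[word_id])
        if tag_id is not None:
            # Assumed semi-supervised setup: the tag term is added
            # only for tokens that carry a morphological annotation.
            tag_probs = softmax(self.S @ p + self.b_tag)
            nll += -np.log(tag_probs[tag_id])
        return float(nll)


model = MorphLBL(vocab_size=10, num_tags=4, dim=8, context_size=2)
unlabelled_loss = model.loss([1, 2], word_id=3)
labelled_loss = model.loss([1, 2], word_id=3, tag_id=1)
```

Because the tag scores are computed from the same predicted vector as the word scores, gradients from the tag term would push contexts that precede morphologically similar words toward similar representations, which is the intuition behind the embeddings encoding morphology.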
Cite
@article{arxiv.1907.02423,
title = {Morphological Word Embeddings},
author = {Ryan Cotterell and Hinrich Schütze},
journal = {arXiv preprint arXiv:1907.02423},
year = {2019}
}
Comments
Published at NAACL 2015