Encoding Prior Knowledge with Eigenword Embeddings

Computation and Language 2016-07-28 v3

Abstract

Canonical correlation analysis (CCA) is a method for reducing the dimension of data represented using two views. It has been previously used to derive word embeddings, where one view indicates a word, and the other view indicates its context. We describe a way to incorporate prior knowledge into CCA, give a theoretical justification for it, and test it by deriving word embeddings and evaluating them on a myriad of datasets.
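As a rough illustration of the two-view setup described above, the following sketch derives toy embeddings with plain CCA, where one view is the word and the other is its neighbouring context word. It does not implement the paper's method for incorporating prior knowledge. The function name `cca_word_embeddings`, the toy corpus, the window size, and the diagonal approximation of each view's covariance by marginal counts are all assumptions made for illustration.

```python
import numpy as np

def cca_word_embeddings(tokens, k=2, window=1):
    """Toy CCA-style embeddings from a word view and a context view.

    Hypothetical helper for illustration; assumes at least two tokens.
    """
    vocab = sorted(set(tokens))
    index = {w: i for i, w in enumerate(vocab)}
    v = len(vocab)

    # Cross-view co-occurrence counts: rows = word, columns = context word.
    cooc = np.zeros((v, v))
    for i, w in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                cooc[index[w], index[tokens[j]]] += 1.0

    # Assumption: approximate each view's covariance by its diagonal,
    # i.e. the marginal counts of the co-occurrence matrix.
    word_counts = cooc.sum(axis=1)
    ctx_counts = cooc.sum(axis=0)

    # Scale by the inverse square roots of the marginals, then take the SVD.
    scaled = cooc / np.sqrt(np.outer(word_counts, ctx_counts))
    u, s, _ = np.linalg.svd(scaled)

    # Top-k left singular vectors serve as k-dimensional word embeddings.
    return vocab, u[:, :k], s[:k]

tokens = "the cat sat on the mat the dog sat on the rug".split()
vocab, embeddings, singular_values = cca_word_embeddings(tokens, k=2)
for word, vector in zip(vocab, embeddings):
    print(word, vector)
```

Each row of `embeddings` is a low-dimensional vector for one vocabulary word; on a real corpus the co-occurrence matrix would be sparse and a truncated SVD would be the more practical choice.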

Cite

@article{arxiv.1509.01007,
  title  = {Encoding Prior Knowledge with Eigenword Embeddings},
  author = {Dominique Osborne and Shashi Narayan and Shay B. Cohen},
  journal= {arXiv preprint arXiv:1509.01007},
  year   = {2016}
}

Comments

In Transactions of the Association for Computational Linguistics (TACL), 2016
