Fusing Vector Space Models for Domain-Specific Applications
Abstract
We address the problem of tuning word embeddings for specific use cases and domains. We propose a new method that automatically combines multiple domain-specific embeddings, selected from a wide range of pre-trained domain-specific embeddings, to improve their combined expressive power. Our approach relies on two key components: 1) a ranking function, based on a new embedding similarity measure, that selects the most relevant embeddings to use for a given domain, and 2) a dimensionality reduction method that combines the selected embeddings into a more compact and efficient encoding that preserves their expressiveness. We empirically show that our method produces effective domain-specific embeddings that consistently improve the performance of state-of-the-art machine learning algorithms on multiple tasks, compared to generic embeddings trained on large text corpora.
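The two components described above (rank, then fuse) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the similarity measure or the reduction method, so this sketch substitutes a correlation between pairwise-cosine matrices for the ranking and plain PCA for the fusion. The function names, the toy data, and the idea of comparing candidates against a small in-domain reference embedding are all assumptions made for illustration.

```python
import numpy as np

def cosine_matrix(E):
    """Pairwise cosine similarities between the rows (words) of E."""
    unit = E / np.linalg.norm(E, axis=1, keepdims=True)
    return unit @ unit.T

def embedding_similarity(A, B):
    """Stand-in similarity (an assumption, not the paper's measure):
    correlation between the word-by-word cosine matrices of A and B.
    Rows of A and B must be aligned to the same vocabulary."""
    iu = np.triu_indices(A.shape[0], k=1)  # each word pair once, no diagonal
    return float(np.corrcoef(cosine_matrix(A)[iu], cosine_matrix(B)[iu])[0, 1])

def rank_embeddings(reference, candidates, top_k):
    """Return the names of the top_k candidates most similar to the reference."""
    scores = {name: embedding_similarity(reference, E)
              for name, E in candidates.items()}
    return sorted(scores, key=scores.get, reverse=True)[:top_k]

def fuse(embeddings, out_dim):
    """Concatenate the embeddings column-wise and project onto the
    first out_dim principal components (PCA computed via SVD)."""
    X = np.hstack(embeddings)
    X = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    return X @ Vt[:out_dim].T

# Toy data: 50 shared words, 8-dimensional vectors (all hypothetical).
rng = np.random.default_rng(0)
n_words, dim = 50, 8
reference = rng.normal(size=(n_words, dim))  # small in-domain embedding
candidates = {
    "near": reference + 0.01 * rng.normal(size=(n_words, dim)),
    "mid": reference + 0.5 * rng.normal(size=(n_words, dim)),
    "far": rng.normal(size=(n_words, dim)),  # unrelated to the reference
}

selected = rank_embeddings(reference, candidates, top_k=2)
fused = fuse([candidates[name] for name in selected], out_dim=dim)
print(selected, fused.shape)
```

In this sketch the fused matrix keeps the dimensionality of a single embedding while drawing on the two selected ones; a real pipeline would first need to align the vocabularies of the pre-trained embeddings.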
Cite
@article{arxiv.1909.02307,
title = {Fusing Vector Space Models for Domain-Specific Applications},
author = {Laura Rettig and Julien Audiffren and Philippe Cudré-Mauroux},
journal = {arXiv preprint arXiv:1909.02307},
year = {2019}
}
Comments
ICTAI 2019