Demystifying Embedding Spaces using Large Language Models

Guy Tennenholtz; Yinlam Chow; Chih-Wei Hsu; Jihwan Jeong; Lior Shani; Azamat Tulepbergenov; Deepak Ramachandran; Martin Mladenov; Craig Boutilier

Computation and Language; Artificial Intelligence; Machine Learning (v2, 2024-03-14)

Abstract

Embeddings have become a pivotal means to represent complex, multi-faceted information about entities, concepts, and relationships in a condensed and useful format. Nevertheless, they often preclude direct interpretation. While downstream tasks make use of these compressed representations, meaningful interpretation usually requires visualization using dimensionality reduction or specialized machine learning interpretability methods. This paper addresses the challenge of making such embeddings more interpretable and broadly useful, by employing Large Language Models (LLMs) to directly interact with embeddings -- transforming abstract vectors into understandable narratives. By injecting embeddings into LLMs, we enable querying and exploration of complex embedding data. We demonstrate our approach on a variety of diverse tasks, including: enhancing concept activation vectors (CAVs), communicating novel embedded entities, and decoding user preferences in recommender systems. Our work couples the immense information potential of embeddings with the interpretative power of LLMs.

Keywords

word embeddings; large language model; language modeling

Cite

@article{arxiv.2310.04475,
  title  = {Demystifying Embedding Spaces using Large Language Models},
  author = {Guy Tennenholtz and Yinlam Chow and Chih-Wei Hsu and Jihwan Jeong and Lior Shani and Azamat Tulepbergenov and Deepak Ramachandran and Martin Mladenov and Craig Boutilier},
  journal = {arXiv preprint arXiv:2310.04475},
  year   = {2024}
}

Comments

Accepted to ICLR 2024
