Related papers: Reconstructing Maps from Text

Improving Sparse Word Representations with Distributional Inference for Semantic Composition

Distributional models are derived from co-occurrences in a corpus, where only a small proportion of all possible plausible co-occurrences will be observed. This results in a very sparse vector space, requiring a mechanism for inferring…

Computation and Language · Computer Science 2016-08-25 Thomas Kober , Julie Weeds , Jeremy Reffin , David Weir

World Properties without World Models: Recovering Spatial and Temporal Structure from Co-occurrence Statistics in Static Word Embeddings

Recent work interprets the linear recoverability of geographic and temporal variables from large language model (LLM) hidden states as evidence for world-like internal representations. We test a simpler possibility: that much of the…

Computation and Language · Computer Science 2026-03-05 Elan Barenholtz

Cross-topic distributional semantic representations via unsupervised mappings

In traditional Distributional Semantic Models (DSMs) the multiple senses of a polysemous word are conflated into a single vector space representation. In this work, we propose a DSM that learns multiple distributional representations of a…

Computation and Language · Computer Science 2019-04-12 Eleftheria Briakou , Nikos Athanasiou , Alexandros Potamianos

Can Network Embedding of Distributional Thesaurus be Combined with Word Vectors for Better Representation?

Distributed representations of words learned from text have proved to be successful in various natural language processing tasks in recent times. While some methods represent words as vectors computed from text using predictive model…

Computation and Language · Computer Science 2018-02-20 Abhik Jana , Pawan Goyal

DMAP: A Distribution Map for Text

Large Language Models (LLMs) are a powerful tool for statistical text analysis, with derived sequences of next-token probability distributions offering a wealth of information. Extracting this signal typically relies on metrics such as…

Computation and Language · Computer Science 2026-05-15 Tom Kempton , Julia Rozanova , Parameswaran Kamalaruban , Maeve Madigan , Karolina Wresilo , Yoann L. Launay , David Sutton , Stuart Burrell

Definition Modeling: Learning to define word embeddings in natural language

Distributed representations of words have been shown to capture lexical semantics, as demonstrated by their effectiveness in word similarity and analogical relation tasks. But, these tasks only evaluate lexical semantics indirectly. In this…

Computation and Language · Computer Science 2016-12-02 Thanapon Noraset , Chen Liang , Larry Birnbaum , Doug Downey

Multilingual Models for Compositional Distributed Semantics

We present a novel technique for learning semantic representations, which extends the distributional hypothesis to multilingual data and joint-space embeddings. Our models leverage parallel data and learn to strongly align the embeddings of…

Computation and Language · Computer Science 2014-04-21 Karl Moritz Hermann , Phil Blunsom

Rehabilitation of Count-based Models for Word Vector Representations

Recent works on word representations mostly rely on predictive models. Distributed word representations (aka word embeddings) are trained to optimally predict the contexts in which the corresponding words tend to appear. Such models have…

Computation and Language · Computer Science 2015-04-10 Rémi Lebret , Ronan Collobert

Multilingual Distributed Representations without Word Alignment

Distributed representations of meaning are a natural way to encode covariance relationships between words and phrases in NLP. By overcoming data sparsity problems, as well as providing information about semantic relatedness which is not…

Computation and Language · Computer Science 2014-03-21 Karl Moritz Hermann , Phil Blunsom

A Generative Model of Words and Relationships from Multiple Sources

Neural language models are a powerful tool to embed words into semantic vector spaces. However, learning such models generally relies on the availability of abundant and diverse training examples. In highly specialised domains this…

Computation and Language · Computer Science 2015-12-04 Stephanie L. Hyland , Theofanis Karaletsos , Gunnar Rätsch

Semantic projection: recovering human knowledge of multiple, distinct object features from word embeddings

The words of a language reflect the structure of the human mind, allowing us to transmit thoughts between individuals. However, language can represent only a subset of our rich and detailed cognitive architecture. Here, we ask what kinds of…

Computation and Language · Computer Science 2018-03-07 Gabriel Grand , Idan Asher Blank , Francisco Pereira , Evelina Fedorenko

Using Sparse Semantic Embeddings Learned from Multimodal Text and Image Data to Model Human Conceptual Knowledge

Distributional models provide a convenient way to model semantics using dense embedding spaces derived from unsupervised learning algorithms. However, the dimensions of dense embedding spaces are not designed to resemble human semantic…

Computation and Language · Computer Science 2018-11-15 Steven Derby , Paul Miller , Brian Murphy , Barry Devereux

A Top-down Graph-based Tool for Modeling Classical Semantic Maps: A Crosslinguistic Case Study of Supplementary Adverbs

Semantic map models (SMMs) construct a network-like conceptual space from cross-linguistic instances or forms, based on the connectivity hypothesis. This approach has been widely used to represent similarity and entailment relationships in…

Computation and Language · Computer Science 2025-04-01 Zhu Liu , Cunliang Kong , Ying Liu , Maosong Sun

Joint Learning of Distributed Representations for Images and Texts

This technical report provides extra details of the deep multimodal similarity model (DMSM) which was proposed in (Fang et al. 2015, arXiv:1411.4952). The model is trained via maximizing global semantic similarity between images and their…

Computer Vision and Pattern Recognition · Computer Science 2015-04-29 Xiaodong He , Rupesh Srivastava , Jianfeng Gao , Li Deng

LMMS Reloaded: Transformer-based Sense Embeddings for Disambiguation and Beyond

Distributional semantics based on neural approaches is a cornerstone of Natural Language Processing, with surprising connections to human meaning representation as well. Recent Transformer-based Language Models have proven capable of…

Computation and Language · Computer Science 2022-04-04 Daniel Loureiro , Alípio Mário Jorge , Jose Camacho-Collados

Contextually Propagated Term Weights for Document Representation

Word embeddings predict a word from its neighbours by learning small, dense embedding vectors. In practice, this prediction corresponds to a semantic score given to the predicted word (or term weight). We present a novel model that, given a…

Information Retrieval · Computer Science 2019-06-04 Casper Hansen , Christian Hansen , Stephen Alstrup , Jakob Grue Simonsen , Christina Lioma

Low-dimensional Semantic Space: from Text to Word Embedding

This article focuses on the study of Word Embedding, a feature-learning technique in Natural Language Processing that maps words or phrases to low-dimensional vectors. Beginning with the linguistic theories concerning contextual…

Computation and Language · Computer Science 2019-11-05 Xiaolei Lu , Bin Ni

Distributional Semantics Tracing: A Framework for Explaining Hallucinations in Large Language Models

Hallucinations in large language models (LLMs) produce fluent continuations that are not supported by the prompt, especially under minimal contextual cues and ambiguity. We introduce Distributional Semantics Tracing (DST), a model-native…

Computation and Language · Computer Science 2026-03-17 Gagan Bhatia , Somayajulu G Sripada , Kevin Allan , Jacobo Azcona

Tag Map: A Text-Based Map for Spatial Reasoning and Navigation with Large Language Models

Large Language Models (LLM) have emerged as a tool for robots to generate task plans using common sense reasoning. For the LLM to generate actionable plans, scene context must be provided, often through a map. Recent works have shifted from…

Robotics · Computer Science 2024-09-25 Mike Zhang , Kaixian Qu , Vaishakh Patil , Cesar Cadena , Marco Hutter

Distributed Representations for Compositional Semantics

The mathematical representation of semantics is a key issue for Natural Language Processing (NLP). A lot of research has been devoted to finding ways of representing the semantics of individual words in vector spaces. Distributional…

Computation and Language · Computer Science 2014-11-13 Karl Moritz Hermann