Related papers: High-dimensional distributed semantic spaces for u…
The mathematical representation of semantics is a key issue for Natural Language Processing (NLP). A lot of research has been devoted to finding ways of representing the semantics of individual words in vector spaces. Distributional…
Modern neural networks (NNs), trained on extensive raw sentence data, construct distributed representations by compressing individual words into dense, continuous, high-dimensional vectors. These representations are expected to capture…
Distributed linguistic representations are powerful tools for modelling the uncertainty and complexity of preference information in linguistic decision making. To provide a comprehensive perspective on the development of distributed…
High-dimensional representations for words, text, images, knowledge graphs and other structured data are commonly used in different paradigms of machine learning and data mining. These representations have different degrees of…
Distributed representations of meaning are a natural way to encode covariance relationships between words and phrases in NLP. By overcoming data sparsity problems, as well as providing information about semantic relatedness which is not…
We present a novel technique for learning semantic representations, which extends the distributional hypothesis to multilingual data and joint-space embeddings. Our models leverage parallel data and learn to strongly align the embeddings of…
Distributional semantics provides multi-dimensional, graded, empirically induced word representations that successfully capture many aspects of meaning in natural languages, as shown in a large body of work in computational linguistics;…
In the era of intelligent computing, computational progress in text processing is an essential consideration. Many systems have been developed to process text over different languages. Though, there is considerable development, they still…
Distributional models provide a convenient way to model semantics using dense embedding spaces derived from unsupervised learning algorithms. However, the dimensions of dense embedding spaces are not designed to resemble human semantic…
Neural language models are a powerful tool to embed words into semantic vector spaces. However, learning such models generally relies on the availability of abundant and diverse training examples. In highly specialised domains this…
This article focuses on the study of Word Embedding, a feature-learning technique in Natural Language Processing that maps words or phrases to low-dimensional vectors. Beginning with the linguistic theories concerning contextual…
Token representations in high-dimensional latent spaces often exhibit redundancy, limiting computational efficiency and reducing structural coherence across model layers. Hierarchical latent space folding introduces a structured…
In this paper we propose a general framework for learning distributed representations of attributes: characteristics of text whose representations can be jointly learned with word embeddings. Attributes can correspond to document indicators…
In the present paper we show that distributional information is particularly important when considering concept availability under implicit language learning conditions. Based on results from different behavioural experiments we argue that…
A lot of the recent success in natural language processing (NLP) has been driven by distributed vector representations of words trained on large amounts of text in an unsupervised manner. These representations are typically used as general…
Recent works on word representations mostly rely on predictive models. Distributed word representations (aka word embeddings) are trained to optimally predict the contexts in which the corresponding words tend to appear. Such models have…
Discrete and continuous representations of content (e.g., of language or images) have interesting properties to be explored for the understanding of or reasoning with this content by machines. This position paper puts forward our opinion on…
There are two main approaches to the distributed representation of words: low-dimensional deep learning embeddings and high-dimensional distributional models, in which each dimension corresponds to a context word. In this paper, we combine…
This paper presents a deep learning architecture for the semantic decoder component of a Statistical Spoken Dialogue System. In a slot-filling dialogue, the semantic decoder predicts the dialogue act and a set of slot-value pairs from a set…
The representation space of pretrained Language Models (LMs) encodes rich information about words and their relationships (e.g., similarity, hypernymy, polysemy) as well as abstract semantic notions (e.g., intensity). In this paper, we…