Related papers: Structured Embedding Models for Grouped Data

Exponential Family Embeddings

Word embeddings are a powerful approach for capturing semantic similarity among terms in a vocabulary. In this paper, we develop exponential family embeddings, a class of methods that extends the idea of word embeddings to other types of…

Machine Learning · Statistics 2016-11-22 Maja R. Rudolph, Francisco J. R. Ruiz, Stephan Mandt, David M. Blei

Dynamic Bernoulli Embeddings for Language Evolution

Word embeddings are a powerful approach for unsupervised analysis of language. Recently, Rudolph et al. (2016) developed exponential family embeddings, which cast word embeddings in a probabilistic framework. Here, we develop dynamic…

Machine Learning · Statistics 2017-03-24 Maja Rudolph, David Blei

Dialectograms: Machine Learning Differences between Discursive Communities

Word embeddings provide an unsupervised way to understand differences in word usage between discursive communities. A number of recent papers have focused on identifying words that are used differently by two or more communities. But word…

Computation and Language · Computer Science 2023-02-14 Thyge Enggaard, August Lohse, Morten Axel Pedersen, Sune Lehmann

Efficient Sentence Embedding via Semantic Subspace Analysis

A novel sentence embedding method built upon semantic subspace analysis, called semantic subspace sentence embedding (S3E), is proposed in this work. Given the fact that word embeddings can capture semantic relationship while semantically…

Computation and Language · Computer Science 2020-03-05 Bin Wang, Fenxiao Chen, Yuncheng Wang, C.-C. Jay Kuo

Composing and Embedding the Words-as-Classifiers Model of Grounded Semantics

The words-as-classifiers model of grounded lexical semantics learns a semantic fitness score between physical entities and the words that are used to denote those entities. In this paper, we explore how such a model can incrementally…

Computation and Language · Computer Science 2019-11-11 Daniele Moro, Stacy Black, Casey Kennington

Category Enhanced Word Embedding

Distributed word representations have been demonstrated to be effective in capturing semantic and syntactic regularities. Unsupervised representation learning from large unlabeled corpora can learn similar representations for those words…

Computation and Language · Computer Science 2015-12-01 Chunting Zhou, Chonglin Sun, Zhiyuan Liu, Francis C. M. Lau

Learning Meta-Embeddings by Using Ensembles of Embedding Sets

Word embeddings (distributed representations of words) in deep learning are beneficial for many tasks in natural language processing (NLP). However, different embedding sets vary greatly in quality and characteristics of the captured…

Computation and Language · Computer Science 2015-12-31 Wenpeng Yin, Hinrich Schütze

EEF: Exponentially Embedded Families with Class-Specific Features for Classification

In this letter, we present a novel exponentially embedded families (EEF) based classification method, in which the probability density function (PDF) on raw data is estimated from the PDF on features. With the PDF construction, we show that…

Machine Learning · Statistics 2016-08-24 Bo Tang, Steven Kay, Haibo He, Paul M. Baggenstoss

Compositional Demographic Word Embeddings

Word embeddings are usually derived from corpora containing text from many individuals, thus leading to general purpose representations rather than individually personalized representations. While personalized embeddings can be useful to…

Computation and Language · Computer Science 2020-11-22 Charles Welch, Jonathan K. Kummerfeld, Verónica Pérez-Rosas, Rada Mihalcea

Mixed Membership Word Embeddings for Computational Social Science

Word embeddings improve the performance of NLP systems by revealing the hidden structural relationships between words. Despite their success in many applications, word embeddings have seen very little use in computational social science NLP…

Computation and Language · Computer Science 2018-02-21 James Foulds

Novel Word Embedding and Translation-based Language Modeling for Extractive Speech Summarization

Word embedding methods revolve around learning continuous distributed vector representations of words with neural networks, which can capture semantic and/or syntactic cues, and in turn be used to induce similarity measures among words,…

Computation and Language · Computer Science 2016-07-25 Kuan-Yu Chen, Shih-Hung Liu, Berlin Chen, Hsin-Min Wang, Hsin-Hsi Chen

An Exploratory Study on Utilising the Web of Linked Data for Product Data Mining

The Linked Open Data practice has led to a significant growth of structured data on the Web in the last decade. Such structured data describe real-world entities in a machine-readable way, and have created an unprecedented opportunity for…

Computation and Language · Computer Science 2022-06-27 Ziqi Zhang, Xingyi Song

Exponential Family Attention

The self-attention mechanism is the backbone of the transformer neural network underlying most large language models. It can capture complex word patterns and long-range dependencies in natural language. This paper introduces exponential…

Machine Learning · Statistics 2025-01-29 Kevin Christian Wibisono, Yixin Wang

Bio-inspired Structure Identification in Language Embeddings

Word embeddings are a popular way to improve downstream performances in contemporary language modeling. However, the underlying geometric structure of the embedding space is not well understood. We present a series of explorations using…

Computation and Language · Computer Science 2020-09-17 Hongwei Zhou, Oskar Elek, Pranav Anand, Angus G. Forbes

FRAGE: Frequency-Agnostic Word Representation

Continuous word representation (aka word embedding) is a basic building block in many neural network-based models used in natural language processing tasks. Although it is widely accepted that words with similar semantics should be close to…

Computation and Language · Computer Science 2020-03-18 Chengyue Gong, Di He, Xu Tan, Tao Qin, Liwei Wang, Tie-Yan Liu

A Primer on Word Embeddings: AI Techniques for Text Analysis in Social Work

Word embeddings represent a transformative technology for analyzing text data in social work research, offering sophisticated tools for understanding case notes, policy documents, research literature, and other text-based materials. This…

Computation and Language · Computer Science 2024-11-12 Brian E. Perron, Kelley A. Rivenburgh, Bryan G. Victor, Zia Qi, Hui Luan

Morphological Priors for Probabilistic Neural Word Embeddings

Word embeddings allow natural language processing systems to share statistical information across related words. These embeddings are typically based on distributional statistics, making it difficult for them to generalize to rare or unseen…

Computation and Language · Computer Science 2016-09-27 Parminder Bhatia, Robert Guthrie, Jacob Eisenstein

Compass-aligned Distributional Embeddings for Studying Semantic Differences across Corpora

Word2vec is one of the most used algorithms to generate word embeddings because of a good mix of efficiency, quality of the generated representations and cognitive grounding. However, word meaning is not static and depends on the context in…

Artificial Intelligence · Computer Science 2020-04-15 Federico Bianchi, Valerio Di Carlo, Paolo Nicoli, Matteo Palmonari

Structural Embedding of Syntactic Trees for Machine Comprehension

Deep neural networks for machine comprehension typically utilize only word or character embeddings without explicitly taking advantage of structured linguistic information such as constituency trees and dependency trees. In this paper, we…

Computation and Language · Computer Science 2017-09-04 Rui Liu, Junjie Hu, Wei Wei, Zi Yang, Eric Nyberg

Analyzing autoencoder-based acoustic word embeddings

Recent studies have introduced methods for learning acoustic word embeddings (AWEs), fixed-size vector representations of words which encode their acoustic features. Despite the widespread use of AWEs in speech processing research, they…

Computation and Language · Computer Science 2020-04-06 Yevgen Matusevych, Herman Kamper, Sharon Goldwater