Related papers: Spherical Text Embedding

Generalised Spherical Text Embedding

This paper aims to provide an unsupervised modelling approach that allows for a more flexible representation of text embeddings. It jointly encodes the words and the paragraphs as individual matrices of arbitrary column dimension with unit…

Computation and Language · Computer Science 2022-12-01 Souvik Banerjee, Bamdev Mishra, Pratik Jawanpuria, Manish Shrivastava

Learning Sentence Embeddings for Coherence Modelling and Beyond

We present a novel and effective technique for performing text coherence tasks while facilitating deeper insights into the data. Despite obtaining ever-increasing task performance, modern deep-learning approaches to NLP tasks often only…

Computation and Language · Computer Science 2019-08-09 Tanner Bohn, Yining Hu, Jinhang Zhang, Charles X. Ling

Synergizing Unsupervised and Supervised Learning: A Hybrid Approach for Accurate Natural Language Task Modeling

While supervised learning models have shown remarkable performance in various natural language processing (NLP) tasks, their success heavily relies on the availability of large-scale labeled datasets, which can be costly and time-consuming…

Computation and Language · Computer Science 2024-06-04 Wrick Talukdar, Anjanava Biswas

Extending Multi-Sense Word Embedding to Phrases and Sentences for Unsupervised Semantic Applications

Most unsupervised NLP models represent each word with a single point or single region in semantic space, while the existing multi-sense word embeddings cannot represent longer word sequences like phrases or sentences. We propose a novel…

Computation and Language · Computer Science 2021-12-30 Haw-Shiuan Chang, Amol Agrawal, Andrew McCallum

Asynchronous Training of Word Embeddings for Large Text Corpora

Word embeddings are a powerful approach for analyzing language and have been widely popular in numerous tasks in information retrieval and text mining. Training embeddings over huge corpora is computationally expensive because the input is…

Machine Learning · Computer Science 2018-12-11 Avishek Anand, Megha Khosla, Jaspreet Singh, Jan-Hendrik Zab, Zijian Zhang

A Simple Regularization-based Algorithm for Learning Cross-Domain Word Embeddings

Learning word embeddings has received a significant amount of attention recently. Often, word embeddings are learned in an unsupervised manner from a large collection of text. The genre of the text typically plays an important role in the…

Computation and Language · Computer Science 2019-02-04 Wei Yang, Wei Lu, Vincent W. Zheng

Supervised Understanding of Word Embeddings

Pre-trained word embeddings are widely used for transfer learning in natural language processing. The embeddings are continuous and distributed representations of the words that preserve their similarities in compact Euclidean spaces.…

Computation and Language · Computer Science 2020-06-25 Halid Ziya Yerebakan, Parmeet Bhatia, Yoshihisa Shinagawa

Unsupervised Attention-based Sentence-Level Meta-Embeddings from Contextualised Language Models

A variety of contextualised language models have been proposed in the NLP community, which are trained on diverse corpora to produce numerous Neural Language Models (NLMs). However, different NLMs have reported different levels of…

Computation and Language · Computer Science 2022-04-19 Keigo Takahashi, Danushka Bollegala

Unsupervised Learning of Sentence Embeddings using Compositional n-Gram Features

The recent tremendous success of unsupervised word embeddings in a multitude of applications raises the obvious question if similar methods could be derived to improve embeddings (i.e. semantic representations) of word sequences as well. We…

Computation and Language · Computer Science 2018-12-31 Matteo Pagliardini, Prakhar Gupta, Martin Jaggi

The Role of Context Types and Dimensionality in Learning Word Embeddings

We provide the first extensive evaluation of how using different types of context to learn skip-gram word embeddings affects performance on a wide range of intrinsic and extrinsic NLP tasks. Our results suggest that while intrinsic tasks…

Computation and Language · Computer Science 2017-07-20 Oren Melamud, David McClosky, Siddharth Patwardhan, Mohit Bansal

Unsupervised learning of text line segmentation by differentiating coarse patterns

Despite recent advances in the field of supervised deep learning for text line segmentation, unsupervised deep learning solutions are beginning to gain popularity. In this paper, we present an unsupervised deep learning method that embeds…

Computer Vision and Pattern Recognition · Computer Science 2021-05-24 Berat Kurar Barakat, Ahmad Droby, Raid Saabni, Jihad El-Sana

Sentence transition matrix: An efficient approach that preserves sentence semantics

Sentence embedding is a significant research topic in the field of natural language processing (NLP). Generating sentence embedding vectors reflecting the intrinsic meaning of a sentence is a key factor to achieve an enhanced performance in…

Computation and Language · Computer Science 2019-01-17 Myeongjun Jang, Pilsung Kang

Learning Efficient Task-Specific Meta-Embeddings with Word Prisms

Word embeddings are trained to predict word cooccurrence statistics, which leads them to possess different lexical properties (syntactic, semantic, etc.) depending on the notion of context defined at training time. These properties manifest…

Computation and Language · Computer Science 2020-11-06 Jingyi He, KC Tsiolis, Kian Kenyon-Dean, Jackie Chi Kit Cheung

Unsupervised Cross-lingual Transfer of Word Embedding Spaces

Cross-lingual transfer of word embeddings aims to establish the semantic mappings among words in different languages by learning the transformation functions over the corresponding word embedding spaces. Successfully solving this problem…

Computation and Language · Computer Science 2018-09-12 Ruochen Xu, Yiming Yang, Naoki Otani, Yuexin Wu

Optimizing Sentence Embedding with Pseudo-Labeling and Model Ensembles: A Hierarchical Framework for Enhanced NLP Tasks

Sentence embedding tasks are important in natural language processing (NLP), but improving their performance while keeping them reliable is still hard. This paper presents a framework that combines pseudo-label generation and model ensemble…

Computation and Language · Computer Science 2025-01-28 Ziwei Liu, Qi Zhang, Lifu Gao

Supervised Contextual Embeddings for Transfer Learning in Natural Language Processing Tasks

Pre-trained word embeddings are the primary method for transfer learning in several Natural Language Processing (NLP) tasks. Recent works have focused on using unsupervised techniques such as language modeling to obtain these embeddings. In…

Computation and Language · Computer Science 2019-07-01 Mihir Kale, Aditya Siddhant, Sreyashi Nag, Radhika Parik, Matthias Grabmair, Anthony Tomasic

Learning Embeddings into Entropic Wasserstein Spaces

Euclidean embeddings of data are fundamentally limited in their ability to capture latent semantic structures, which need not conform to Euclidean spatial assumptions. Here we consider an alternative, which embeds data as discrete…

Machine Learning · Computer Science 2019-05-10 Charlie Frogner, Farzaneh Mirzazadeh, Justin Solomon

Interpretable Text Embeddings and Text Similarity Explanation: A Survey

Text embeddings are a fundamental component in many NLP tasks, including classification, regression, clustering, and semantic search. However, despite their ubiquitous application, challenges persist in interpreting embeddings and…

Computation and Language · Computer Science 2025-10-03 Juri Opitz, Lucas Möller, Andrianos Michail, Sebastian Padó, Simon Clematide

Language Modeling by Clustering with Word Embeddings for Text Readability Assessment

We present a clustering-based language model using word embeddings for text readability prediction. Presumably, an Euclidean semantic space hypothesis holds true for word embeddings whose training is done by observing word co-occurrences.…

Computation and Language · Computer Science 2017-09-07 Miriam Cha, Youngjune Gwon, H. T. Kung

MCSE: Multimodal Contrastive Learning of Sentence Embeddings

Learning semantically meaningful sentence embeddings is an open problem in natural language processing. In this work, we propose a sentence embedding learning approach that exploits both visual and textual information via a multimodal…

Computation and Language · Computer Science 2022-04-26 Miaoran Zhang, Marius Mosbach, David Ifeoluwa Adelani, Michael A. Hedderich, Dietrich Klakow