Related papers: Are Classes Clusters?

200 papers

Sentence embedding methods offer a powerful approach for working with short textual constructs or sequences of words. By representing sentences as dense numerical vectors, many natural language processing (NLP) applications have improved…

Computation and Language · Computer Science 2021-10-05 Yuan An , Alexander Kalinowski , Jane Greenberg

Recent work incorporates pre-trained word embeddings such as BERT embeddings into Neural Topic Models (NTMs), generating highly coherent topics. However, with high-quality contextualized document representations, do we really need…

Computation and Language · Computer Science 2022-04-22 Zihan Zhang , Meng Fang , Ling Chen , Mohammad-Reza Namazi-Rad

We present models for encoding sentences into embedding vectors that specifically target transfer learning to other NLP tasks. The models are efficient and result in accurate performance on diverse transfer tasks. Two variants of the…

Many of the kinds of language model used in speech understanding suffer from imperfect modeling of intra-sentential contextual influences. I argue that this problem can be addressed by clustering the sentences in a training corpus…

cmp-lg · Computer Science 2008-02-03 David Carter

Sentence embeddings are an important component of many natural language processing (NLP) systems. Like word embeddings, sentence embeddings are typically learned on large text corpora and then transferred to various downstream tasks, such…

Computation and Language · Computer Science 2021-05-28 John Giorgi , Osvald Nitski , Bo Wang , Gary Bader

Topic models are a useful analysis tool to uncover the underlying themes within document collections. The dominant approach is to use probabilistic topic models that posit a generative story, but in this paper we propose an alternative way…

Computation and Language · Computer Science 2020-10-08 Suzanna Sia , Ayush Dalmia , Sabrina J. Mielke

Large Language Models (LLMs) have recently shown remarkable advancement in various NLP tasks. As such, a popular trend has emerged lately where NLP researchers extract word/sentence/document embeddings from these large decoder-only models…

Computation and Language · Computer Science 2025-03-04 Yash Mahajan , Matthew Freestone , Sathyanarayanan Aakur , Santu Karmaker

Large Language Models (LLMs) have recently shown remarkable advancement in various NLP tasks. As such, a popular trend has emerged lately where NLP researchers extract word/sentence/document embeddings from these large decoder-only models…

Computation and Language · Computer Science 2025-03-04 Yash Mahajan , Matthew Freestone , Naman Bansal , Sathyanarayanan Aakur , Shubhra Kanti Karmaker Santu

Scientific articles are long text documents organized into sections, each describing aspects of the research. Analyzing scientific production has become progressively challenging due to the increase in the number of available articles.…

Computation and Language · Computer Science 2024-04-02 Gustavo Bartz Guedes , Ana Estela Antunes da Silva

Text embeddings are numerical representations of text data, where words, phrases, or entire documents are converted into vectors of real numbers. These embeddings capture semantic meanings and relationships between text elements in a…

Information Retrieval · Computer Science 2025-01-20 Fusheng Wei , Robert Neary , Han Qin , Qiang Mao , Jianping Zhang

We present a clustering-based language model using word embeddings for text readability prediction. Presumably, a Euclidean semantic space hypothesis holds true for word embeddings whose training is done by observing word co-occurrences.…

Computation and Language · Computer Science 2017-09-07 Miriam Cha , Youngjune Gwon , H. T. Kung

Most unsupervised NLP models represent each word with a single point or single region in semantic space, while the existing multi-sense word embeddings cannot represent longer word sequences like phrases or sentences. We propose a novel…

Computation and Language · Computer Science 2021-12-30 Haw-Shiuan Chang , Amol Agrawal , Andrew McCallum

Sentence embedding techniques aim to encode key concepts of a sentence's meaning in a vector space. However, the majority of evaluation approaches for sentence embedding quality rely on the use of additional classifiers or downstream tasks.…

Computation and Language · Computer Science 2026-04-24 Paul Keuren , Marc Ponsen , Robert Ayoub Bagheri

We provide the first exploration of sentence embeddings from text-to-text transformers (T5). Sentence embeddings are broadly useful for language processing tasks. While T5 achieves impressive performance on language tasks cast as…

Computation and Language · Computer Science 2021-12-15 Jianmo Ni , Gustavo Hernández Ábrego , Noah Constant , Ji Ma , Keith B. Hall , Daniel Cer , Yinfei Yang

There have been many successful applications of sentence embedding methods. However, it has not been well understood what properties are captured in the resulting sentence embeddings depending on the supervision signals. In this paper, we…

Computation and Language · Computer Science 2022-06-13 Hayato Tsukagoshi , Ryohei Sasano , Koichi Takeda

Discovering the logical sequence of events is one of the cornerstones in Natural Language Understanding. One approach to learn the sequence of events is to study the order of sentences in a coherent text. Sentence ordering can be applied in…

Computation and Language · Computer Science 2021-08-26 Melika Golestani , Seyedeh Zahra Razavi , Zeinab Borhanifard , Farnaz Tahmasebian , Hesham Faili

This paper has two parts. In the first part we discuss word embeddings. We discuss the need for them, some of the methods to create them, and some of their interesting properties. We also compare them to image embeddings and see how word…

Machine Learning · Computer Science 2016-10-27 Amit Mandelbaum , Adi Shalev

Automatic text classification (TC) research can be used for real-world problems such as the classification of in-patient discharge summaries and medical text reports, which is beneficial to make medical documents more understandable to…

Computation and Language · Computer Science 2018-12-06 Ying Shen , Qiang Zhang , Jin Zhang , Jiyue Huang , Yuming Lu , Kai Lei

Various NLP problems -- such as the prediction of sentence similarity, entailment, and discourse relations -- are all instances of the same general task: the modeling of semantic relations between a pair of textual elements. A popular model…

Computation and Language · Computer Science 2019-04-05 Damien Sileo , Tim Van-De-Cruys , Camille Pradel , Philippe Muller

We experiment with two recent contextualized word embedding methods (ELMo and BERT) in the context of open-domain argument search. For the first time, we show how to leverage the power of contextualized word embeddings to classify and…

Computation and Language · Computer Science 2019-06-25 Nils Reimers , Benjamin Schiller , Tilman Beck , Johannes Daxenberger , Christian Stab , Iryna Gurevych