Related papers: Unsupervised Contextualized Document Representatio…

200 papers

Efficient representation of text documents is an important building block in many NLP tasks. Research on long text categorization has shown that simple weighted averaging of word vectors for sentence representation often outperforms more…

Computation and Language · Computer Science 2019-11-20 Vivek Gupta, Ankit Saw, Pegah Nokhiz, Harshit Gupta, Partha Talukdar

We present a feature vector formation technique for documents - Sparse Composite Document Vector (SCDV) - which overcomes several shortcomings of the current distributional paragraph vector representations that are widely used for text…

Computation and Language · Computer Science 2017-05-15 Dheeraj Mekala, Vivek Gupta, Bhargavi Paranjape, Harish Karnick

Contextualized word embeddings (CWE) such as provided by ELMo (Peters et al., 2018), Flair NLP (Akbik et al., 2018), or BERT (Devlin et al., 2019) are a major recent innovation in NLP. CWEs provide semantic vector representations of words…

Computation and Language · Computer Science 2019-10-02 Gregor Wiedemann, Steffen Remus, Avi Chawla, Chris Biemann

Dense vector representations for textual data are crucial in modern NLP. Word embeddings and sentence embeddings estimated from raw texts are key in achieving state-of-the-art results in various tasks requiring semantic understanding.…

Computation and Language · Computer Science 2023-07-06 Sonal Sannigrahi, Josef van Genabith, Cristina Espana-Bonet

BERT is inefficient for sentence-pair tasks such as clustering or semantic search as it needs to evaluate combinatorially many sentence pairs which is very time-consuming. Sentence BERT (SBERT) attempted to solve this challenge by learning…

Computation and Language · Computer Science 2021-02-08 Yan Zhang, Ruidan He, Zuozhu Liu, Kwan Hui Lim, Lidong Bing

Tremendous amounts of multimedia associated with speech information are driving an urgent need to develop efficient and effective automatic summarization methods. To this end, we have seen rapid progress in applying supervised deep neural…

Computation and Language · Computer Science 2020-06-03 Shi-Yan Weng, Tien-Hong Lo, Berlin Chen

Contextualized word representations, such as ELMo and BERT, were shown to perform well on various semantic and syntactic tasks. In this work, we tackle the task of unsupervised disentanglement between semantics and structure in neural…

Computation and Language · Computer Science 2021-03-15 Shauli Ravfogel, Yanai Elazar, Jacob Goldberger, Yoav Goldberg

Contextualized embeddings such as BERT can serve as strong input representations to NLP tasks, outperforming their static embeddings counterparts such as skip-gram, CBOW and GloVe. However, such embeddings are dynamic, calculated according…

Computation and Language · Computer Science 2020-04-07 Yile Wang, Leyang Cui, Yue Zhang

Most unsupervised NLP models represent each word with a single point or single region in semantic space, while the existing multi-sense word embeddings cannot represent longer word sequences like phrases or sentences. We propose a novel…

Computation and Language · Computer Science 2021-12-30 Haw-Shiuan Chang, Amol Agrawal, Andrew McCallum

We experiment with two recent contextualized word embedding methods (ELMo and BERT) in the context of open-domain argument search. For the first time, we show how to leverage the power of contextualized word embeddings to classify and…

Computation and Language · Computer Science 2019-06-25 Nils Reimers, Benjamin Schiller, Tilman Beck, Johannes Daxenberger, Christian Stab, Iryna Gurevych

Sentence embeddings encode sentences in fixed dense vectors and have played an important role in various NLP tasks and systems. Methods for building sentence embeddings include unsupervised learning such as Quick-Thoughts and supervised…

Computation and Language · Computer Science 2021-06-10 Danqi Liao

Learning high-quality sentence representations benefits a wide range of natural language processing tasks. Though BERT-based pre-trained language models achieve high performance on many downstream tasks, the native derived sentence…

Computation and Language · Computer Science 2021-05-26 Yuanmeng Yan, Rumei Li, Sirui Wang, Fuzheng Zhang, Wei Wu, Weiran Xu

Recent work incorporates pre-trained word embeddings such as BERT embeddings into Neural Topic Models (NTMs), generating highly coherent topics. However, with high-quality contextualized document representations, do we really need…

Computation and Language · Computer Science 2022-04-22 Zihan Zhang, Meng Fang, Ling Chen, Mohammad-Reza Namazi-Rad

While supervised learning models have shown remarkable performance in various natural language processing (NLP) tasks, their success heavily relies on the availability of large-scale labeled datasets, which can be costly and time-consuming…

Computation and Language · Computer Science 2024-06-04 Wrick Talukdar, Anjanava Biswas

In recent years BERT shows apparent advantages and great potential in natural language processing tasks. However, both training and applying BERT requires intensive time and resources for computing contextual language representations, which…

Computation and Language · Computer Science 2021-11-05 Tan Huang

Distributed representations of words, better known as word embeddings, have become important building blocks for natural language processing tasks. Numerous studies are devoted to transferring the success of unsupervised word embeddings to…

Computation and Language · Computer Science 2018-11-28 Tianlin Liu, João Sedoc, Lyle Ungar

Sentence embedding is an important research topic in natural language processing (NLP) since it can transfer knowledge to downstream tasks. Meanwhile, a contextualized word representation, called BERT, achieves the state-of-the-art…

Computation and Language · Computer Science 2020-06-02 Bin Wang, C.-C. Jay Kuo

Neural networks provide new possibilities to automatically learn complex language patterns and query-document relations. Neural IR models have achieved promising results in learning query-document relevance patterns, but few explorations…

Information Retrieval · Computer Science 2019-05-23 Zhuyun Dai, Jamie Callan

In this paper, we propose a novel approach for text classification based on clustering word embeddings, inspired by the bag of visual words model, which is widely used in computer vision. After each word in a collection of documents is…

Computation and Language · Computer Science 2017-07-26 Andrei M. Butnaru, Radu Tudor Ionescu

The recently proposed BERT has shown great power on a variety of natural language understanding tasks, such as text classification, reading comprehension, etc. However, how to effectively apply BERT to neural machine translation (NMT) lacks…

Computation and Language · Computer Science 2020-02-18 Jinhua Zhu, Yingce Xia, Lijun Wu, Di He, Tao Qin, Wengang Zhou, Houqiang Li, Tie-Yan Liu