Related papers: Kernelized Bayesian Softmax for Text Generation

Embedding Words and Senses Together via Joint Knowledge-Enhanced Training

Word embeddings are widely used in Natural Language Processing, mainly due to their success in capturing semantic information from massive corpora. However, their creation process does not allow the different meanings of a word to be…

Computation and Language · Computer Science · 2017-06-22 · Massimiliano Mancini, Jose Camacho-Collados, Ignacio Iacobacci, Roberto Navigli

Inducing and Embedding Senses with Scaled Gumbel Softmax

Methods for learning word sense embeddings represent a single word with multiple sense-specific vectors. These methods should not only produce interpretable sense embeddings, but should also learn how to select which sense to use in a given…

Computation and Language · Computer Science · 2019-12-17 · Fenfei Guo, Mohit Iyyer, Jordan Boyd-Graber

Multi-sense Definition Modeling using Word Sense Decompositions

Word embeddings capture syntactic and semantic information about words. Definition modeling aims to make the semantic content in each embedding explicit, by outputting a natural language definition based on the embedding. However, existing…

Computation and Language · Computer Science · 2019-09-23 · Ruimin Zhu, Thanapon Noraset, Alisa Liu, Wenxin Jiang, Doug Downey

Neural Embeddings for Text

We propose a new kind of embedding for natural language text that deeply represents semantic meaning. Standard text embeddings use the outputs from hidden layers of a pretrained language model. In our method, we let a language model learn…

Computation and Language · Computer Science · 2022-11-22 · Oleg Vasilyev, John Bohannon

Towards Efficiently Diversifying Dialogue Generation via Embedding Augmentation

Dialogue generation models face the challenge of producing generic and repetitive responses. Unlike previous augmentation methods that mostly focus on token manipulation and ignore the essential variety within a single sample using hard…

Computation and Language · Computer Science · 2021-03-03 · Yu Cao, Liang Ding, Zhiliang Tian, Meng Fang

Improved Semantic-Aware Network Embedding with Fine-Grained Word Alignment

Network embeddings, which learn low-dimensional representations for each vertex in a large-scale network, have received considerable attention in recent years. For a wide range of applications, vertices in a network are typically…

Computation and Language · Computer Science · 2018-08-30 · Dinghan Shen, Xinyuan Zhang, Ricardo Henao, Lawrence Carin

F^2-Softmax: Diversifying Neural Text Generation via Frequency Factorized Softmax

Despite recent advances in neural text generation, encoding the rich diversity in human language remains elusive. We argue that the sub-optimal text generation is mainly attributable to the imbalanced token distribution, which particularly…

Computation and Language · Computer Science · 2020-10-06 · Byung-Ju Choi, Jimin Hong, David Keetae Park, Sang Wan Lee

SBERT-WK: A Sentence Embedding Method by Dissecting BERT-based Word Models

Sentence embedding is an important research topic in natural language processing (NLP) since it can transfer knowledge to downstream tasks. Meanwhile, a contextualized word representation, called BERT, achieves the state-of-the-art…

Computation and Language · Computer Science · 2020-06-02 · Bin Wang, C.-C. Jay Kuo

Utility of General and Specific Word Embeddings for Classifying Translational Stages of Research

Conventional text classification models make a bag-of-words assumption reducing text into word occurrence counts per document. Recent algorithms such as word2vec are capable of learning semantic meaning and similarity between words in an…

Computation and Language · Computer Science · 2018-07-11 · Vincent Major, Alisa Surkis, Yindalon Aphinyanaphongs

Exploring Kernel Functions in the Softmax Layer for Contextual Word Classification

Prominently used in support vector machines and logistic regressions, kernel functions (kernels) can implicitly map data points into high dimensional spaces and make it easier to learn complex decision boundaries. In this work, by replacing…

Computation and Language · Computer Science · 2019-10-29 · Yingbo Gao, Christian Herold, Weiyue Wang, Hermann Ney

A Neural Generative Model for Joint Learning Topics and Topic-Specific Word Embeddings

We propose a novel generative model to explore both local and global context for joint learning topics and topic-specific word embeddings. In particular, we assume that global latent topics are shared across documents, a word is generated…

Computation and Language · Computer Science · 2020-08-12 · Lixing Zhu, Yulan He, Deyu Zhou

Text Simplification with Sentence Embeddings

Sentence embeddings can be decoded to give approximations of the original texts used to create them. We explore this effect in the context of text simplification, demonstrating that reconstructed text embeddings preserve complexity levels.…

Computation and Language · Computer Science · 2025-10-29 · Matthew Shardlow

Attention with Trained Embeddings Provably Selects Important Tokens

Token embeddings play a crucial role in language modeling but, despite this practical relevance, their theoretical understanding remains limited. Our paper addresses the gap by characterizing the structure of embeddings obtained via…

Machine Learning · Computer Science · 2025-06-26 · Diyuan Wu, Aleksandr Shevchenko, Samet Oymak, Marco Mondelli

Lexical Complexity Controlled Sentence Generation

Text generation rarely considers the control of lexical complexity, which limits its more comprehensive practical application. We introduce a novel task of lexical complexity controlled sentence generation, which aims at keywords to…

Computation and Language · Computer Science · 2022-11-29 · Jinran Nie, Liner Yang, Yun Chen, Cunliang Kong, Junhui Zhu, Erhong Yang

Learning Sentence Embeddings for Coherence Modelling and Beyond

We present a novel and effective technique for performing text coherence tasks while facilitating deeper insights into the data. Despite obtaining ever-increasing task performance, modern deep-learning approaches to NLP tasks often only…

Computation and Language · Computer Science · 2019-08-09 · Tanner Bohn, Yining Hu, Jinhang Zhang, Charles X. Ling

Improve Lexicon-based Word Embeddings By Word Sense Disambiguation

There have been some works that learn a lexicon together with the corpus to improve the word embeddings. However, they either model the lexicon separately but update the neural networks for both the corpus and the lexicon by the same…

Computation and Language · Computer Science · 2017-07-25 · Yuanzhi Ke, Masafumi Hagiwara

SentBS: Sentence-level Beam Search for Controllable Summarization

A wide range of control perspectives have been explored in controllable text generation. Structure-controlled summarization is recently proposed as a useful and interesting research direction. However, current structure-controlling methods…

Computation and Language · Computer Science · 2023-02-27 · Chenhui Shen, Liying Cheng, Lidong Bing, Yang You, Luo Si

Neural Text Generation: A Practical Guide

Deep learning methods have recently achieved great empirical success on machine translation, dialogue response generation, summarization, and other text generation tasks. At a high level, the technique has been to train end-to-end neural…

Computation and Language · Computer Science · 2017-11-28 · Ziang Xie

Beyond Bilingual: Multi-sense Word Embeddings using Multilingual Context

Word embeddings, which represent a word as a point in a vector space, have become ubiquitous to several NLP tasks. A recent line of work uses bilingual (two languages) corpora to learn a different vector for each sense of a word, by…

Computation and Language · Computer Science · 2017-06-27 · Shyam Upadhyay, Kai-Wei Chang, Matt Taddy, Adam Kalai, James Zou

Word Sense Induction with Knowledge Distillation from BERT

Pre-trained contextual language models are ubiquitously employed for language understanding tasks, but are unsuitable for resource-constrained systems. Noncontextual word embeddings are an efficient alternative in these settings. Such…

Computation and Language · Computer Science · 2023-04-24 · Anik Saha, Alex Gittens, Bulent Yener