Related papers: Dis-S2V: Discourse Informed Sen2Vec

Speech2Vec: A Sequence-to-Sequence Framework for Learning Word Embeddings from Speech

In this paper, we propose a novel deep neural network architecture, Speech2Vec, for learning fixed-length vector representations of audio segments excised from a speech corpus, where the vectors contain semantic information pertaining to…

Computation and Language · Computer Science 2018-06-12 Yu-An Chung , James Glass

SaC2Vec: Information Network Representation with Structure and Content

Network representation learning (also known as information network embedding) has been the central piece of research in social and information network analysis for the last couple of years. An information network can be viewed as a linked…

Social and Information Networks · Computer Science 2018-07-05 Sambaran Bandyopadhyay , Harsh Kara , Anirban Biswas , M N Murty

DisSent: Sentence Representation Learning from Explicit Discourse Relations

Learning effective representations of sentences is one of the core missions of natural language understanding. Existing models either train on a vast amount of text, or require costly, manually curated sentence relation datasets. We show…

Computation and Language · Computer Science 2019-06-05 Allen Nie , Erin D. Bennett , Noah D. Goodman

Learning Generic Sentence Representations Using Convolutional Neural Networks

We propose a new encoder-decoder approach to learn distributed sentence representations that are applicable to multiple purposes. The model is learned by using a convolutional neural network as an encoder to map an input sentence into a…

Computation and Language · Computer Science 2017-07-28 Zhe Gan , Yunchen Pu , Ricardo Henao , Chunyuan Li , Xiaodong He , Lawrence Carin

Sen2Pro: A Probabilistic Perspective to Sentence Embedding from Pre-trained Language Model

Sentence embedding is one of the most fundamental tasks in Natural Language Processing and plays an important role in various tasks. The recent breakthrough in sentence embedding is achieved by pre-trained language models (PLMs). Despite…

Computation and Language · Computer Science 2023-06-06 Lingfeng Shen , Haiyun Jiang , Lemao Liu , Shuming Shi

Can Network Embedding of Distributional Thesaurus be Combined with Word Vectors for Better Representation?

Distributed representations of words learned from text have proved to be successful in various natural language processing tasks in recent times. While some methods represent words as vectors computed from text using predictive model…

Computation and Language · Computer Science 2018-02-20 Abhik Jana , Pawan Goyal

Learning Word Embeddings from Speech

In this paper, we propose a novel deep neural network architecture, Sequence-to-Sequence Audio2Vec, for unsupervised learning of fixed-length vector representations of audio segments excised from a speech corpus, where the vectors contain…

Computation and Language · Computer Science 2017-11-07 Yu-An Chung , James Glass

Vector representations of text data in deep learning

In this dissertation we report results of our research on dense distributed representations of text data. We propose two novel neural models for learning such representations. The first model learns representations at the document level,…

Computation and Language · Computer Science 2019-01-08 Karol Grzegorczyk

A Bilingual Generative Transformer for Semantic Sentence Embedding

Semantic sentence embedding models encode natural language sentences into vectors, such that closeness in embedding space indicates closeness in the semantics between the sentences. Bilingual data offers a useful signal for learning such…

Computation and Language · Computer Science 2020-11-20 John Wieting , Graham Neubig , Taylor Berg-Kirkpatrick

Segmental Audio Word2Vec: Representing Utterances as Sequences of Vectors with Applications in Spoken Term Detection

While Word2Vec represents words (in text) as vectors carrying semantic information, audio Word2Vec was shown to be able to represent signal segments of spoken words as vectors carrying phonetic structure information. Audio Word2Vec can be…

Computation and Language · Computer Science 2018-08-08 Yu-Hsuan Wang , Hung-yi Lee , Lin-shan Lee

Exploiting Sentence and Context Representations in Deep Neural Models for Spoken Language Understanding

This paper presents a deep learning architecture for the semantic decoder component of a Statistical Spoken Dialogue System. In a slot-filling dialogue, the semantic decoder predicts the dialogue act and a set of slot-value pairs from a set…

Artificial Intelligence · Computer Science 2016-10-14 Lina M. Rojas Barahona , Milica Gasic , Nikola Mrkšić , Pei-Hao Su , Stefan Ultes , Tsung-Hsien Wen , Steve Young

A Latent Variable Recurrent Neural Network for Discourse Relation Language Models

This paper presents a novel latent variable recurrent neural network architecture for jointly modeling sequences of words and (possibly latent) discourse relations between adjacent sentences. A recurrent neural network generates individual…

Computation and Language · Computer Science 2016-04-06 Yangfeng Ji , Gholamreza Haffari , Jacob Eisenstein

An efficient framework for learning sentence representations

In this work we propose a simple and efficient framework for learning sentence representations from unlabelled data. Drawing inspiration from the distributional hypothesis and recent work on learning sentence representations, we reformulate…

Computation and Language · Computer Science 2018-03-09 Lajanugen Logeswaran , Honglak Lee

Learning Latent Vector Spaces for Product Search

We introduce a novel latent vector space model that jointly learns the latent representations of words, e-commerce products and a mapping between the two without the need for explicit annotations. The power of the model lies in its ability…

Information Retrieval · Computer Science 2016-08-26 Christophe Van Gysel , Maarten de Rijke , Evangelos Kanoulas

KeyVec: Key-semantics Preserving Document Representations

Previous studies have demonstrated the empirical success of word embeddings in various applications. In this paper, we investigate the problem of learning distributed representations for text documents which many machine learning algorithms…

Computation and Language · Computer Science 2017-09-29 Bin Bi , Hao Ma

Network-Efficient Distributed Word2vec Training System for Large Vocabularies

Word2vec is a popular family of algorithms for unsupervised training of dense vector representations of words on large text corpuses. The resulting vectors have been shown to capture semantic relationships among their corresponding words,…

Computation and Language · Computer Science 2016-06-29 Erik Ordentlich , Lee Yang , Andy Feng , Peter Cnudde , Mihajlo Grbovic , Nemanja Djuric , Vladan Radosavljevic , Gavin Owens

Topic2Vec: Learning Distributed Representations of Topics

Latent Dirichlet Allocation (LDA) mining thematic structure of documents plays an important role in nature language processing and machine learning areas. However, the probability distribution from LDA only describes the statistical…

Computation and Language · Computer Science 2015-06-30 Li-Qiang Niu , Xin-Yu Dai

Research on Optimization of Natural Language Processing Model Based on Multimodal Deep Learning

This project intends to study the image representation based on attention mechanism and multimodal data. By adding multiple pattern layers to the attribute model, the semantic and hidden layers of image content are integrated. The word…

Computation and Language · Computer Science 2024-06-14 Dan Sun , Yaxin Liang , Yining Yang , Yuhan Ma , Qishi Zhan , Erdi Gao

On the Effects of Using word2vec Representations in Neural Networks for Dialogue Act Recognition

Dialogue act recognition is an important component of a large number of natural language processing pipelines. Many research works have been carried out in this area, but relatively few investigate deep neural networks and word embeddings.…

Computation and Language · Computer Science 2020-10-23 Christophe Cerisara , Pavel Kral , Ladislav Lenc

SeVeN: Augmenting Word Embeddings with Unsupervised Relation Vectors

We present SeVeN (Semantic Vector Networks), a hybrid resource that encodes relationships between words in the form of a graph. Different from traditional semantic networks, these relations are represented as vectors in a continuous vector…

Computation and Language · Computer Science 2018-08-21 Luis Espinosa-Anke , Steven Schockaert