Related papers: Binary Paragraph Vectors

200 papers

In this dissertation we report results of our research on dense distributed representations of text data. We propose two novel neural models for learning such representations. The first model learns representations at the document level,…

Computation and Language · Computer Science 2019-01-08 Karol Grzegorczyk

Paragraph Vectors has been recently proposed as an unsupervised method for learning distributed representations for pieces of texts. In their work, the authors showed that the method can learn an embedding of movie review texts which can be…

Computation and Language · Computer Science 2015-07-30 Andrew M. Dai, Christopher Olah, Quoc V. Le

Word2vec (Mikolov et al., 2013) has proven to be successful in natural language processing by capturing the semantic relationships between different words. Built on top of single-word embeddings, paragraph vectors (Le and Mikolov, 2014)…

Computation and Language · Computer Science 2017-12-11 Geng Ji, Robert Bamler, Erik B. Sudderth, Stephan Mandt

Many machine learning algorithms require the input to be represented as a fixed-length feature vector. When it comes to texts, one of the most common fixed-length features is bag-of-words. Despite their popularity, bag-of-words features…

Computation and Language · Computer Science 2014-05-26 Quoc V. Le, Tomas Mikolov

Just as semantic hashing can accelerate information retrieval, binary valued embeddings can significantly reduce latency in the retrieval of graphical data. We introduce a simple but effective model for learning such binary vectors for…

Machine Learning · Computer Science 2018-03-28 Vinith Misra, Sumit Bhatia

Traditional network embedding primarily focuses on learning a continuous vector representation for each node, preserving network structure and/or node content information, such that off-the-shelf machine learning algorithms can be easily…

Social and Information Networks · Computer Science 2023-01-02 Daokun Zhang, Jie Yin, Xingquan Zhu, Chengqi Zhang

Despite the loss of semantic information, bag-of-ngram based methods still achieve state-of-the-art results for tasks such as sentiment classification of long movie reviews. Many document embedding methods have been proposed to capture…

Computation and Language · Computer Science 2016-04-26 Bofang Li, Tao Liu, Xiaoyong Du, Deyuan Zhang, Zhe Zhao

Existing information retrieval systems are largely constrained by their reliance on vector inner products to assess query-document relevance, which naturally limits the expressiveness of the relevance score they can produce. We propose a…

Information Retrieval · Computer Science 2025-05-02 Julian Killingback, Hansi Zeng, Hamed Zamani

We consider the problem of learning distributed representations for documents in data streams. The documents are represented as low-dimensional vectors and are jointly learned with distributed vector representations of word tokens using a…

Computation and Language · Computer Science 2016-06-29 Nemanja Djuric, Hao Wu, Vladan Radosavljevic, Mihajlo Grbovic, Narayan Bhamidipati

Binary concepts are empirically used by humans to generalize efficiently. They are based on the Bernoulli distribution, which is the building block of information. These concepts span both low-level and high-level features such as "large vs…

Machine Learning · Computer Science 2023-03-23 Zizhao Hu, Mohammad Rostami

We present a neural model for representing snippets of code as continuous distributed vectors ("code embeddings"). The main idea is to represent a code snippet as a single fixed-length $\textit{code vector}$, which can be used to predict…

Machine Learning · Computer Science 2018-10-31 Uri Alon, Meital Zilberstein, Omer Levy, Eran Yahav

Dense retrieval models use bi-encoder network architectures for learning query and document representations. These representations are often in the form of a vector representation and their similarities are often computed using the dot…

Information Retrieval · Computer Science 2023-05-01 Hamed Zamani, Michael Bendersky

Recently, there have been some breakthroughs in graph analysis by applying graph neural networks (GNNs) that follow a neighborhood aggregation scheme and demonstrate outstanding performance in many tasks. However, we observe that the…

Machine Learning · Computer Science 2021-04-13 Hanchen Wang, Defu Lian, Ying Zhang, Lu Qin, Xiangjian He, Yiguang Lin, Xuemin Lin

Capturing the compositional process which maps the meaning of words to that of documents is a central challenge for researchers in Natural Language Processing and Information Retrieval. We introduce a model that is able to represent the…

Computation and Language · Computer Science 2014-06-17 Misha Denil, Alban Demiraj, Nal Kalchbrenner, Phil Blunsom, Nando de Freitas

Word embeddings are commonly used as a starting point in many NLP models to achieve state-of-the-art performance. However, with a large vocabulary and many dimensions, these floating-point representations are expensive both in terms of…

Computation and Language · Computer Science 2020-01-23 Julien Tissier, Christophe Gravier, Amaury Habrard

We propose a new encoder-decoder approach to learn distributed sentence representations that are applicable to multiple purposes. The model is learned by using a convolutional neural network as an encoder to map an input sentence into a…

Computation and Language · Computer Science 2017-07-28 Zhe Gan, Yunchen Pu, Ricardo Henao, Chunyuan Li, Xiaodong He, Lawrence Carin

Vector representations of sentences, trained on massive text corpora, are widely used as generic sentence embeddings across a variety of NLP problems. The learned representations are generally assumed to be continuous and real-valued,…

Computation and Language · Computer Science 2019-06-21 Dinghan Shen, Pengyu Cheng, Dhanasekar Sundararaman, Xinyuan Zhang, Qian Yang, Meng Tang, Asli Celikyilmaz, Lawrence Carin

Binary classification is one of the most common problems in machine learning. It consists of predicting whether a given element belongs to a particular class. In this paper, a new algorithm for binary classification is proposed using a…

Machine Learning · Computer Science 2019-03-12 Alexandre Quemy

Similarity metrics are a core component of many information retrieval and machine learning systems. In this work we propose a method capable of learning a similarity metric from data equipped with a binary relation. By considering only the…

Machine Learning · Computer Science 2016-04-06 Henry Gouk, Bernhard Pfahringer, Michael Cree

Vectors are universal mathematical objects that can represent text, images, speech, or a mix of these data modalities. That happens regardless of whether data is represented by hand-crafted features or learnt embeddings. Collect a large…

Data Structures and Algorithms · Computer Science 2024-04-02 Sebastian Bruch