Related papers: Identifying missing dictionary entries with freque…

A Classification Approach to Word Prediction

The eventual goal of a language model is to accurately predict the value of a missing word given its context. We present an approach to word prediction that is based on learning a representation for each word as a function of words and…

Computation and Language · Computer Science 2007-05-23 Yair Even-Zohar, Dan Roth

Investigating the Working of Text Classifiers

Text classification is one of the most widely studied tasks in natural language processing. Motivated by the principle of compositionality, large multilayer neural network models have been employed for this task in an attempt to effectively…

Computation and Language · Computer Science 2018-08-07 Devendra Singh Sachan, Manzil Zaheer, Ruslan Salakhutdinov

Context-theoretic Semantics for Natural Language: an Algebraic Framework

Techniques in which words are represented as vectors have proved useful in many applications in computational linguistics; however, there is currently no general semantic formalism for representing meaning in terms of vectors. We present a…

Computation and Language · Computer Science 2020-09-23 Daoud Clarke

Leveraging Cognitive Search Patterns to Enhance Automated Natural Language Retrieval Performance

The search of information in large text repositories has been plagued by the so-called document-query vocabulary gap, i.e. the semantic discordance between the contents in the stored document entities on the one hand and the human query on…

Information Retrieval · Computer Science 2020-04-22 Bhawani Selvaretnam, Mohammed Belkhatir

A Context-theoretic Framework for Compositionality in Distributional Semantics

Techniques in which words are represented as vectors have proved useful in many applications in computational linguistics; however, there is currently no general semantic formalism for representing meaning in terms of vectors. We present a…

Computation and Language · Computer Science 2015-03-17 Daoud Clarke

Document Retrieval for Large Scale Content Analysis using Contextualized Dictionaries

This paper presents a procedure to retrieve subsets of relevant documents from large text collections for Content Analysis, e.g. in social sciences. Document retrieval for this purpose needs to take account of the fact that analysts often…

Information Retrieval · Computer Science 2017-07-12 Gregor Wiedemann, Andreas Niekler

Mark my Word: A Sequence-to-Sequence Approach to Definition Modeling

Defining words in a textual context is a useful task both for practical purposes and for gaining insight into distributed word representations. Building on the distributional hypothesis, we argue here that the most natural formalization of…

Computation and Language · Computer Science 2019-11-14 Timothee Mickus, Denis Paperno, Mathieu Constant

Learning to Describe Phrases with Local and Global Contexts

When reading a text, it is common to become stuck on unfamiliar words and phrases, such as polysemous words with novel senses, rarely used idioms, internet slang, or emerging entities. If we humans cannot figure out the meaning of those…

Computation and Language · Computer Science 2019-04-11 Shonosuke Ishiwatari, Hiroaki Hayashi, Naoki Yoshinaga, Graham Neubig, Shoetsu Sato, Masashi Toyoda, Masaru Kitsuregawa

Context-Aware Sentence/Passage Term Importance Estimation For First Stage Retrieval

Term frequency is a common method for identifying the importance of a term in a query or document. But it is a weak signal, especially when the frequency distribution is flat, such as in long queries or short documents where the text is of…

Information Retrieval · Computer Science 2019-11-28 Zhuyun Dai, Jamie Callan

Exponential Family Embeddings

Word embeddings are a powerful approach for capturing semantic similarity among terms in a vocabulary. In this paper, we develop exponential family embeddings, a class of methods that extends the idea of word embeddings to other types of…

Machine Learning · Statistics 2016-11-22 Maja R. Rudolph, Francisco J. R. Ruiz, Stephan Mandt, David M. Blei

Noun-Phrase Analysis in Unrestricted Text for Information Retrieval

Information retrieval is an important application area of natural-language processing where one encounters the genuine challenge of processing large quantities of unrestricted natural-language text. This paper reports on the application of…

cmp-lg · Computer Science 2008-02-03 David A. Evans, Chengxiang Zhai

Sequential Modelling with Applications to Music Recommendation, Fact-Checking, and Speed Reading

Sequential modelling entails making sense of sequential data, which naturally occurs in a wide array of domains. One example is systems that interact with users, log user actions and behaviour, and make recommendations of items of potential…

Information Retrieval · Computer Science 2021-09-15 Christian Hansen

Remedies against the Vocabulary Gap in Information Retrieval

Search engines rely heavily on term-based approaches that represent queries and documents as bags of words. Text, whether a document or a query, is represented by a bag of its words that ignores grammar and word order, but retains word frequency…

Information Retrieval · Computer Science 2017-11-17 Christophe Van Gysel

Contextualized word senses: from attention to compositionality

The neural architectures of language models are becoming increasingly complex, especially that of Transformers, based on the attention mechanism. Although their application to numerous natural language processing tasks has proven to be very…

Computation and Language · Computer Science 2023-12-04 Pablo Gamallo

Factorial Hidden Markov Models for Learning Representations of Natural Language

Most representation learning algorithms for language and image processing are local, in that they identify features for a data point based on surrounding points. Yet in language processing, the correct meaning of a word often depends on its…

Machine Learning · Computer Science 2014-02-19 Anjan Nepal, Alexander Yates

Modeling meaning: computational interpreting and understanding of natural language fragments

In this introductory article we present the basics of an approach to implementing computational interpreting of natural language aiming to model the meanings of words and phrases. Unlike other approaches, we attempt to define the meanings…

Computation and Language · Computer Science 2019-08-12 Michael Kapustin, Pavlo Kapustin

Explaining Datasets in Words: Statistical Models with Natural Language Parameters

To make sense of massive data, we often fit simplified models and then interpret the parameters; for example, we cluster the text embeddings and then interpret the mean parameters of each cluster. However, these parameters are often…

Artificial Intelligence · Computer Science 2025-01-14 Ruiqi Zhong, Heng Wang, Dan Klein, Jacob Steinhardt

Documents Are People and Words Are Items: A Psychometric Approach to Textual Data with Contextual Embeddings

This research introduces a novel psychometric method for analyzing textual data using large language models. By leveraging contextual embeddings to create contextual scores, we transform textual data into response data suitable for…

Computation and Language · Computer Science 2025-09-12 Jinsong Chen

Referring Expression Object Segmentation with Caption-Aware Consistency

Referring expressions are natural language descriptions that identify a particular object within a scene and are widely used in our daily conversations. In this work, we focus on segmenting the object in an image specified by a referring…

Computer Vision and Pattern Recognition · Computer Science 2019-10-11 Yi-Wen Chen, Yi-Hsuan Tsai, Tiantian Wang, Yen-Yu Lin, Ming-Hsuan Yang

Classifying text using machine learning models and determining conversation drift

Text classification helps analyse texts for semantic meaning and relevance, by mapping the words against this hierarchy. An analysis of various types of texts is invaluable to understanding both their semantic meaning, as well as their…

Machine Learning · Computer Science 2022-11-16 Chaitanya Chadha, Vandit Gupta, Deepak Gupta, Ashish Khanna