Related papers: CompText: Visualizing, Comparing & Understanding T…

Joint Word Representation Learning using a Corpus and a Semantic Lexicon

Methods for learning word representations using large text corpora have received much attention lately due to their impressive performance in numerous natural language processing (NLP) tasks such as semantic similarity measurement, and…

Computation and Language · Computer Science 2015-11-23 Danushka Bollegala, Alsuhaibani Mohammed, Takanori Maehara, Ken-ichi Kawarabayashi

Finding Good Representations of Emotions for Text Classification

It is important for machines to interpret human emotions properly for better human-machine communications, as emotion is an essential part of human-to-human communications. One aspect of emotion is reflected in the language we use. How to…

Computation and Language · Computer Science 2018-08-23 Ji Ho Park

Generalized Word Shift Graphs: A Method for Visualizing and Explaining Pairwise Comparisons Between Texts

A common task in computational text analyses is to quantify how two corpora differ according to a measurement like word frequency, sentiment, or information content. However, collapsing the texts' rich stories into a single number is often…

Computation and Language · Computer Science 2021-02-05 Ryan J. Gallagher, Morgan R. Frank, Lewis Mitchell, Aaron J. Schwartz, Andrew J. Reagan, Christopher M. Danforth, Peter Sheridan Dodds

Dimensional Modeling of Emotions in Text with Appraisal Theories: Corpus Creation, Annotation Reliability, and Prediction

The most prominent tasks in emotion analysis are to assign emotions to texts and to understand how emotions manifest in language. An observation for NLP is that emotions can be communicated implicitly by referring to events, appealing to an…

Computation and Language · Computer Science 2022-10-10 Enrica Troiano, Laura Oberländer, Roman Klinger

Dealing with Controversy: An Emotion and Coping Strategy Corpus Based on Role Playing

There is a mismatch between psychological and computational studies on emotions. Psychological research aims at explaining and documenting internal mechanisms of these phenomena, while computational work often simplifies them into labels.…

Computation and Language · Computer Science 2026-02-03 Enrica Troiano, Sofie Labat, Marco Antonio Stranisci, Viviana Patti, Rossana Damiano, Roman Klinger

Visualizing Linguistic Shift

Neural network-based models are a very powerful tool for creating word embeddings; the objective of these models is to group similar words together. These embeddings have been used as features to improve results in various applications such…

Computation and Language · Computer Science 2016-11-27 Salman Mahmood, Rami Al-Rfou, Klaus Mueller

What do Language Representations Really Represent?

A neural language model trained on a text corpus can be used to induce distributed representations of words, such that similar words end up with similar representations. If the corpus is multilingual, the same model can be used to learn…

Computation and Language · Computer Science 2019-01-10 Johannes Bjerva, Robert Östling, Maria Han Veiga, Jörg Tiedemann, Isabelle Augenstein

On the Learnability of Concepts: With Applications to Comparing Word Embedding Algorithms

Word Embeddings are used widely in multiple Natural Language Processing (NLP) applications. They are coordinates associated with each word in a dictionary, inferred from statistical properties of these words in a large corpus. In this paper…

Computation and Language · Computer Science 2020-06-18 Adam Sutton, Nello Cristianini

An NLP approach to quantify dynamic salience of predefined topics in a text corpus

The proliferation of news media available online simultaneously presents a valuable resource and significant challenge to analysts aiming to profile and understand social and cultural trends in a geographic location of interest. While an…

Computation and Language · Computer Science 2021-08-18 A. Bock, A. Palladino, S. Smith-Heisters, I. Boardman, E. Pellegrini, E. J. Bienenstock, A. Valenti

Automatic Concept Discovery from Parallel Text and Visual Corpora

Humans connect language and vision to perceive the world. How to build a similar connection for computers? One possible way is via visual concepts, which are text terms that relate to visually discriminative entities. We propose an…

Computer Vision and Pattern Recognition · Computer Science 2015-09-25 Chen Sun, Chuang Gan, Ram Nevatia

Visualizing Topics with Multi-Word Expressions

We describe a new method for visualizing topics, the distributions over terms that are automatically extracted from large text corpora using latent variable models. Our method finds significant $n$-grams related to a topic, which are then…

Machine Learning · Statistics 2009-07-07 David M. Blei, John D. Lafferty

Thematic Analysis and Visualization of Textual Corpus

The semantic analysis of documents is a domain of intense research at present. The works in this domain can take several directions and touch several levels of granularity. In the present work we are exactly interested in the thematic…

Information Retrieval · Computer Science 2011-12-12 Anja Habacha Chabi, Ferihane Kboubi, Mohamed Ben Ahmed

A Unified Understanding of Deep NLP Models for Text Classification

The rapid development of deep natural language processing (NLP) models for text classification has led to an urgent need for a unified understanding of these models proposed individually. Existing methods cannot meet the need for…

Computation and Language · Computer Science 2022-06-22 Zhen Li, Xiting Wang, Weikai Yang, Jing Wu, Zhengyan Zhang, Zhiyuan Liu, Maosong Sun, Hui Zhang, Shixia Liu

Unsupervised Sentiment Analysis of Plastic Surgery Social Media Posts

The massive collection of user posts across social media platforms is primarily untapped for artificial intelligence (AI) use cases based on the sheer volume and velocity of textual data. Natural language processing (NLP) is a subfield of…

Computation and Language · Computer Science 2023-07-07 Alexandrea K. Ramnarine

Measuring Conceptual Entanglement in Collections of Documents

Conceptual entanglement is a crucial phenomenon in quantum cognition because it implies that classical probabilities cannot model non-compositional conceptual phenomena. While several psychological experiments have been developed to test…

Computation and Language · Computer Science 2019-09-24 Tomas Veloz, Xiazhao Zhao, Diederik Aerts

Topics in the Haystack: Extracting and Evaluating Topics beyond Coherence

Extracting and identifying latent topics in large text corpora has gained increasing importance in Natural Language Processing (NLP). Most models, whether probabilistic models similar to Latent Dirichlet Allocation (LDA) or neural topic…

Computation and Language · Computer Science 2023-03-31 Anton Thielmann, Quentin Seifert, Arik Reuter, Elisabeth Bergherr, Benjamin Säfken

Large Language Models Struggle to Describe the Haystack without Human Help: Human-in-the-loop Evaluation of Topic Models

A common use of NLP is to facilitate the understanding of large document collections, with a shift from using traditional topic models to Large Language Models. Yet the effectiveness of using LLM for large corpus understanding in real-world…

Computation and Language · Computer Science 2025-06-05 Zongxia Li, Lorena Calvo-Bartolomé, Alexander Hoyle, Paiheng Xu, Alden Dima, Juan Francisco Fung, Jordan Boyd-Graber

Natural Language Decompositions of Implicit Content Enable Better Text Representations

When people interpret text, they rely on inferences that go beyond the observed language itself. Inspired by this observation, we introduce a method for the analysis of text that takes implicitly communicated content explicitly into…

Computation and Language · Computer Science 2025-02-25 Alexander Hoyle, Rupak Sarkar, Pranav Goel, Philip Resnik

Analyzing the impact of climate change on critical infrastructure from the scientific literature: A weakly supervised NLP approach

Natural language processing (NLP) is a promising approach for analyzing large volumes of climate-change and infrastructure-related scientific literature. However, best-in-practice NLP techniques require large collections of relevant…

Machine Learning · Computer Science 2023-02-07 Tanwi Mallick, Joshua David Bergerson, Duane R. Verner, John K Hutchison, Leslie-Anne Levy, Prasanna Balaprakash

Know thy corpus! Robust methods for digital curation of Web corpora

This paper proposes a novel framework for digital curation of Web corpora in order to provide robust estimation of their parameters, such as their composition and the lexicon. In recent years language models pre-trained on large corpora…

Computation and Language · Computer Science 2020-03-16 Serge Sharoff