Related papers: Evaluating Document Coherence Modelling

Discourse Probing of Pretrained Language Models

Existing work on probing of pretrained language models (LMs) has predominantly focused on sentence-level syntactic tasks. In this paper, we introduce document-level discourse probing to evaluate the ability of pretrained LMs to capture…

Computation and Language · Computer Science 2021-04-14 Fajri Koto, Jey Han Lau, Timothy Baldwin

Topic Discovery via Latent Space Clustering of Pretrained Language Model Representations

Topic models have been the prominent tools for automatic topic discovery from text corpora. Despite their effectiveness, topic models suffer from several limitations including the inability of modeling word ordering information in…

Computation and Language · Computer Science 2022-02-10 Yu Meng, Yunyi Zhang, Jiaxin Huang, Yu Zhang, Jiawei Han

Rethinking stance detection: A theoretically-informed research agenda for user-level inference using language models

Stance detection has emerged as a popular task in natural language processing research, enabled largely by the abundance of target-specific social media data. While there has been considerable research on the development of stance detection…

Computation and Language · Computer Science 2025-02-05 Prasanta Bhattacharya, Hong Zhang, Yiming Cao, Wei Gao, Brandon Siyuan Loh, Joseph J. P. Simons, Liang Ze Wong

Examining Multilingual Embedding Models Cross-Lingually Through LLM-Generated Adversarial Examples

The evaluation of cross-lingual semantic search models is often limited to existing datasets from tasks such as information retrieval and semantic textual similarity. We introduce Cross-Lingual Semantic Discrimination (CLSD), a lightweight…

Computation and Language · Computer Science 2025-10-10 Andrianos Michail, Simon Clematide, Rico Sennrich

Harnessing the Intrinsic Knowledge of Pretrained Language Models for Challenging Text Classification Settings

Text classification is crucial for applications such as sentiment analysis and toxic text filtering, but it still faces challenges due to the complexity and ambiguity of natural language. Recent advancements in deep learning, particularly…

Computation and Language · Computer Science 2024-08-29 Lingyu Gao

How Does Pretraining Improve Discourse-Aware Translation?

Pretrained language models (PLMs) have produced substantial improvements in discourse-aware neural machine translation (NMT), for example, improved coherence in spoken language translation. However, the underlying reasons for their strong…

Computation and Language · Computer Science 2023-06-01 Zhihong Huang, Longyue Wang, Siyou Liu, Derek F. Wong

An Interpretability Evaluation Benchmark for Pre-trained Language Models

While pre-trained language models (LMs) have brought great improvements in many NLP tasks, there is increasing attention to explore capabilities of LMs and interpret their predictions. However, existing works usually focus only on a certain…

Computation and Language · Computer Science 2022-07-29 Yaozong Shen, Lijie Wang, Ying Chen, Xinyan Xiao, Jing Liu, Hua Wu

An Investigation of Language Model Interpretability via Sentence Editing

Pre-trained language models (PLMs) like BERT are being used for almost all language-related tasks, but interpreting their behavior still remains a significant challenge and many important questions remain largely unanswered. In this work,…

Computation and Language · Computer Science 2021-09-28 Samuel Stevens, Yu Su

Can Pretrained Language Models Derive Correct Semantics from Corrupt Subwords under Noise?

For Pretrained Language Models (PLMs), their susceptibility to noise has recently been linked to subword segmentation. However, it is unclear which aspects of segmentation affect their understanding. This study assesses the robustness of…

Computation and Language · Computer Science 2024-10-15 Xinzhe Li, Ming Liu, Shang Gao

Latent Reasoning via Sentence Embedding Prediction

Autoregressive language models (LMs) generate one token at a time, yet human reasoning operates over higher-level abstractions - sentences, propositions, and concepts. This contrast raises a central question: Can LMs likewise learn to…

Computation and Language · Computer Science 2025-10-14 Hyeonbin Hwang, Byeongguk Jeon, Seungone Kim, Jiyeon Kim, Hoyeon Chang, Sohee Yang, Seungpil Won, Dohaeng Lee, Youbin Ahn, Minjoon Seo

Analyzing Neural Discourse Coherence Models

In this work, we systematically investigate how well current models of coherence can capture aspects of text implicated in discourse organisation. We devise two datasets of various linguistic alterations that undermine coherence and test…

Computation and Language · Computer Science 2020-11-13 Youmna Farag, Josef Valvoda, Helen Yannakoudakis, Ted Briscoe

A Sentence is Worth a Thousand Pictures: Can Large Language Models Understand Hum4n L4ngu4ge and the W0rld behind W0rds?

Modern Artificial Intelligence applications show great potential for language-related tasks that rely on next-word prediction. The current generation of Large Language Models (LLMs) have been linked to claims about human-like linguistic…

Computation and Language · Computer Science 2024-09-05 Evelina Leivada, Gary Marcus, Fritz Günther, Elliot Murphy

EduBERT: Pretrained Deep Language Models for Learning Analytics

The use of large pretrained neural networks to create contextualized word embeddings has drastically improved performance on several natural language processing (NLP) tasks. These computationally expensive models have begun to be applied to…

Computers and Society · Computer Science 2019-12-03 Benjamin Clavié, Kobi Gal

Topics in the Haystack: Extracting and Evaluating Topics beyond Coherence

Extracting and identifying latent topics in large text corpora has gained increasing importance in Natural Language Processing (NLP). Most models, whether probabilistic models similar to Latent Dirichlet Allocation (LDA) or neural topic…

Computation and Language · Computer Science 2023-03-31 Anton Thielmann, Quentin Seifert, Arik Reuter, Elisabeth Bergherr, Benjamin Säfken

AStitchInLanguageModels: Dataset and Methods for the Exploration of Idiomaticity in Pre-Trained Language Models

Despite their success in a variety of NLP tasks, pre-trained language models, due to their heavy reliance on compositionality, fail in effectively capturing the meanings of multiword expressions (MWEs), especially idioms. Therefore,…

Computation and Language · Computer Science 2021-09-10 Harish Tayyar Madabushi, Edward Gow-Smith, Carolina Scarton, Aline Villavicencio

How Well Do LLMs Predict Human Behavior? A Measure of their Pretrained Knowledge

Large language models (LLMs) are increasingly used to predict human behavior. We propose a measure for evaluating how much knowledge a pretrained LLM brings to such a prediction: its equivalent sample size, defined as the amount of…

Econometrics · Economics 2026-01-21 Wayne Gao, Sukjin Han, Annie Liang

Probing Critical Learning Dynamics of PLMs for Hate Speech Detection

Despite the widespread adoption, there is a lack of research into how various critical aspects of pretrained language models (PLMs) affect their performance in hate speech detection. Through five research questions, our findings and…

Computation and Language · Computer Science 2024-02-06 Sarah Masud, Mohammad Aflah Khan, Vikram Goyal, Md Shad Akhtar, Tanmoy Chakraborty

Probing Memes in LLMs: A Paradigm for the Entangled Evaluation World

Current evaluation paradigms for large language models (LLMs) characterize models and datasets separately, yielding coarse descriptions: items in datasets are treated as pre-labeled entries, and models are summarized by overall scores such…

Computation and Language · Computer Science 2026-03-06 Luzhou Peng, Zhengxin Yang, Honglu Ji, Yikang Yang, Fanda Fan, Wanling Gao, Jiayuan Ge, Yilin Han, Jianfeng Zhan

Large Language Models Are Latent Variable Models: Explaining and Finding Good Demonstrations for In-Context Learning

In recent years, pre-trained large language models (LLMs) have demonstrated remarkable efficiency in achieving an inference-time few-shot learning capability known as in-context learning. However, existing literature has highlighted the…

Computation and Language · Computer Science 2024-02-14 Xinyi Wang, Wanrong Zhu, Michael Saxon, Mark Steyvers, William Yang Wang

COHERENCE: Benchmarking Fine-Grained Image-Text Alignment in Interleaved Multimodal Contexts

In recent years, Multimodal Large Language Models (MLLMs) have achieved remarkable progress on a wide range of multimodal benchmarks. Despite these advances, most existing benchmarks mainly focus on single-image or multi-image…

Computer Vision and Pattern Recognition · Computer Science 2026-05-14 Bingli Wang, Huanze Tang, Haijun Lv, Zhishan Lin, Lixin Gu, Lei Feng, Qipeng Guo, Kai Chen