
Related papers: Assessing Keyness using Permutation Tests

200 papers

Standard evaluations of large language models (LLMs) focus on task performance, offering limited insight into whether correct behavior reflects appropriate underlying mechanisms and risking confirmation bias. We introduce a simple,…

Computation and Language · Computer Science 2026-04-01 Zoë Prins, Samuele Punzo, Frank Wildenburg, Giovanni Cinà, Sandro Pezzelle

Measuring the breadth of a word's meaning, or its spread across contexts, has become feasible with contextualized token embeddings. A word type can be represented as a cloud of token vectors, with dispersion-based statistics serving as…

Computation and Language · Computer Science 2026-05-11 Yo Ehara

Language models (LMs) estimate a probability distribution over strings in a natural language; these distributions are crucial for computing perplexity and surprisal in linguistics research. While we are usually concerned with measuring…

Computation and Language · Computer Science 2024-10-15 Tiago Pimentel, Clara Meister

Permutation tests are widely used for statistical hypothesis testing when the sampling distribution of the test statistic under the null hypothesis is analytically intractable or unreliable due to finite sample sizes. One critical challenge…

Computation · Statistics 2023-08-29 Yang Shi, Huining Kang, Ji-Hyun Lee, Hui Jiang

Recent trends in natural language processing research and annotation tasks affirm a paradigm shift from the traditional reliance on a single ground truth to a focus on individual perspectives, particularly in subjective tasks. In scenarios…

Computation and Language · Computer Science 2024-04-18 Olufunke O. Sarumi, Béla Neuendorf, Joan Plepi, Lucie Flek, Jörg Schlötterer, Charles Welch

The emergence of large language models (LLMs) has revolutionized numerous applications across industries. However, their "black box" nature often hinders the understanding of how they make specific decisions, raising concerns about their…

Artificial Intelligence · Computer Science 2024-03-06 Stefan Hackmann, Haniyeh Mahmoudian, Mark Steadman, Michael Schmidt

A high degree of topical diversity is often considered to be an important characteristic of interesting text documents. A recent proposal for measuring topical diversity identifies three elements for assessing diversity: words, topics, and…

Information Retrieval · Computer Science 2017-01-17 Hosein Azarbonyad, Mostafa Dehghani, Tom Kenter, Maarten Marx, Jaap Kamps, Maarten de Rijke

Embeddings of words and concepts capture syntactic and semantic regularities of language; however, they have seen limited use as tools to study characteristics of different corpora and how they relate to one another. We introduce…

Computation and Language · Computer Science 2021-03-23 Denis Newman-Griffis, Venkatesh Sivaraman, Adam Perer, Eric Fosler-Lussier, Harry Hochheiser

In this paper I propose a new way of measuring linguistic productivity that objectively assesses the ability of an affix to be used to coin new complex words and, unlike other popular measures, is not directly dependent upon token…

Computation and Language · Computer Science 2023-08-25 Sergei Monakhov

The classical method of the thematic classification of texts is based on using the frequency weight on the list of words occurring in texts from the text corpus that determines the theme. In this method, the weight of each word is defined…

Optimization and Control · Mathematics 2017-01-31 Mikhail A. Antonets, Grigoriy P. Kogan

The prevailing assumption of an exponential decay in large language model (LLM) reliability with sequence length, predicated on independent per-token error probabilities, posits an inherent limitation for long autoregressive outputs. Our…

Computation and Language · Computer Science 2026-05-07 Mikhail L. Arbuzov, Sisong Bei, Ziwei Dong, Dmitri Kalaev, Alexey A. Shvets

Neural language models typically tokenise input text into sub-word units to achieve an open vocabulary. The standard approach is to use a single canonical tokenisation at both train and test time. We suggest that this approach is…

Computation and Language · Computer Science 2021-09-22 Kris Cao, Laura Rimell

Methods for learning word representations using large text corpora have received much attention lately due to their impressive performance in numerous natural language processing (NLP) tasks such as semantic similarity measurement, and…

Computation and Language · Computer Science 2015-11-23 Danushka Bollegala, Alsuhaibani Mohammed, Takanori Maehara, Ken-ichi Kawarabayashi

Cross-Language Information Retrieval (CLIR) and machine translation (MT) resources, such as dictionaries and parallel corpora, are scarce and hard to come by for special domains. Besides, these resources are just limited to a few languages,…

Computation and Language · Computer Science 2013-02-20 Sa Liu, Chengzhi Zhang

Permutation methods are commonly used to test significance of regressors of interest in general linear models (GLMs) for functional (image) data sets, in particular for neuroimaging applications as they rely on mild assumptions. Permutation…

Methodology · Statistics 2021-11-23 Tomas Mrkvicka, Mari Myllymaki, Mikko Kuronen, Naveen Naidu Narisetty

Statistical significance tests can provide evidence that the observed difference in performance between two methods is not due to chance. In Information Retrieval, some studies have examined the validity and suitability of such tests for…

Information Retrieval · Computer Science 2019-04-09 Javier Parapar, David E. Losada, Manuel A. Presedo-Quindimil, Alvaro Barreiro

Much work has been done on designing fast and accurate sampling for diffusion language models (dLLMs). However, these efforts have largely focused on the tradeoff between speed and quality of individual samples; how to additionally ensure…

In Bayesian statistics, the marginal likelihood (ML) is the key ingredient needed for model comparison and model averaging. Unfortunately, estimating MLs accurately is notoriously difficult, especially for models where posterior simulation…

Computation · Statistics 2023-12-12 Dennis Christensen, Per August Jarval Moen

Computing next-token likelihood ratios between two language models (LMs) is a standard task in training paradigms such as knowledge distillation. Since this requires both models to share the same probability space, it becomes challenging…

Computation and Language · Computer Science 2026-05-07 Buu Phan, Ashish Khisti, Karen Ullrich

Accurately quantifying uncertainty in large language models (LLMs) is crucial for their reliable deployment, especially in high-stakes applications. Current state-of-the-art methods for measuring semantic uncertainty in LLMs rely on strict…

Machine Learning · Computer Science 2024-10-31 Yashvir S. Grewal, Edwin V. Bonilla, Thang D. Bui