Related papers: Evaluation Evaluation a Monte Carlo study

200 papers

Commonly used evaluation measures including Recall, Precision, F-Measure and Rand Accuracy are biased and should not be used without clear understanding of the biases, and corresponding identification of chance or base case levels of the…

Machine Learning · Computer Science 2020-11-02 David M. W. Powers

Eliciting relevance judgments for ranking evaluation is labor-intensive and costly, motivating careful selection of which documents to judge. Unlike traditional approaches that make this selection deterministically, probabilistic sampling…

Information Retrieval · Computer Science 2016-04-26 Tobias Schnabel, Adith Swaminathan, Peter Frazier, Thorsten Joachims

As the vocabulary size of modern word-based language models becomes ever larger, many sampling-based training criteria are proposed and investigated. The essence of these sampling methods is that the softmax-related traversal over the…

Computation and Language · Computer Science 2021-06-18 Yingbo Gao, David Thulke, Alexander Gerstenberger, Khoa Viet Tran, Ralf Schlüter, Hermann Ney

Classification systems are evaluated in a countless number of papers. However, we find that evaluation practice is often nebulous. Frequently, metrics are selected without arguments, and blurry terminology invites misconceptions. For…

Machine Learning · Computer Science 2024-07-03 Juri Opitz

A myriad of explainability methods have been proposed in recent years, but there is little consensus on how to evaluate them. While automatic metrics allow for quick benchmarking, it isn't clear how such metrics reflect human interaction…

Computation and Language · Computer Science 2021-06-30 Ana Valeria Gonzalez, Anna Rogers, Anders Søgaard

Recent discussions on alternative facts, fake news, and post truth politics have motivated research on creating technologies that allow people not only to access information, but also to assess the credibility of the information presented…

Information Retrieval · Computer Science 2017-08-25 Christina Lioma, Jakob Grue Simonsen, Birger Larsen

In predictive modeling with simulation or machine learning, it is critical to accurately assess the quality of estimated values through output analysis. In recent decades output analysis has become enriched with methods that quantify the…

Methodology · Statistics 2023-10-27 Kimia Vahdat, Sara Shashaani

Monte Carlo methods, Variational Inference, and their combinations play a pivotal role in sampling from intractable probability distributions. However, current studies lack a unified evaluation framework, relying on disparate performance…

Machine Learning · Computer Science 2024-06-12 Denis Blessing, Xiaogang Jia, Johannes Esslinger, Francisco Vargas, Gerhard Neumann

Evaluation is the central means for assessing, understanding, and communicating about NLP models. In this position paper, we argue evaluation should be more than that: it is a force for driving change, carrying a sociological and political…

Computation and Language · Computer Science 2022-12-23 Rishi Bommasani

We describe Monte Carlo methods for estimating lower envelopes of expectations of real random variables. We prove that the estimation bias is negative and that its absolute value shrinks with increasing sample size. We discuss fairly…

Probability · Mathematics 2019-09-02 Arne Decadt, Gert de Cooman, Jasper De Bock

The F-measure or F-score is one of the most commonly used single number measures in Information Retrieval, Natural Language Processing and Machine Learning, but it is based on a mistake, and the flawed assumptions render it unsuitable for…

Information Retrieval · Computer Science 2019-09-13 David M. W. Powers

Importance sampling is a common technique for Monte Carlo approximation, including Monte Carlo approximation of p-values. Here it is shown that a simple correction of the usual importance sampling p-values creates valid p-values, meaning…

Computation · Statistics 2011-04-12 Matthew T. Harrison

We introduce a generalization of classic information-theoretic measures of predictive uncertainty in online language processing, based on the simulation of expected continuations of incremental linguistic contexts. Our framework provides a…

Computation and Language · Computer Science 2024-10-15 Mario Giulianelli, Andreas Opedal, Ryan Cotterell

We use Monte Carlo techniques to simulate an organized prediction competition between a group of scientific experts acting under the influence of a "self-governing" prediction reward algorithm. Our aim is to illustrate the advantages of…

Social and Information Networks · Computer Science 2023-05-09 J. O. Gonzalez-Hernandez, Jonathan Marino, Ted Rogers, Brandon Velasco

The correlation between NLG automatic evaluation metrics and human evaluation is often regarded as a critical criterion for assessing the capability of an evaluation metric. However, different grouping methods and correlation coefficients…

Computation and Language · Computer Science 2025-01-28 Mingqi Gao, Xinyu Hu, Li Lin, Xiaojun Wan

A series of Monte Carlo studies were performed to compare the behavior of some alternative procedures for reasoning under uncertainty. The behavior of several Bayesian, linear model and default reasoning procedures were examined in the…

Artificial Intelligence · Computer Science 2013-03-26 Paul E. Lehner, Azar Sadigh

Process Reward Models (PRMs) emerge as a promising approach for process supervision in mathematical reasoning of Large Language Models (LLMs), which aim to identify and mitigate intermediate errors in the reasoning processes. However, the…

Computation and Language · Computer Science 2025-06-06 Zhenru Zhang, Chujie Zheng, Yangzhen Wu, Beichen Zhang, Runji Lin, Bowen Yu, Dayiheng Liu, Jingren Zhou, Junyang Lin

Confirmation bias is a cognitive bias that adversely affects management decisions, and mathematical modelling is an aid to its detailed understanding. Bias in opinion update about the value of a parameter is modelled here assuming that…

Other Statistics · Statistics 2022-02-08 Rose D Baker

To improve the efficiency of Monte Carlo estimation, practitioners are turning to biased Markov chain Monte Carlo procedures that trade off asymptotic exactness for computational speed. The reasoning is sound: a reduction in variance due to…

Machine Learning · Statistics 2019-01-03 Jackson Gorham, Lester Mackey

With origins in game theory, probabilistic values like Shapley values, Banzhaf values, and semi-values have emerged as a central tool in explainable AI. They are used for feature attribution, data attribution, data valuation, and more.…

Machine Learning · Computer Science 2026-01-14 R. Teal Witter, Yurong Liu, Christopher Musco