Related papers: Specificity measures and reference

Training conformal predictors

Efficiency criteria for conformal prediction, such as \emph{observed fuzziness} (i.e., the sum of p-values associated with false labels), are commonly used to \emph{evaluate} the performance of given conformal predictors. Here, we…

Machine Learning · Computer Science 2020-05-15 Nicolo Colombo, Vladimir Vovk

Accuracy Measures for the Comparison of Classifiers

The selection of the best classification algorithm for a given dataset is a very widespread problem. It is also a complex one, in the sense that it requires making several important methodological choices. Among them, in this work we focus on…

Machine Learning · Computer Science 2012-07-18 Vincent Labatut, Hocine Cherifi

Comprehension-guided referring expressions

We consider generation and comprehension of natural language referring expressions for objects in an image. Unlike generic "image captioning", which lacks natural standard evaluation criteria, the quality of a referring expression may be measured…

Computer Vision and Pattern Recognition · Computer Science 2017-01-13 Ruotian Luo, Gregory Shakhnarovich

Feature selection when there are many influential features

Recent discussion of the success of feature selection methods has argued that focusing on a relatively small number of features has been counterproductive. Instead, it is suggested, the number of significant features can be in the thousands…

Statistics Theory · Mathematics 2014-07-10 Peter Hall, Jiashun Jin, Hugh Miller

Reliable Classification with Conformal Learning and Interval-Type 2 Fuzzy Sets

Classical machine learning classifiers tend to be overconfident and can be unreliable outside of laboratory benchmarks. Properly assessing the reliability of the output of the model per sample is instrumental for real-life scenarios where…

Artificial Intelligence · Computer Science 2025-11-07 Javier Fumanal-Idocin, Javier Andreu-Perez

Fuzzy Rankings: Properties and Applications

In practice, a ranking of objects with respect to a given set of criteria is of considerable importance. However, due to lack of knowledge, information or time pressure, decision makers might not be able to provide a (crisp) ranking of…

Artificial Intelligence · Computer Science 2017-03-16 Jiří Mazurek

User Validation of Recommendation Serendipity Metrics

Though it has been recognized that recommending serendipitous (i.e., surprising and relevant) items can be helpful for increasing users' satisfaction and behavioral intention, how to measure serendipity in the offline environment is still…

Human-Computer Interaction · Computer Science 2020-04-23 Li Chen, Ningxia Wang, Yonghua Yang, Keping Yang, Quan Yuan

Fuzzing: On Benchmarking Outcome as a Function of Benchmark Properties

Characteristics of a benchmarking setup clearly can have some impact on the benchmark outcome. In this paper, we explore two methodologies to quantify the impact of the specific properties on the benchmarking outcome. Our first methodology…

Software Engineering · Computer Science 2025-04-15 Dylan Wolff, Marcel Böhme, Abhik Roychoudhury

Analysing Fuzzy Sets Through Combining Measures of Similarity and Distance

Reasoning with fuzzy sets can be achieved through measures such as similarity and distance. However, these measures can often give misleading results when considered independently, for example giving the same value for two different pairs…

Artificial Intelligence · Computer Science 2014-09-04 Josie McCulloch, Christian Wagner, Uwe Aickelin

On the Convergent Properties of Word Embedding Methods

Do word embeddings converge to learn similar things over different initializations? How repeatable are experiments with word embeddings? Are all word embedding techniques equally reliable? In this paper we propose evaluating methods for…

Computation and Language · Computer Science 2016-05-13 Yingtao Tian, Vivek Kulkarni, Bryan Perozzi, Steven Skiena

Accuracy, Estimates, and Representation Results

Measures of accuracy usually score how accurate a specified credence is, depending on whether the proposition is true or false. A key requirement for such measures is strict propriety: that probabilities expect themselves to be most accurate.…

Probability · Mathematics 2024-12-11 Catrin Campbell-Moore

The formal definition of reference priors

Reference analysis produces objective Bayesian inference, in the sense that inferential statements depend only on the assumed model and the available data, and the prior distribution used to make an inference is least informative in a…

Statistics Theory · Mathematics 2009-04-02 James O. Berger, José M. Bernardo, Dongchu Sun

Coherency in One-Shot Gesture Recognition

Users' intentions may be expressed through spontaneous gestures, which may have been seen only a few times or never before. Recognizing such gestures involves one-shot gesture learning. While most research has focused on the recognition of the…

Human-Computer Interaction · Computer Science 2017-01-24 Maria Cabrera, Richard Voyles, Juan Wachs

Decomposable Probability-of-Success Metrics in Algorithmic Search

Previous studies have used a specific success metric within an algorithmic search framework to prove machine learning impossibility results. However, this specific success metric prevents us from applying these results to other forms of…

Machine Learning · Statistics 2020-01-06 Tyler Sam, Jake Williams, Abel Tadesse, Huey Sun, George Montanez

Good Classification Measures and How to Find Them

Several performance measures can be used for evaluating classification results: accuracy, F-measure, and many others. Can we say that some of them are better than others, or, ideally, choose one measure that is best in all situations? To…

Machine Learning · Computer Science 2022-01-25 Martijn Gösgens, Anton Zhiyanov, Alexey Tikhonov, Liudmila Prokhorenkova

A Fuzzy Approach to Project Success: Measuring What Matters

This paper introduces a novel approach to project success evaluation by integrating fuzzy logic into an existing construct. Traditional Likert-scale measures often overlook the context-dependent and multifaceted nature of project success.…

Software Engineering · Computer Science 2025-07-18 João Granja-Correia, Remedios Hernández-Linares, Luca Ferranti, Arménio Rego

Predictive Software Measures based on Z Specifications - A Case Study

Estimating the effort and quality of a system is a critical step at the beginning of every software project. It is necessary to have reliable ways of calculating these measures, and it is even better when the calculation can be done as…

Software Engineering · Computer Science 2012-07-11 Andreas Bollin, Abdollah Tabareh

Extending F1 metric, probabilistic approach

This article explores the extension of the well-known F1 score used for assessing the performance of binary classifiers. We propose a new metric using a probabilistic interpretation of precision, recall, specificity, and negative predictive…

Machine Learning · Computer Science 2024-04-17 Mikolaj Sitarz

Towards an Improved Performance Measure for Language Models

In this paper, a first attempt at deriving an improved performance measure for language models, the probability ratio measure (PRM), is described. In a proof-of-concept experiment, it is shown that PRM correlates better with recognition…

cmp-lg · Computer Science 2007-05-23 Joerg P. Ueberla

A Statistical Analysis for Per-Instance Evaluation of Stochastic Optimizers: Avoiding Unreliable Conclusions

A key trait of stochastic optimizers is that multiple runs of the same optimizer in attempting to solve the same problem can produce different results. As a result, their performance is evaluated over several repeats, or runs, on the…

Machine Learning · Computer Science 2026-05-18 Moslem Noori, Elisabetta Valiante, Thomas Van Vaerenbergh, Masoud Mohseni, Ignacio Rozada