Related papers: A statistical Testing Procedure for Validating Cla…

200 papers

Classification is a fundamental problem in machine learning and data mining. During the past decades, numerous classification methods have been presented based on different principles. However, most existing classifiers cast the…

Machine Learning · Computer Science 2019-04-23 Zengyou He, Chaohua Sheng, Yan Liu, Quan Zou

Two-sample tests evaluate whether two samples are realizations of the same distribution (the null hypothesis) or two different distributions (the alternative hypothesis). We consider a new setting for this problem where sample features are…

Machine Learning · Computer Science 2022-07-20 Weizhi Li, Gautam Dasarathy, Karthikeyan Natesan Ramamurthy, Visar Berisha

Unbiased, label-free proteomics is becoming a powerful technique for measuring protein expression in almost any biological sample. The output of these measurements after preprocessing is a collection of features and their associated…

In supervised learning, automatically assessing the quality of the labels before any learning takes place remains an open research question. In certain particular cases, hypothesis testing procedures have been proposed to assess whether a…

Machine Learning · Computer Science 2023-12-19 Weisong Yang, Rafael Poyiadzi, Niall Twomey, Raul Santos Rodriguez

Deep models trained with noisy labels are prone to over-fitting and struggle in generalization. Most existing solutions are based on an ideal assumption that the label noise is class-conditional, i.e., instances of the same class share the…

Computer Vision and Pattern Recognition · Computer Science 2022-08-01 Ganlong Zhao, Guanbin Li, Yipeng Qin, Feng Liu, Yizhou Yu

Annotating multi-class instances is a crucial task in the field of machine learning. Unfortunately, identifying the correct class label from a long sequence of candidate labels is time-consuming and laborious. To alleviate this problem, we…

Machine Learning · Computer Science 2025-12-05 Meng Wei, Zhongnian Li, Yong Zhou, Qiaoyu Guo, Xinzheng Xu

A new method, with an application program in Matlab code, is proposed for testing item performance models on empirical databases. This method uses data intraclass correlation statistics as expected correlations to which one compares simple…

Testing whether the observed data conforms to a purported model (probability distribution) is a basic and fundamental statistical task, and one that is by now well understood. However, the standard formulation, identity testing, fails to…

Statistics Theory · Mathematics 2021-05-06 Clément L. Canonne, Karl Wimmer

In a shotgun proteomics experiment, proteins are the most biologically meaningful output. The success of proteomics studies depends on the ability to accurately and efficiently identify proteins. Many methods have been proposed to…

Quantitative Methods · Quantitative Biology 2012-11-30 Chao Yang, Zengyou He, Weichuan Yu

Instance-level image classification tasks have traditionally relied on single-instance labels to train models, e.g., few-shot learning and transfer learning. However, set-level coarse-grained labels that capture relationships among…

Machine Learning · Computer Science 2023-11-21 Renyu Zhang, Aly A. Khan, Yuxin Chen, Robert L. Grossman

Partial-label learning is a kind of weakly-supervised learning with inexact labels, where for each training example, we are given a set of candidate labels instead of only one true label. Recently, various approaches on partial-label…

Machine Learning · Computer Science 2022-08-30 Zhenguo Wu, Jiaqi Lv, Masashi Sugiyama

Learning from Label Proportions (LLP) is a weakly supervised learning method that aims to perform instance classification from training data consisting of pairs of bags containing multiple instances and the class label proportions within…

Machine Learning · Computer Science 2023-02-22 Ryoma Kobayashi, Yusuke Mukuta, Tatsuya Harada

Classification and clustering are both important topics in statistical learning. A natural question herein is whether predefined classes are really different from one another, or whether clusters are really there. Specifically, we may be…

Machine Learning · Statistics 2015-09-22 Qiyi Lu, Xingye Qiao

The ultimate target of proteomics identification is to identify and quantify the protein in the organism. Mass spectrometry (MS) based on label-free protein quantitation has mainly focused on analysis of peptide spectral counts and ion peak…

Quantitative Methods · Quantitative Biology 2013-12-05 Biao He, Baochang Zhang, Yan Fu

This paper introduces a statistical test inferring whether a variable allows separating two classes by means of a single critical value. Its test statistic is the prediction error of a nonparametric threshold classifier. While this approach…

Methodology · Statistics 2017-07-17 Fabian Schroeder

In this contribution, we augment the metric learning setting by introducing a parametric pseudo-distance, trained jointly with the encoder. Several interpretations are thus drawn for the learned distance-like model's output. We first show…

Machine Learning · Computer Science 2020-08-17 Joao Monteiro, Isabela Albuquerque, Jahangir Alam, R Devon Hjelm, Tiago Falk

Object detection is a task that performs position identification and label classification of objects in images or videos. The information obtained through this process plays an essential role in various tasks in the field of computer…

Computer Vision and Pattern Recognition · Computer Science 2023-09-06 Heewon Lee, Sangtae Ahn

Sampling algorithms play a pivotal role in probabilistic AI. However, verifying if a sampler program indeed samples from the claimed distribution is a notoriously hard problem. Provably correct testers like Barbarik, Teq, Flash, CubeProbe…

Data Structures and Algorithms · Computer Science 2025-12-09 Rishiraj Bhattacharyya, Sourav Chakraborty, Yash Pote, Uddalok Sarkar, Sayantan Sen

Distance-based unsupervised text classification is a method within text classification that leverages the semantic similarity between a label and a text to determine label relevance. This method provides numerous benefits, including fast…

Computation and Language · Computer Science 2025-10-14 Jens Van Nooten, Andriy Kosar, Guy De Pauw, Walter Daelemans

Motivation: Assigning statistical significance accurately has become increasingly important as meta data of many types, often assembled in hierarchies, are constructed and combined for further biological analyses. Statistical inaccuracy of…

Quantitative Methods · Quantitative Biology 2014-07-25 Gelio Alves, Yi-Kuo Yu