English
Related papers

Related papers: Statistical Classification via Robust Hypothesis T…

200 papers

We consider Bayesian multiple statistical classification problem in the case where the unknown source distributions are estimated from the labeled training sequences, then the estimates are used as nominal distributions in a robust…

Information Theory · Computer Science 2021-10-11 Hüseyin Afşer

Experiments often yield non-identically distributed data for statistical analysis. Tests of hypothesis under such set-ups are generally performed using the likelihood ratio test, which is non-robust with respect to outliers and model…

Statistics Theory · Mathematics 2017-07-25 Abhik Ghosh , Ayanendranath Basu

In multiple classification, one aims to determine whether a testing sequence is generated from the same distribution as one of the M training sequences or not. Unlike most of existing studies that focus on discrete-valued sequences with…

Machine Learning · Statistics 2024-10-30 Lina Zhu , Lin Zhou

This paper considers the problem of robust hypothesis testing under non-identically distributed data. We propose Wald-type tests for both simple and composite hypothesis for independent but non-homogeneous observations based on the robust…

Methodology · Statistics 2019-05-09 Ayanendranath Basu , Abhik Ghosh , Nirian Martin , Leandro Pardo

Training a deep neural network (DNN) often involves stochastic optimization, which means each run will produce a different model. Several works suggest this variability is negligible when models have the same performance, which in the case…

Machine Learning · Statistics 2023-10-03 Sinjini Banerjee , Reilly Cannon , Tim Marrinan , Tony Chiang , Anand D. Sarwate

Model selection is a central task in statistics, but standard methods are not robust in misspecified settings where the true data-generating process (DGP) is not in the set of candidate models. The key limitation is that existing methods --…

Methodology · Statistics 2026-03-10 Jongwoo Choi , Neil A. Spencer , Jeffrey W. Miller

Robust classification algorithms have been developed in recent years with great success. We take advantage of this development and recast the classical two-sample test problem in the framework of classification. Based on the estimates of…

Statistics Theory · Mathematics 2019-09-18 Haiyan Cai , Bryan Goggin , Qingtang Jiang

We study the question of testing structured properties (classes) of discrete distributions. Specifically, given sample access to an arbitrary distribution $D$ over $[n]$ and a property $\mathcal{P}$, the goal is to distinguish between…

Data Structures and Algorithms · Computer Science 2016-01-22 Clément L. Canonne , Ilias Diakonikolas , Themis Gouleakis , Ronitt Rubinfeld

Robust tests of general composite hypothesis under non-identically distributed observations is always a challenge. Ghosh and Basu (2018, Statistica Sinica, 28, 1133--1155) have proposed a new class of test statistics for such problems based…

Statistics Theory · Mathematics 2019-01-08 Abhik Ghosh , Ayanendranath Basu

The problem of robust binary hypothesis testing is studied. Under both hypotheses, the data-generating distributions are assumed to belong to uncertainty sets constructed through moments; in particular, the sets contain distributions whose…

Statistics Theory · Mathematics 2024-01-09 Akshayaa Magesh , Zhongchang Sun , Venugopal V. Veeravalli , Shaofeng Zou

Training models that perform well under distribution shifts is a central challenge in machine learning. In this paper, we introduce a modeling framework where, in addition to training data, we have partial structural knowledge of the…

Machine Learning · Computer Science 2021-10-28 Tobias Sutter , Andreas Krause , Daniel Kuhn

When sample data are governed by an unknown sequence of independent but possibly non-identical distributions, the data-generating process (DGP) in general cannot be perfectly identified from the data. For making decisions facing such…

Theoretical Economics · Economics 2022-05-11 Xiaoyu Cheng

We discuss recently developed methods that quantify the stability and generalizability of statistical findings under distributional changes. In many practical problems, the data is not drawn i.i.d. from the target population. For example,…

Methodology · Statistics 2023-10-05 Dominik Rothenhäusler , Peter Bühlmann

Currently, statistical tests for random number generators (RNGs) are widely used in practice, and some of them are even included in information security standards. But despite the popularity of RNGs, consistent tests are known only for…

Statistics Theory · Mathematics 2021-05-17 Boris Ryabko

Hypothesis testing in singular statistical models is often regarded as inherently problematic due to non-identifiability and degeneracy of the Fisher information. We show that the fundamental obstruction to testing in such models is not…

Statistics Theory · Mathematics 2026-03-02 Sean Plummer

Contemporary machine learning applications often involve classification tasks with many classes. Despite their extensive use, a precise understanding of the statistical properties and behavior of classification algorithms is still missing,…

Machine Learning · Computer Science 2020-11-17 Christos Thrampoulidis , Samet Oymak , Mahdi Soltanolkotabi

While the traditional viewpoint in machine learning and statistics assumes training and testing samples come from the same population, practice belies this fiction. One strategy -- coming from robust statistics and optimization -- is thus…

Machine Learning · Statistics 2024-07-08 Maxime Cauchois , Suyash Gupta , Alnur Ali , John C. Duchi

Distributionally Robust Supervised Learning (DRSL) is necessary for building reliable machine learning systems. When machine learning is deployed in the real world, its performance can be significantly degraded because test data may follow…

Machine Learning · Statistics 2018-07-24 Weihua Hu , Gang Niu , Issei Sato , Masashi Sugiyama

Randomly censored survival data are frequently encountered in applied sciences including biomedical or reliability applications and clinical trial analyses. Testing the significance of statistical hypotheses is crucial in such analyses to…

Methodology · Statistics 2019-01-08 Abhik Ghosh , Ayanendranath Basu , Leandro Pardo

Approaches based on deep neural networks have achieved striking performance when testing data and training data share similar distribution, but can significantly fail otherwise. Therefore, eliminating the impact of distribution shifts…

Machine Learning · Computer Science 2021-04-19 Xingxuan Zhang , Peng Cui , Renzhe Xu , Linjun Zhou , Yue He , Zheyan Shen
‹ Prev 1 2 3 10 Next ›