Related papers: Binary Classification Tests, Imperfect Standards, …
Binary classification is a task that involves the classification of data into one of two distinct classes. It is widely utilized in various fields. However, conventional classifiers tend to make overconfident predictions for data that…
In supervised learning, we often face with ambiguous (A) samples that are difficult to label even by domain experts. In this paper, we consider a binary classification problem in the presence of such A samples. This problem is substantially…
Gathering observational data for medical decision-making often involves uncertainties arising from both type I (false positive)and type II (false negative) errors. In this work, we develop a statistical model to study how medical…
Medical researchers have solved the problem of estimating the sensitivity and specificity of binary medical diagnostic tests without gold standard tests for comparison. That problem is the same as estimating confusion matrices for…
Formulating accurate and robust classification strategies is a key challenge of developing diagnostic and antibody tests. Methods that do not explicitly account for disease prevalence and uncertainty therein can lead to significant…
Sharing medical reports is essential for patient-centered care. A recent line of work has focused on automatically generating reports with NLP methods. However, different audiences have different purposes when writing/reading medical…
We investigate the sample complexity of mutual information and conditional mutual information testing. For conditional mutual information testing, given access to independent samples of a triple of random variables $(A, B, C)$ with unknown…
In many biological networks the responses of individual elements are ambiguous. We consider a scenario in which many sensors respond to a shared signal, each with limited information capacity, and ask that the outputs together convey as…
Debunking misinformation is an important and time-critical task as there could be adverse consequences when misinformation is not quashed promptly. However, the usual supervised approach to debunking via misinformation classification…
Informally, "Information Inconsistency" is the property that has been observed in many Bayesian hypothesis testing and model selection procedures whereby the Bayesian conclusion does not become definitive when the data seems to become…
Diagnostic tests play a crucial role in medical care. Thus any new diagnostic tests must undergo a thorough evaluation. New diagnostic tests are evaluated in comparison with the respective gold standard tests. The performance of binary…
This paper resolves two open problems from a recent paper, arXiv:2403.16981, concerning the sample complexity of distributed simple binary hypothesis testing under information constraints. The first open problem asks whether interaction…
Hypothesis testing in singular statistical models is often regarded as inherently problematic due to non-identifiability and degeneracy of the Fisher information. We show that the fundamental obstruction to testing in such models is not…
A long noted difficulty when assessing the reliability (or calibration) of forecasting systems is that reliability, in general, is a hypothesis not about a finite dimensional parameter but about an entire functional relationship. A…
External information, such as prior information or expert opinions, can play an important role in the design, analysis and interpretation of clinical trials. However, little attention has been devoted thus far to incorporating external…
Consider the problem where a statistician in a two-node system receives rate-limited information from a transmitter about marginal observations of a memoryless process generated from two possible distributions. Using its own observations,…
The main purpose of this paper is to present new families of test statistics for studying the problem of goodness-of-fit of some data to a latent class model for binary data. The families of test statistics introduced are based on…
Many practical studies rely on hypothesis testing procedures applied to data sets with missing information. An important part of the analysis is to determine the impact of the missing data on the performance of the test, and this can be…
Age dependent performance disparities in medical image classification often arise because age acts as a confounder, linking imaging morphology with disease prevalence. In practice, disparities can manifest as overdiagnosis at ages where…
The problem of binary hypothesis testing between two probability measures is considered. New sharp bounds are derived for the best achievable error probability of such tests based on independent and identically distributed observations.…