Related papers: Statistical Analysis of Data Repeatability Measure…
The rankability of data is a recently proposed problem that considers the ability of a dataset, represented as a graph, to produce a meaningful ranking of the items it contains. To study this concept, a number of rankability measures have…
Prior proposals for cumulative statistics suggest making tiny random perturbations to the scores (independent variables in a regression) in order to ensure the scores' uniqueness. Uniqueness means that no score for any member of the…
Recent advances have shown that statistical tests for the rank of cross-covariance matrices play an important role in causal discovery. These rank tests include partial correlation tests as special cases and provide further graphical…
Machine learning and deep learning classification models are data-driven, and the model and the data jointly determine their classification performance. It is biased to evaluate the model's performance only based on the classifier accuracy…
Regression evaluation has been performed for decades. Some metrics have been identified to be robust against shifting and scaling of the data but considering the different distributions of data is much more difficult to address (imbalance…
Meta-analysis is routinely performed in many scientific disciplines. This analysis is attractive since discoveries are possible even when all the individual studies are underpowered. However, the meta-analytic discoveries may be entirely…
There is an increasing number of potential biomarkers that could allow for early assessment of treatment response or disease progression. However, measurements of quantitative biomarkers are subject to random variability. Hence, differences…
We develop inference procedures robust to general forms of weak dependence. The procedures utilize test statistics constructed by resampling in a manner that does not depend on the unknown correlation structure of the data. We prove that…
The problem of detecting changes in covariance for a single pair of features has been studied in some detail, but may be limited in importance or general applicability. In contrast, testing equality of covariance matrices of a {\it set} of…
Measurement involves the determination of quantitative estimates of physical quantities from experiment, along with estimates of their associated uncertainties. Herewith an experimental system model is the key to extracting information from…
Many testing problems are readily amenable to randomised tests such as those employing data splitting. However despite their usefulness in principle, randomised tests have obvious drawbacks. Firstly, two analyses of the same dataset may…
In many application domains such as medicine, information retrieval, cybersecurity, social media, etc., datasets used for inducing classification models often have an unequal distribution of the instances of each class. This situation,…
For a general standardized testing algorithm designed to evaluate a specific aspect of a robot's performance, several key expectations are commonly imposed. Beyond accuracy (i.e., closeness to a typically unknown ground-truth reference) and…
Machine learning classification tasks often benefit from predicting a set of possible labels with confidence scores to capture uncertainty. However, existing methods struggle with the high-dimensional nature of the data and the lack of…
Policy gradient methods in reinforcement learning have become increasingly prevalent for state-of-the-art performance in continuous control tasks. Novel methods typically benchmark against a few key algorithms such as deep deterministic…
Replicability analysis aims to identify the findings that replicated across independent studies that examine the same features. We provide powerful novel replicability analysis procedures for two studies for FWER and for FDR control on the…
With growing applications of Machine Learning (ML) techniques in the real world, it is highly important to ensure that these models work in an equitable manner. One main step in ensuring fairness is to effectively measure fairness, and to…
In two influential contributions, Rosenbaum (2005, 2020) advocated for using the distances between component-wise ranks, instead of the original data values, to measure covariate similarity when constructing matching estimators of average…
Testing the independence between random vectors is a fundamental problem in statistics. Distance correlation, a recently popular dependence measure, is universally consistent for testing independence against all distributions with finite…
How should researchers analyze randomized experiments in which the main outcome is latent and measured in multiple ways but each measure contains some degree of error? We first identify a critical study-specific noncomparability problem in…