Related papers: Using Data Compressors to Construct Rank Tests

200 papers

Random Number Generators play a critical role in a number of important applications. In practice, statistical testing is employed to gather evidence that a generator indeed produces numbers that appear to be random. In this paper, we…

Computational Complexity · Computer Science 2010-03-25 Weiling Chang, Binxing Fang, Xiaochun Yun, Shupeng Wang, Xiangzhan Yu

Detecting and locating changes in highly multivariate data is a major concern in several current statistical applications. In this context, the first contribution of the paper is a novel non-parametric two-sample homogeneity test for…

Statistics Theory · Mathematics 2012-02-13 Alexandre Lung-Yut-Fong, Céline Lévy-Leduc, Olivier Cappé

A massive dataset often consists of a growing number of (potentially) heterogeneous sub-populations. This paper is concerned about testing various forms of heterogeneity arising from massive data. In a general nonparametric framework, a set…

Statistics Theory · Mathematics 2016-01-26 Junwei Lu, Guang Cheng, Han Liu

In this paper we develop a novel nonparametric framework to test the independence of two random variables $\mathbf{X}$ and $\mathbf{Y}$ with unknown respective marginals $H(dx)$ and $G(dy)$ and joint distribution $F(dx dy)$, based on {\it…

Statistics Theory · Mathematics 2024-03-20 Myrto Limnios, Stéphan Clémençon

In this paper, we consider sequential testing over a single-sensor, a single-decision center setup. At each time instant $t$, the sensor gets $k$ samples $(k>0)$ and describes the observed sequence until time $t$ to the decision center over…

Information Theory · Computer Science 2021-09-20 Sadaf Salehkalaibar, Vincent Y. F. Tan

In this paper, we consider testing the homogeneity for proportions in independent binomial distributions especially when data are sparse for large number of groups. We provide broad aspects of our proposed tests such as theoretical studies,…

Statistics Theory · Mathematics 2017-12-14 Junyong Park

Compositional data (i.e., data comprising random variables that sum up to a constant) arises in many applications including microbiome studies, chemical ecology, political science, and experimental designs. Yet when compositional data serve…

Methodology · Statistics 2025-01-03 Ritwik Bhaduri, Siyuan Ma, Lucas Janson

Data depth has been applied as a nonparametric measurement for ranking multivariate samples. In this paper, we focus on homogeneity tests to assess whether two multivariate samples are from the same distribution. There are many data…

Statistics Theory · Mathematics 2023-06-09 Yiting Chen, Wei Lin, Xiaoping Shi

We introduce a general non-parametric independence test between right-censored survival times and covariates, which may be multivariate. Our test statistic has a dual interpretation, first in terms of the supremum of a potentially infinite…

Methodology · Statistics 2021-11-23 Tamara Fernandez, Arthur Gretton, David Rindt, Dino Sejdinovic

In this article, we study the test for independence of two random elements $X$ and $Y$ lying in an infinite dimensional space ${\cal{H}}$ (specifically, a real separable Hilbert space equipped with the inner product $\langle .,…

Statistics Theory · Mathematics 2024-10-15 Suprio Bhar, Subhra Sankar Dhar

Nonparametric two-sample testing is a classical problem in inferential statistics. While modern two-sample tests, such as the edge count test and its variants, can handle multivariate and non-Euclidean data, contemporary gargantuan datasets…

Methodology · Statistics 2023-04-28 Trambak Banerjee, Bhaswar B. Bhattacharya, Gourab Mukherjee

Intuitively, if a density operator has small rank, then it should be easier to estimate from experimental data, since in this case only a few eigenvectors need to be learned. We prove two complementary results that confirm this intuition.…

Quantum Physics · Physics 2012-10-18 Steven T. Flammia, David Gross, Yi-Kai Liu, Jens Eisert

In this paper we propose several variants to perform the independence test between two random elements based on recurrence rates. We will show how to calculate the test statistic in each one of these cases. From simulations we obtain that…

Methodology · Statistics 2020-09-21 Juan Kalemkerian, Diego Fernández

Compression of integer sets and sequences has been extensively studied for settings where elements follow a uniform probability distribution. In addition, methods exist that exploit clustering of elements in order to achieve higher…

Information Theory · Computer Science 2014-02-11 N. Jesper Larsson

Traditionally, data compression deals with the problem of concisely representing a data source, e.g. a sequence of letters, for the purpose of eventual reproduction (either exact or approximate). In this work we are interested in the case…

Information Theory · Computer Science 2013-12-10 Amir Ingber, Tsachy Weissman

Tensor decomposition on big data has attracted significant attention recently. Among the most popular methods is a class of algorithms that leverages compression in order to reduce the size of the tensor and potentially parallelize…

Machine Learning · Computer Science 2018-11-20 Georgios Tsitsikas, Evangelos E. Papalexakis

A framework is developed using techniques from rate distortion theory in statistical testing. The idea is first to do optimal compression according to a certain distortion function and then use information divergence from the compressed…

Information Theory · Computer Science 2009-04-01 Peter Harremoes

Rank correlations have found many innovative applications in the last decade. In particular, suitable rank correlations have been used for consistent tests of independence between pairs of random variables. Using ranks is especially…

Statistics Theory · Mathematics 2021-05-04 Hongjian Shi, Marc Hallin, Mathias Drton, Fang Han

We investigate one/two-sample mean tests for high-dimensional compositional data when the number of variables is comparable with the sample size, as commonly encountered in microbiome research. Existing methods mainly focus on max-type test…

Statistics Theory · Mathematics 2024-04-15 Qianqian Jiang, Wenbo Li, Zeng Li

Compression aims to reduce the size of an input, while maintaining its relevant properties. For multi-parameter persistent homology, compression is a necessary step in any computational pipeline, since standard constructions lead to large…

Algebraic Topology · Mathematics 2022-08-17 Ulderico Fugacci, Michael Kerber, Alexander Rolle