Related papers: Permutation tests under a rotating sampling plan w…

A Permutation Test on Complex Sample Data

Permutation tests are a distribution-free way of performing hypothesis tests. These tests rely on the condition that the observed data are exchangeable among the groups being tested under the null hypothesis. This assumption is easily…

Methodology · Statistics 2017-12-14 Daniell Toth

Composite empirical likelihood for multisample clustered data

In many applications, data cluster. Failing to take the cluster structure into consideration generally leads to underestimated variances of point estimators and inflated type I errors in hypothesis tests. Many circumstance-dependent…

Methodology · Statistics 2025-07-21 Jiahua Chen, Pengfei Li, Yukun Liu, James V. Zidek

Permutation Inference under Multi-way Clustering and Missing Data

Econometric applications with multi-way clustering often feature a small number of effective clusters or heavy-tailed data, making standard cluster-robust and bootstrap inference unreliable in finite samples. In this paper, we develop a…

Methodology · Statistics 2026-01-14 Wenxuan Guo, Panos Toulis, Yuhao Wang

Permutation-based true discovery proportions for functional Magnetic Resonance Imaging cluster analysis

We propose a permutation-based method for testing a large collection of hypotheses simultaneously. Our method provides lower bounds for the number of true discoveries in any selected subset of hypotheses. These bounds are simultaneously…

Applications · Statistics 2023-01-30 Angela Andreella, Jesse Hemerik, Wouter Weeda, Livio Finos, Jelle Goeman

Nonparametric inference for ratios of densities via uniformly valid and powerful permutation tests

We propose the density ratio permutation test, a hypothesis test that assesses whether the ratio between two densities is proportional to a known function based on independent samples from each distribution. The test uses an efficient…

Methodology · Statistics 2026-01-14 Alberto Bordino, Thomas B. Berrett

The Exchangeability Assumption for Permutation Tests of Multiple Regression Models: Implications for Statistics and Data Science Educators

Permutation tests are a powerful and flexible approach to inference via resampling. As computational methods become more ubiquitous in the statistics curriculum, use of permutation tests has become more tractable. At the heart of the…

Methodology · Statistics 2025-06-09 Johanna Hardin, Lauren Quesada, Julie Ye, Nicholas J. Horton

The recursive scheme of clustering

The problem of data clustering is one of the most important in data analysis. It can be problematic when dealing with experimental data characterized by measurement uncertainties and errors. Our paper proposes a recursive scheme for…

Machine Learning · Computer Science 2024-01-12 Alicja Miniak-Górecka, Krzysztof Podlaski, Tomasz Gwizdałła

Clustering functional data with measurement errors: a simulation-based approach

Clustering analysis of functional data, which comprises observations that evolve continuously over time or space, has gained increasing attention across various scientific disciplines. Practical applications often involve functional data…

Methodology · Statistics 2024-06-19 Tingyu Zhu, Lan Xue, Carmen Tekwe, Keith Diaz, Mark Benden, Roger Zoh

Flexible parametric bootstrap for testing homogeneity against clustering and assessing the number of clusters

There are two notoriously hard problems in cluster analysis, estimating the number of clusters, and checking whether the population to be clustered is not actually homogeneous. Given a dataset, a clustering method and a cluster validation…

Methodology · Statistics 2015-02-10 Christian Hennig, Chien-Ju Lin

Cross-Study Replicability in Cluster Analysis

In cancer research, clustering techniques are widely used for exploratory analyses and dimensionality reduction, playing a critical role in the identification of novel cancer subtypes, often with direct implications for patient management.…

Methodology · Statistics 2023-05-11 Lorenzo Masoero, Emma Thomas, Giovanni Parmigiani, Svitlana Tyekucheva, Lorenzo Trippa

A Model-Based Clustering Approach for Bounded Data Using Transformation-Based Gaussian Mixture Models

The clustering of bounded data presents unique challenges in statistical analysis due to the constraints imposed on the data values. This paper introduces a novel method for model-based clustering specifically designed for bounded data.…

Methodology · Statistics 2025-05-16 Luca Scrucca

Cluster randomized trials designed to support generalizable inferences

Background: When planning a cluster randomized trial, evaluators often have access to an enumerated cohort representing the target population of clusters. Practicalities of conducting the trial, such as the need to oversample clusters with…

Methodology · Statistics 2024-09-19 Sarah E. Robertson, Jon A. Steingrimsson, Issa J. Dahabreh

The Impact of Random Models on Clustering Similarity

Clustering is a central approach for unsupervised learning. After clustering is applied, the most fundamental analysis is to quantitatively compare clusterings. Such comparisons are crucial for the evaluation of clustering methods as well…

Machine Learning · Statistics 2017-10-03 Alexander J Gates, Yong-Yeol Ahn

Skewed Distributions or Transformations? Modelling Skewness for a Cluster Analysis

Because of its mathematical tractability, the Gaussian mixture model holds a special place in the literature for clustering and classification. For all its benefits, however, the Gaussian mixture model poses problems when the data is skewed…

Applications · Statistics 2020-11-19 Michael P. B. Gallaugher, Paul D. McNicholas, Volodymyr Melnykov, Xuwen Zhu

Post-clustering difference testing: valid inference and practical considerations

Clustering is part of unsupervised analysis methods that consist in grouping samples into homogeneous and separate subgroups of observations also called clusters. To interpret the clusters, statistical hypothesis testing is often used to…

Methodology · Statistics 2022-10-25 Benjamin Hivert, Denis Agniel, Rodolphe Thiébaut, Boris P Hejblum

Clustering with Confidence: Finding Clusters with Statistical Guarantees

Clustering is a widely used unsupervised learning method for finding structure in the data. However, the resulting clusters are typically presented without any guarantees on their robustness; slightly changing the used data sample or…

Machine Learning · Statistics 2017-01-02 Andreas Henelius, Kai Puolamäki, Henrik Boström, Panagiotis Papapetrou

A semiparametric model for cluster data

In the analysis of cluster data, the regression coefficients are frequently assumed to be the same across all clusters. This hampers the ability to study the varying impacts of factors on each cluster. In this paper, a semiparametric model…

Statistics Theory · Mathematics 2009-08-25 Wenyang Zhang, Jianqing Fan, Yan Sun

Clustering Data with Nonignorable Missingness using Semi-Parametric Mixture Models

We are concerned in clustering continuous data sets subject to non-ignorable missingness. We perform clustering with a specific semi-parametric mixture, under the assumption of conditional independence given the component. The mixture model…

Methodology · Statistics 2021-07-20 Marie Du Roy de Chaumaray, Matthieu Marbac

Resampling Method For Unsupervised Estimation Of Cluster Validity

We introduce a method for validation of results obtained by clustering analysis of data. The method is based on resampling the available data. A figure of merit that measures the stability of clustering solutions against resampling is…

Computational Physics · Physics 2007-05-23 Erel Levine, Eytan Domany

A general trimming approach to robust Cluster Analysis

We introduce a new method for performing clustering with the aim of fitting clusters with different scatters and weights. It is designed by allowing to handle a proportion $\alpha$ of contaminating data to guarantee the robustness of the…

Statistics Theory · Mathematics 2008-12-18 Luis A. García-Escudero, Alfonso Gordaliza, Carlos Matrán, Agustin Mayo-Iscar