Related papers: Clustering processes

Clustering processes

The problem of clustering is considered, for the case when each data point is a sample generated by a stationary ergodic process. We propose a very natural asymptotic notion of consistency, and show that simple consistent algorithms exist,…

Machine Learning · Computer Science 2010-05-31 Daniil Ryabko

Clustering piecewise stationary processes

The problem of time-series clustering is considered in the case where each data-point is a sample generated by a piecewise stationary ergodic process. Stationary processes are perhaps the most general class of processes considered in…

Machine Learning · Statistics 2019-06-27 Azadeh Khaleghi, Daniil Ryabko

Asymptotic nonparametric statistical analysis of stationary time series

Stationarity is a very general, qualitative assumption that can be assessed on the basis of application specifics. It is thus a rather attractive assumption to base statistical analysis on, especially for problems for which less general…

Statistics Theory · Mathematics 2019-04-02 Daniil Ryabko

A consistent clustering-based approach to estimating the number of change-points in highly dependent time-series

The problem of change-point estimation is considered under a general framework where the data are generated by unknown stationary ergodic process distributions. In this context, the consistent estimation of the number of change-points is…

Machine Learning · Statistics 2013-02-15 Azadeh Khaleghi, Daniil Ryabko

Some Developments in Clustering Analysis on Stochastic Processes

We review some developments in clustering stochastic processes and come to the conclusion that asymptotically consistent clustering algorithms can be obtained when the processes are ergodic and the dissimilarity measure satisfies the…

Machine Learning · Statistics 2019-08-07 Qidi Peng, Nan Rao, Ran Zhao

Spontaneous clustering in theoretical and some empirical stationary processes

In a stationary ergodic process, clustering is defined as the tendency of events to appear in series of increased frequency separated by longer breaks. Such behavior, contradicting the theoretical "unbiased behavior" with exponential…

Probability · Mathematics 2008-10-27 Tomasz Downarowicz, Yves Lacroix, Didier Léandri

A Computational Theory and Semi-Supervised Algorithm for Clustering

A computational theory for clustering and a semi-supervised clustering algorithm are presented. Clustering is defined to be the obtainment of groupings of data such that each group contains no anomalies with respect to a chosen grouping…

Machine Learning · Computer Science 2025-07-17 Nassir Mohammad

Clustering with Confidence: Finding Clusters with Statistical Guarantees

Clustering is a widely used unsupervised learning method for finding structure in the data. However, the resulting clusters are typically presented without any guarantees on their robustness; slightly changing the used data sample or…

Machine Learning · Statistics 2017-01-02 Andreas Henelius, Kai Puolamäki, Henrik Boström, Panagiotis Papapetrou

Consistency of spectral clustering

Consistency is a key property of all statistical procedures analyzing randomly sampled data. Surprisingly, despite decades of work, little is known about consistency of most clustering algorithms. In this paper we investigate consistency of…

Statistics Theory · Mathematics 2008-12-18 Ulrike von Luxburg, Mikhail Belkin, Olivier Bousquet

Stable Chimeras and Independently Synchronizable Clusters

Cluster synchronization is a phenomenon in which a network self-organizes into a pattern of synchronized sets. It has been shown that diverse patterns of stable cluster synchronization can be captured by symmetries of the network. Here we…

Pattern Formation and Solitons · Physics 2017-08-30 Young Sul Cho, Takashi Nishikawa, Adilson E. Motter

Selecting the Number of Clusters $K$ with a Stability Trade-off: an Internal Validation Criterion

Model selection is a major challenge in non-parametric clustering. There is no universally accepted way to evaluate clustering results for the obvious reason that no ground truth is available. The difficulty of finding a universal evaluation…

Machine Learning · Computer Science 2023-05-18 Alex Mourer, Florent Forest, Mustapha Lebbah, Hanane Azzag, Jérôme Lacaille

Consensus clustering in complex networks

The community structure of complex networks reveals both their organization and hidden relationships among their constituents. Most community detection methods currently available are not deterministic, and their results typically depend on…

Physics and Society · Physics 2012-03-29 Andrea Lancichinetti, Santo Fortunato

Clustering Stable Instances of Euclidean k-means

The Euclidean k-means problem is arguably the most widely-studied clustering problem in machine learning. While the k-means objective is NP-hard in the worst-case, practitioners have enjoyed remarkable success in applying heuristics like…

Machine Learning · Computer Science 2017-12-05 Abhratanu Dutta, Aravindan Vijayaraghavan, Alex Wang

Clustering -- Basic concepts and methods

We review clustering as an analysis tool and the underlying concepts from an introductory perspective. What is clustering and how can clusterings be realised programmatically? How can data be represented and prepared for a clustering task?…

Machine Learning · Computer Science 2022-12-05 Jan-Oliver Felix Kapp-Joswig, Bettina G. Keller

Demystifying Information-Theoretic Clustering

We propose a novel method for clustering data which is grounded in information-theoretic principles and requires no parametric assumptions. Previous attempts to use information theory to define clusters in an assumption-free way are based…

Machine Learning · Computer Science 2014-02-07 Greg Ver Steeg, Aram Galstyan, Fei Sha, Simon DeDeo

Clustering with Label Consistency

Designing efficient, effective, and consistent metric clustering algorithms is a significant challenge attracting growing attention. Traditional approaches focus on the stability of cluster centers; unfortunately, this neglects the…

Data Structures and Algorithms · Computer Science 2025-12-23 Diptarka Chakraborty, Hendrik Fichtenberger, Bernhard Haeupler, Silvio Lattanzi, Ashkan Norouzi-Fard, Ola Svensson

Consensus Clustering: An Embedding Perspective, Extension and Beyond

Consensus clustering fuses diverse basic partitions (i.e., clustering results obtained from conventional clustering methods) into an integrated one, which has attracted increasing attention in both academic and industrial areas due to its…

Machine Learning · Computer Science 2019-06-04 Hongfu Liu, Zhiqiang Tao, Zhengming Ding

Clustering Stability: An Overview

A popular method for selecting the number of clusters is based on stability arguments: one chooses the number of clusters such that the corresponding clustering results are "most stable". In recent years, a series of papers has analyzed the…

Machine Learning · Statistics 2010-07-08 Ulrike von Luxburg

Information based clustering

In an age of increasingly large data sets, investigators in many different disciplines have turned to clustering as a tool for data analysis and exploration. Existing clustering methods, however, typically depend on several nontrivial…

Quantitative Methods · Quantitative Biology 2009-11-11 Noam Slonim, Gurinder Singh Atwal, Gasper Tkacik, William Bialek

Consistency and Inconsistency in $K$-Means Clustering

A celebrated result of Pollard proves asymptotic consistency for $k$-means clustering when the population distribution has finite variance. In this work, we point out that the population-level $k$-means clustering problem is, in fact,…

Statistics Theory · Mathematics 2025-07-09 Moïse Blanchard, Adam Quinn Jaffe, Nikita Zhivotovskiy