Related papers: Isotropic PCA and Affine-Invariant Clustering

200 papers

We give an efficient algorithm for robustly clustering a mixture of two arbitrary Gaussians, a central open problem in the theory of computationally efficient robust estimation, assuming only that the means of the component Gaussians…

Data Structures and Algorithms · Computer Science 2020-06-02 He Jia, Santosh Vempala

This paper considers a canonical clustering problem where one receives unlabeled samples drawn from a balanced mixture of two elliptical distributions and aims for a classifier to estimate the labels. Many popular methods including PCA and…

Machine Learning · Statistics 2021-11-30 Kaizheng Wang, Yuling Yan, Mateo Díaz

In several application domains, high-dimensional observations are collected and then analysed in search for naturally occurring data clusters which might provide further insights about the nature of the problem. In this paper we describe a…

Machine Learning · Statistics 2012-03-07 Brian McWilliams, Giovanni Montana

We consider the problem of outlier robust PCA (OR-PCA) where the goal is to recover principal directions despite the presence of outlier data points. That is, given a data matrix $M^*$, where $(1-\alpha)$ fraction of the points are noisy…

Machine Learning · Computer Science 2017-02-21 Yeshwanth Cherapanamjeri, Prateek Jain, Praneeth Netrapalli

We propose a spectral clustering method based on local principal components analysis (PCA). After performing local PCA in selected neighborhoods, the algorithm builds a nearest neighbor graph weighted according to a discrepancy between the…

Machine Learning · Statistics 2019-04-09 Ery Arias-Castro, Gilad Lerman, Teng Zhang

Fourier PCA is Principal Component Analysis of a matrix obtained from higher order derivatives of the logarithm of the Fourier transform of a distribution. We make this method algorithmic by developing a tensor decomposition method for a…

Machine Learning · Computer Science 2014-07-01 Navin Goyal, Santosh Vempala, Ying Xiao

We study the clustering problem for mixtures of bounded covariance distributions, under a fine-grained separation assumption. Specifically, given samples from a $k$-component mixture distribution $D = \sum_{i =1}^k w_i P_i$, where each $w_i…

Machine Learning · Computer Science 2023-12-20 Ilias Diakonikolas, Daniel M. Kane, Jasper C. H. Lee, Thanasis Pittas

In this contribution, the clustering procedure based on the K-Means algorithm is studied as an inverse problem, which is a special case of the ill-posed problems. The attempts to improve the quality of the clustering inverse problem drive to…

Numerical Analysis · Mathematics 2022-11-16 Alberto Arturo Vergani

In order to identify clusters of objects with features transformed by unknown affine transformations, we develop a Bayesian cluster process which is invariant with respect to certain linear transformations of the feature space and able to…

Methodology · Statistics 2016-12-01 Hsin-Hsiung Huang, Jie Yang

Principal component analysis (PCA) is an important tool in exploring data. The conventional approach to PCA leads to a solution which favours the structures with large variances. This is sensitive to outliers and could obfuscate interesting…

Methodology · Statistics 2015-06-16 A. A. Akinduko, A. N. Gorban

Distributed algorithms and theories are called for in this era of big data. Under weaker local signal-to-noise ratios, we improve upon the celebrated one-round distributed principal component analysis (PCA) algorithm designed in the spirit…

Methodology · Statistics 2025-07-01 ZeYu Li, Xinsheng Zhang, Wang Zhou

We consider a clustering problem where we observe feature vectors $X_i \in R^p$, $i = 1, 2, \ldots, n$, from $K$ possible classes. The class labels are unknown and the main interest is to estimate them. We are primarily interested in the…

Methodology · Statistics 2015-12-17 Jiashun Jin, Wanjie Wang

Principal Component Analysis (PCA) is a workhorse of modern data science. While PCA assumes the data conforms to Euclidean geometry, for specific data types, such as hierarchical and cyclic data structures, other spaces are more…

Machine Learning · Statistics 2024-07-11 Puoya Tabaghi, Michael Khanzadeh, Yusu Wang, Sivash Mirarab

We study the efficient learnability of high-dimensional Gaussian mixtures in the outlier-robust setting, where a small constant fraction of the data is adversarially corrupted. We resolve the polynomial learnability of this problem when the…

Data Structures and Algorithms · Computer Science 2020-05-14 Ilias Diakonikolas, Samuel B. Hopkins, Daniel Kane, Sushrut Karmalkar

Distributed computing is a standard way to scale up machine learning and data science algorithms to process large amounts of data. In such settings, avoiding communication amongst machines is paramount for achieving high performance. Rather…

Machine Learning · Statistics 2021-05-04 Vasileios Charisopoulos, Austin R. Benson, Anil Damle

We present an efficient and accurate algorithm for principal component analysis (PCA) of a large set of two dimensional images, and, for each image, the set of its uniform rotations in the plane and its reflection. The algorithm starts by…

Computer Vision and Pattern Recognition · Computer Science 2014-02-17 Zhizhen Zhao, Amit Singer

Principal component analysis (PCA) is a widespread technique for data analysis that relies on the covariance-correlation matrix of the analyzed data. However, to work properly with high-dimensional data, PCA poses severe mathematical…

Quantitative Methods · Quantitative Biology 2018-10-18 Luigi Leonardo Palese

Algorithmic fairness in clustering aims to balance the proportions of instances assigned to each cluster with respect to a given sensitive attribute. While recently developed fair clustering algorithms optimize clustering objectives under…

Machine Learning · Computer Science 2025-10-24 Kunwoong Kim, Jihu Lee, Sangchul Park, Yongdai Kim

Recent advances in image clustering typically focus on learning better deep representations. In contrast, we present an orthogonal approach that does not rely on abstract features but instead learns to predict image transformations and…

Computer Vision and Pattern Recognition · Computer Science 2020-10-29 Tom Monnier, Thibault Groueix, Mathieu Aubry

Many clustering problems in computer vision and other contexts are also classification problems, where each cluster shares a meaningful label. Subspace clustering algorithms in particular are often applied to problems that fit this…

Machine Learning · Computer Science 2017-09-15 John Lipor, Laura Balzano