Related papers: Accuracy and Robustness of Clustering Algorithms f…

200 papers

One of the most widely used techniques for data clustering is agglomerative clustering. Such algorithms have been long used across many different fields ranging from computational biology to social sciences to computer vision in part…

Machine Learning · Computer Science 2014-07-15 Maria-Florina Balcan, Yingyu Liang, Pramod Gupta

Clustering is a fundamental data mining tool that aims to divide data into groups of similar items. Generally, intuition about clustering reflects the ideal case -- exact data sets endowed with flawless dissimilarity between individual…

Machine Learning · Computer Science 2016-01-25 Margareta Ackerman, Jarrod Moore

With rapidly increasing data, clustering algorithms are important tools for data analytics in modern research. They have been successfully applied to a wide range of domains; for instance, bioinformatics, speech recognition, and financial…

Data Structures and Algorithms · Computer Science 2015-12-01 Ka-Chun Wong

Recently, there has been substantial interest in clustering research that takes a beyond worst-case approach to the analysis of algorithms. The typical idea is to design a clustering algorithm that outputs a near-optimal solution, provided…

Data Structures and Algorithms · Computer Science 2018-12-31 Maria-Florina Balcan, Colin White

Clustering is a widely used unsupervised learning method for finding structure in the data. However, the resulting clusters are typically presented without any guarantees on their robustness; slightly changing the used data sample or…

Machine Learning · Statistics 2017-01-02 Andreas Henelius, Kai Puolamäki, Henrik Boström, Panagiotis Papapetrou

A major challenge in cluster analysis is that the number of data clusters is mostly unknown and it must be estimated prior to clustering the observed data. In real-world applications, the observed data is often subject to heavy tailed noise…

Machine Learning · Statistics 2020-05-06 Freweyni K. Teklehaymanot, Michael Muma, Abdelhak M. Zoubir

We study the canonical fair clustering problem where each cluster is constrained to have close to population-level representation of each group. Despite significant attention, the salient issue of having incomplete knowledge about the group…

Machine Learning · Computer Science 2024-11-21 Sharmila Duppala, Juan Luque, John P. Dickerson, Seyed A. Esmaeili

One basic requirement of many studies is the necessity of classifying data. Clustering is a proposed method for summarizing networks. Clustering methods can be divided into two categories named model-based approaches and algorithmic…

Machine Learning · Computer Science 2013-02-19 Raheleh Namayandeh, Farzad Didehvar, Zahra Shojaei

Machine learning systems increasingly depend on pipelines of multiple algorithms to provide high quality and well structured predictions. This paper argues interaction effects between clustering and prediction (e.g. classification,…

Machine Learning · Statistics 2019-01-01 Matt Barnes, Artur Dubrawski

In this paper we introduce two procedures for variable selection in cluster analysis and classification rules. One is mainly oriented to detect the noisy non-informative variables, while the other deals also with multicollinearity. A…

Statistics Theory · Mathematics 2023-12-29 Ricardo Fraiman, Ana Justel, Marcela Svarc

In empirical work it is common to estimate parameters of models and report associated standard errors that account for "clustering" of units, where clusters are defined by factors such as geography. Clustering adjustments are typically…

Statistics Theory · Mathematics 2022-09-21 Alberto Abadie, Susan Athey, Guido Imbens, Jeffrey Wooldridge

Clustering analysis is one of the most widely used statistical tools in many emerging areas such as microarray data analysis. For microarray and other high-dimensional data, the presence of many noise variables may mask underlying…

Machine Learning · Statistics 2008-03-26 Benhuai Xie, Wei Pan, Xiaotong Shen

Recently, sparse subspace clustering has been a valid tool to deal with high-dimensional data. There are two essential steps in the framework of sparse subspace clustering. One is solving the coefficient matrix of data, and the other is…

Computer Vision and Pattern Recognition · Computer Science 2019-12-24 Wen-Jin Fu, Xiao-Jun Wu, He-Feng Yin, Wen-Bo Hu

Performance of clustering algorithms is evaluated with the help of accuracy metrics. There is a great diversity of clustering algorithms, which are key components of many data analysis and exploration systems. However, there exist only few…

Data Structures and Algorithms · Computer Science 2019-02-18 Artem Lutov, Mourad Khayati, Philippe Cudré-Mauroux

High-fidelity measurements are important for the physical implementation of quantum information protocols. Current methods for classifying measurement trajectories in superconducting qubit systems produce fidelities that are systematically…

Quantum Physics · Physics 2015-05-27 Easwar Magesan, Jay M. Gambetta, A. D. Córcoles, Jerry M. Chow

Given full or partial information about a collection of points that lie close to a union of several subspaces, subspace clustering refers to the process of clustering the points according to their subspace and identifying the subspaces. One…

Machine Learning · Statistics 2018-01-16 Zachary Charles, Amin Jalali, Rebecca Willett

Metric based comparison operations such as finding maximum, nearest and farthest neighbor are fundamental to studying various clustering techniques such as $k$-center clustering and agglomerative hierarchical clustering. These techniques…

Data Structures and Algorithms · Computer Science 2021-05-13 Raghavendra Addanki, Sainyam Galhotra, Barna Saha

In most practical problems of classifier learning, the training data suffers from the label noise. Hence, it is important to understand how robust is a learning algorithm to such label noise. This paper presents some theoretical analysis to…

Machine Learning · Computer Science 2016-08-29 Aritra Ghosh, Naresh Manwani, P. S. Sastry

In this work a robust clustering algorithm for stationary time series is proposed. The algorithm is based on the use of estimated spectral densities, which are considered as functional data, as the basic characteristic of stationary time…

Clustering algorithms aim to organize data into groups or clusters based on the inherent patterns and similarities within the data. They play an important role in today's life, such as in marketing and e-commerce, healthcare, data…

Machine Learning · Computer Science 2024-01-17 Hui Yin, Amir Aryani, Stephen Petrie, Aishwarya Nambissan, Aland Astudillo, Shengyuan Cao