Related papers: Fast Randomized Semi-Supervised Clustering

A semi-supervised sparse K-Means algorithm

We consider the problem of data clustering with unidentified feature quality and when a small amount of labelled data is provided. An unsupervised sparse clustering method can be employed in order to detect the subgroup of features…

Machine Learning · Computer Science 2020-10-20 Avgoustinos Vouros, Eleni Vasilaki

Graph-based Semi-supervised Local Clustering with Few Labeled Nodes

Local clustering aims at extracting a local structure inside a graph without the necessity of knowing the entire graph structure. As the local structure is usually small in size compared to the entire graph, one can think of it as a…

Machine Learning · Computer Science 2024-08-20 Zhaiming Shen, Ming-Jun Lai, Sheng Li

Semi-Supervised Clustering with Neural Networks

Clustering using neural networks has recently demonstrated promising performance in machine learning and computer vision applications. However, the performance of current approaches is limited either by unsupervised learning or their…

Machine Learning · Computer Science 2018-07-11 Ankita Shukla, Gullal Singh Cheema, Saket Anand

Semi-supervised Classification: Cluster and Label Approach using Particle Swarm Optimization

Classification predicts classes of objects using the knowledge learned during the training phase. This process requires learning from labeled samples. However, labeled samples are usually limited. The annotation process is annoying, tedious,…

Machine Learning · Computer Science 2017-06-06 Shahira Shaaban Azab, Mohamed Farouk Abdel Hady, Hesham Ahmed Hefny

Multi-objective Semi-supervised Clustering for Finding Predictive Clusters

This study concentrates on clustering problems and aims to find compact clusters that are informative regarding the outcome variable. The main goal is partitioning data points so that observations in each cluster are similar and the outcome…

Neural and Evolutionary Computing · Computer Science 2022-01-27 Zahra Ghasemi, Hadi Akbarzadeh Khorshidi, Uwe Aickelin

Active clustering for labeling training data

Gathering training data is a key step of any supervised learning task, and it is both critical and expensive. Critical, because the quantity and quality of the training data have a high impact on the performance of the learned function.…

Data Structures and Algorithms · Computer Science 2021-10-28 Quentin Lutz, Élie de Panafieu, Alex Scott, Maya Stein

Clustering from Sparse Pairwise Measurements

We consider the problem of grouping items into clusters based on few random pairwise comparisons between the items. We introduce three closely related algorithms for this task: a belief propagation algorithm approximating the Bayes optimal…

Social and Information Networks · Computer Science 2016-08-26 Alaa Saade, Marc Lelarge, Florent Krzakala, Lenka Zdeborová

Correlation Clustering with Low-Rank Matrices

Correlation clustering is a technique for aggregating data based on qualitative information about which pairs of objects are labeled 'similar' or 'dissimilar.' Because the optimization problem is NP-hard, much of the previous literature…

Machine Learning · Computer Science 2017-03-20 Nate Veldt, Anthony Wirth, David F. Gleich

Efficient Clustering with Limited Distance Information

Given a point set S and an unknown metric d on S, we study the problem of efficiently partitioning S into k clusters while querying few distances between the points. In our model we assume that we have access to one versus all queries that…

Data Structures and Algorithms · Computer Science 2011-05-10 Konstantin Voevodski, Maria-Florina Balcan, Heiko Roglin, Shang-Hua Teng, Yu Xia

Semidefinite programming on population clustering: a global analysis

In this paper, we consider the problem of partitioning a small data sample of size $n$ drawn from a mixture of $2$ sub-gaussian distributions. Our work is motivated by the application of clustering individuals according to their population…

Statistics Theory · Mathematics 2023-01-05 Shuheng Zhou

A parallel sampling based clustering

The problem of automatically clustering data is an age-old problem. People have created numerous algorithms to tackle this problem. The execution time of any of these algorithms grows with the number of input points and the number of cluster…

Machine Learning · Computer Science 2014-12-08 Aditya AV Sastry, Kalyan Netti

A Semi-Supervised Self-Organizing Map for Clustering and Classification

There has been increasing interest in semi-supervised learning in recent years because of the great number of datasets with a large amount of unlabeled data but only a few labeled samples. Semi-supervised learning algorithms can work…

Machine Learning · Computer Science 2020-03-26 Pedro H. M. Braga, Hansenclever F. Bassani

Fast model-based clustering of partial records

Partially recorded data are frequently encountered in many applications and usually clustered by first removing incomplete cases or features with missing values, or by imputing missing values, followed by application of a clustering…

Methodology · Statistics 2021-10-20 Emily M. Goren, Ranjan Maitra

Semi-Supervised and Active Few-Shot Learning with Prototypical Networks

We consider the problem of semi-supervised few-shot classification where a classifier needs to adapt to new tasks using a few labeled examples and (potentially many) unlabeled examples. We propose a clustering approach to the problem. The…

Machine Learning · Computer Science 2018-04-26 Rinu Boney, Alexander Ilin

Semi-supervised clustering methods

Cluster analysis methods seek to partition a data set into homogeneous subgroups. It is useful in a wide variety of applications, including document processing and modern genetics. Conventional clustering methods are unsupervised, meaning…

Methodology · Statistics 2014-07-11 Eric Bair

Semantic-Anchored, Class Variance-Optimized Clustering for Robust Semi-Supervised Few-Shot Learning

Few-shot learning has been extensively explored to address problems where the amount of labeled samples is very limited for some classes. In the semi-supervised few-shot learning setting, substantial quantities of unlabeled samples are…

Computer Vision and Pattern Recognition · Computer Science 2025-12-16 Souvik Maji, Rhythm Baghel, Pratik Mazumder

Efficient Clustering with Limited Distance Information

Given a point set S and an unknown metric d on S, we study the problem of efficiently partitioning S into k clusters while querying few distances between the points. In our model we assume that we have access to one versus all queries that…

Machine Learning · Computer Science 2014-08-12 Konstantin Voevodski, Maria-Florina Balcan, Heiko Roglin, Shang-Hua Teng, Yu Xia

Semi-supervised K-means++

Traditionally, practitioners initialize the k-means algorithm with centers chosen uniformly at random. Randomized initialization with uneven weights (k-means++) has recently been used to improve the performance over this…

Machine Learning · Statistics 2016-02-02 Jordan Yoder, Carey E. Priebe

Advancing Local Clustering on Graphs via Compressive Sensing: Semi-supervised and Unsupervised Methods

Local clustering aims to identify specific substructures within a large graph without any additional structural information of the graph. These substructures are typically small compared to the overall graph, enabling the problem to be…

Machine Learning · Computer Science 2025-10-31 Zhaiming Shen, Sung Ha Kang

Distributed Lance-William Clustering Algorithm

One important tool is the optimal clustering of data into useful categories. Dividing similar objects into a smaller number of clusters is of importance in many applications. These include search engines, monitoring of academic performance,…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-09-21 Gavriel Yarmish, Philip Listowsky, Simon Dexter