Related papers: Multi-objective Semi-supervised Clustering for Fin…

Semi-supervised clustering methods

Cluster analysis methods seek to partition a data set into homogeneous subgroups. They are useful in a wide variety of applications, including document processing and modern genetics. Conventional clustering methods are unsupervised, meaning…

Methodology · Statistics 2014-07-11 Eric Bair

Predictive K-means with local models

Supervised classification can be effective for prediction but sometimes weak on interpretability or explainability (XAI). Clustering, on the other hand, tends to isolate categories or profiles that can be meaningful but there is no…

Machine Learning · Computer Science 2021-04-27 Vincent Lemaire, Oumaima Alaoui Ismaili, Antoine Cornuéjols, Dominique Gay

Fast Randomized Semi-Supervised Clustering

We consider the problem of clustering partially labeled data from a minimal number of randomly chosen pairwise comparisons between the items. We introduce an efficient local algorithm based on a power iteration of the non-backtracking…

Machine Learning · Computer Science 2018-06-28 Alaa Saade, Florent Krzakala, Marc Lelarge, Lenka Zdeborová

A Generalized Framework for Predictive Clustering and Optimization

Clustering is a powerful and extensively used data science tool. While clustering is generally thought of as an unsupervised learning technique, there are also supervised variations such as Spath's clusterwise regression that attempt to…

Machine Learning · Computer Science 2023-05-09 Aravinth Chembu, Scott Sanner

Semi-Supervised Classification and Clustering Analysis for Variable Stars

The immense amount of time series data produced by astronomical surveys has called for the use of machine learning algorithms to discover and classify several million celestial sources. In the case of variable stars, supervised learning…

Solar and Stellar Astrophysics · Physics 2022-10-12 R. Pantoja, M. Catelan, K. Pichara, P. Protopapas

Simultaneous semi-parametric estimation of clustering and regression

We investigate the parameter estimation of regression models with fixed group effects, when the group variable is missing while group related variables are available. This problem involves clustering to infer the missing group variable…

Methodology · Statistics 2020-12-29 Matthieu Marbac, Mohammed Sedki, Christophe Biernacki, Vincent Vandewalle

Unsupervised Learning in a General Semiparametric Clusterwise Index Distribution Model

This study introduces a general semiparametric clusterwise index distribution model to analyze how latent clusters affect the covariate-response relationships. By employing sufficient dimension reduction to account for the effects of…

Methodology · Statistics 2025-09-30 Jen-Chieh Teng, Chin-Tsang Chiang

Bi-objective Optimization of Biclustering with Binary Data

Clustering consists of partitioning data objects into subsets called clusters according to some similarity criteria. This paper addresses a generalization called quasi-clustering that allows overlapping of clusters, and which we link to…

Artificial Intelligence · Computer Science 2020-02-13 Fred Glover, Said Hanafi, Gintaras Palubeckis

Semi-supervised Clustering Ensemble by Voting

Clustering ensemble is one of the most recent advances in unsupervised learning. It aims to combine the clustering results obtained using different algorithms or from different runs of the same clustering algorithm for the same data set,…

Machine Learning · Computer Science 2012-08-22 Ashraf Mohammed Iqbal, Abidalrahman Moh'd, Zahoor Khan

Learning with Clustering Structure

We study supervised learning problems using clustering constraints to impose structure on either features or samples, seeking to help both prediction and interpretation. The problem of clustering features arises naturally in text…

Machine Learning · Computer Science 2016-09-20 Vincent Roulet, Fajwel Fogel, Alexandre d'Aspremont, Francis Bach

Semi-Supervised Constrained Clustering: An In-Depth Overview, Ranked Taxonomy and Future Research Directions

Clustering is a well-known unsupervised machine learning approach capable of automatically grouping discrete sets of instances with similar characteristics. Constrained clustering is a semi-supervised extension to this process that can be…

Machine Learning · Computer Science 2023-03-02 Germán González-Almagro, Daniel Peralta, Eli De Poorter, José-Ramón Cano, Salvador García

A Model-based Semi-Supervised Clustering Methodology

We consider an extension of model-based clustering to the semi-supervised case, where some of the data are pre-labeled. We provide a derivation of the Bayesian Information Criterion (BIC) approximation to the Bayes factor in this setting.…

Methodology · Statistics 2016-04-28 Jordan Yoder, Carey E. Priebe

Semi-supervised Learning for Discrete Choice Models

We introduce a semi-supervised discrete choice model to calibrate discrete choice models when relatively few requests have both choice sets and stated preferences but the majority only have the choice sets. Two classic semi-supervised…

Machine Learning · Statistics 2017-02-20 Jie Yang, Sergey Shebalov, Diego Klabjan

Adaptive Clustering through Semidefinite Programming

We analyze the clustering problem through a flexible probabilistic model that aims to identify an optimal partition on the sample $X_1, \ldots, X_n$. We perform exact clustering with high probability using a convex semidefinite estimator that…

Statistics Theory · Mathematics 2017-05-19 Martin Royer

Deep Goal-Oriented Clustering

Clustering and prediction are two primary tasks in the fields of unsupervised and supervised learning, respectively. Although many of the recent advances in machine learning have been centered around those two tasks, the interdependent,…

Machine Learning · Computer Science 2020-06-17 Yifeng Shi, Christopher M. Bender, Junier B. Oliva, Marc Niethammer

Large Margin Semi-supervised Structured Output Learning

In structured output learning, obtaining labelled data for real-world applications is usually costly, while unlabelled examples are available in abundance. Semi-supervised structured classification has been developed to handle large amounts…

Machine Learning · Computer Science 2013-11-12 P. Balamurugan, Shirish Shevade, Sundararajan Sellamanickam

Near-Optimal Comparison Based Clustering

The goal of clustering is to group similar objects into meaningful partitions. This process is well understood when an explicit similarity measure between the objects is given. However, far less is known when this information is not readily…

Machine Learning · Computer Science 2020-10-12 Michaël Perrot, Pascal Mattia Esser, Debarghya Ghoshdastidar

Multilinear objective function-based clustering

The input of most clustering algorithms is a symmetric matrix quantifying similarity within data pairs. Such a matrix is here turned into a quadratic set function measuring cluster score or similarity within data subsets larger than pairs.…

Discrete Mathematics · Computer Science 2015-09-30 Giovanni Rossi

Semidefinite programming on population clustering: a global analysis

In this paper, we consider the problem of partitioning a small data sample of size $n$ drawn from a mixture of $2$ sub-gaussian distributions. Our work is motivated by the application of clustering individuals according to their population…

Statistics Theory · Mathematics 2023-01-05 Shuheng Zhou

Large Scale Correlation Clustering Optimization

Clustering is a fundamental task in unsupervised learning. The focus of this paper is the Correlation Clustering functional, which combines positive and negative affinities between the data points. The contribution of this paper is twofold:…

Computer Vision and Pattern Recognition · Computer Science 2011-12-14 Shai Bagon, Meirav Galun