Related papers: A Model-based Semi-Supervised Clustering Methodolo…

Semi-supervised Learning for Discrete Choice Models

We introduce a semi-supervised discrete choice model to calibrate discrete choice models when relatively few requests have both choice sets and stated preferences but the majority only have the choice sets. Two classic semi-supervised…

Machine Learning · Statistics 2017-02-20 Jie Yang , Sergey Shebalov , Diego Klabjan

Bayesian Cluster Enumeration Criterion for Unsupervised Learning

We derive a new Bayesian Information Criterion (BIC) by formulating the problem of estimating the number of clusters in an observed data set as maximization of the posterior probability of the candidate models. Given that some mild…

Statistics Theory · Mathematics 2018-08-28 Freweyni K. Teklehaymanot , Michael Muma , Abdelhak M. Zoubir

Determinantal Clustering Processes - A Nonparametric Bayesian Approach to Kernel Based Semi-Supervised Clustering

Semi-supervised clustering is the task of clustering data points into clusters where only a fraction of the points are labelled. The true number of clusters in the data is often unknown and most models require this parameter as an input.…

Machine Learning · Computer Science 2013-09-27 Amar Shah , Zoubin Ghahramani

Semi-supervised cross-entropy clustering with information bottleneck constraint

In this paper, we propose a semi-supervised clustering method, CEC-IB, that models data with a set of Gaussian distributions and that retrieves clusters based on a partial labeling provided by the user (partition-level side information). By…

Machine Learning · Computer Science 2017-11-15 Marek Śmieja , Bernhard C. Geiger

Bayesian Bi-clustering Methods with Applications in Computational Biology

Bi-clustering is a useful approach in analyzing biological data when observations come from heterogeneous groups and have a large number of features. We outline a general Bayesian approach in tackling bi-clustering problems in moderate to…

Applications · Statistics 2021-02-11 Han Yan , Jiexing Wu , Yang Li , Jun S. Liu

Semi-Supervised and Active Few-Shot Learning with Prototypical Networks

We consider the problem of semi-supervised few-shot classification where a classifier needs to adapt to new tasks using a few labeled examples and (potentially many) unlabeled examples. We propose a clustering approach to the problem. The…

Machine Learning · Computer Science 2018-04-26 Rinu Boney , Alexander Ilin

Bayesian Semi-supervised Multi-category Classification under Nonparanormality

Semi-supervised learning is a model training method that uses both labeled and unlabeled data. This paper proposes a fully Bayes semi-supervised learning algorithm that can be applied to any multi-category classification problem. We assume…

Machine Learning · Statistics 2024-07-22 Rui Zhu , Shuvrarghya Ghosh , Subhashis Ghosal

A model selection approach for clustering a multinomial sequence with non-negative factorization

We consider a problem of clustering a sequence of multinomial observations by way of a model selection criterion. We propose a form of a penalty term for the model selection procedure. Our approach subsumes both the conventional AIC and BIC…

Machine Learning · Statistics 2015-08-17 Nam H. Lee , Runze Tang , Carey E. Priebe , Michael Rosen

Semi-supervised clustering methods

Cluster analysis methods seek to partition a data set into homogeneous subgroups. It is useful in a wide variety of applications, including document processing and modern genetics. Conventional clustering methods are unsupervised, meaning…

Methodology · Statistics 2014-07-11 Eric Bair

Robust Bayesian Model Selection for Variable Clustering with the Gaussian Graphical Model

Variable clustering is important for explanatory analysis. However, only few dedicated methods for variable clustering with the Gaussian graphical model have been proposed. Even more severe, small insignificant partial correlations due to…

Applications · Statistics 2018-06-18 Daniel Andrade , Akiko Takeda , Kenji Fukumizu

Adaptive Semisupervised Inference

Semisupervised methods inevitably invoke some assumption that links the marginal distribution of the features to the regression function of the label. Most commonly, the cluster or manifold assumptions are used which imply that the…

Statistics Theory · Mathematics 2011-12-02 Martin Azizyan , Aarti Singh , Larry Wasserman

Towards Bayesian Data Selection

A wide range of machine learning algorithms iteratively add data to the training sample. Examples include semi-supervised learning, active learning, multi-armed bandits, and Bayesian optimization. We embed this kind of data addition into…

Machine Learning · Statistics 2024-06-25 Julian Rodemann

A Privacy-Aware Bayesian Approach for Combining Classifier and Cluster Ensembles

This paper introduces a privacy-aware Bayesian approach that combines ensembles of classifiers and clusterers to perform semi-supervised and transductive learning. We consider scenarios where instances and their classification/clustering…

Machine Learning · Computer Science 2012-04-23 Ayan Acharya , Eduardo R. Hruschka , Joydeep Ghosh

Semi-Supervised Information-Maximization Clustering

Semi-supervised clustering aims to introduce prior knowledge in the decision process of a clustering algorithm. In this paper, we propose a novel semi-supervised clustering algorithm based on the information-maximization principle. The…

Machine Learning · Computer Science 2013-05-02 Daniele Calandriello , Gang Niu , Masashi Sugiyama

A Bayesian Approach to Restricted Latent Class Models for Scientifically-Structured Clustering of Multivariate Binary Outcomes

In this paper, we propose a general framework for combining evidence of varying quality to estimate underlying binary latent variables in the presence of restrictions imposed to respect the scientific context. The resulting algorithms…

Methodology · Statistics 2018-08-28 Zhenke Wu , Livia Casciola-Rosen , Antony Rosen , Scott L. Zeger

Clustering Multivariate Data using Factor Analytic Bayesian Mixtures with an Unknown Number of Components

Recent work on overfitting Bayesian mixtures of distributions offers a powerful framework for clustering multivariate data using a latent Gaussian model which resembles the factor analysis model. The flexibility provided by overfitting…

Methodology · Statistics 2019-08-29 Panagiotis Papastamoulis

Multi-objective Semi-supervised Clustering for Finding Predictive Clusters

This study concentrates on clustering problems and aims to find compact clusters that are informative regarding the outcome variable. The main goal is partitioning data points so that observations in each cluster are similar and the outcome…

Neural and Evolutionary Computing · Computer Science 2022-01-27 Zahra Ghasemi , Hadi Akbarzadeh Khorshidi , Uwe Aickelin

Constraint-Based Clustering Selection

Semi-supervised clustering methods incorporate a limited amount of supervision into the clustering process. Typically, this supervision is provided by the user in the form of pairwise constraints. Existing methods use such constraints in…

Machine Learning · Statistics 2016-09-26 Toon Van Craenendonck , Hendrik Blockeel

Probabilistic Combination of Classifier and Cluster Ensembles for Non-transductive Learning

Unsupervised models can provide supplementary soft constraints to help classify new target data under the assumption that similar objects in the target set are more likely to share the same class label. Such models can also help detect…

Machine Learning · Computer Science 2015-03-13 Ayan Acharya , Eduardo R. Hruschka , Joydeep Ghosh , Badrul Sarwar , Jean-David Ruvini

Active Clustering with Model-Based Uncertainty Reduction

Semi-supervised clustering seeks to augment traditional clustering methods by incorporating side information provided via human expertise in order to increase the semantic meaningfulness of the resulting clusters. However, most current…

Machine Learning · Computer Science 2014-02-17 Caiming Xiong , David Johnson , Jason J. Corso