Related papers: Relabelling Algorithms for Large Dataset Mixture M…

Relabelling in Bayesian mixture models by pivotal units

In this paper a simple procedure to deal with label switching when exploring complex posterior distributions by MCMC algorithms is proposed. Although it cannot be generalized to any situation, it may be handy in many applications because of…

Computation · Statistics 2016-09-14 Leonardo Egidi, Roberta Pappadà, Francesco Pauli, Nicola Torelli

label.switching: An R Package for Dealing with the Label Switching Problem in MCMC Outputs

Label switching is a well-known and fundamental problem in Bayesian estimation of mixture or hidden Markov models. In case that the prior distribution of the model parameters is the same for all states, then both the likelihood and…

Computation · Statistics 2016-03-07 Panagiotis Papastamoulis

Mixture models applied to heterogeneous populations

Mixture models provide a flexible representation of heterogeneity in a finite number of latent classes. From the Bayesian point of view, Markov Chain Monte Carlo methods provide a way to draw inferences from these models. In particular,…

Methodology · Statistics 2020-05-06 Carolina Valani Cavalcante, Kelly Cristina Mota Gonçalves

Anchored Bayesian Gaussian Mixture Models

Finite mixtures are a flexible modeling tool for irregularly shaped densities and samples from heterogeneous populations. When modeling with mixtures using an exchangeable prior on the component features, the component labels are arbitrary…

Methodology · Statistics 2020-07-10 Deborah Kunkel, Mario Peruggia

Neural Clustering Processes

Probabilistic clustering models (or equivalently, mixture models) are basic building blocks in countless statistical models and involve latent random variables over discrete spaces. For these models, posterior inference methods can be…

Machine Learning · Statistics 2020-06-24 Ari Pakman, Yueqi Wang, Catalin Mitelut, JinHyung Lee, Liam Paninski

Overfitting Bayesian Mixtures of Factor Analyzers with an Unknown Number of Components

Recent advances on overfitting Bayesian mixture models provide a solid and straightforward approach for inferring the underlying number of clusters and model parameters in heterogeneous datasets. The applicability of such a framework in…

Methodology · Statistics 2018-03-29 Panagiotis Papastamoulis

A Statistical Approach to Increase Classification Accuracy in Supervised Learning Algorithms

Probabilistic mixture models have been widely used for different machine learning and pattern recognition tasks such as clustering, dimensionality reduction, and classification. In this paper, we focus on trying to solve the most common…

Machine Learning · Computer Science 2020-04-08 Gustavo A Valencia-Zapata, Daniel Mejia, Gerhard Klimeck, Michael Zentner, Okan Ersoy

Balancing Methods for Multi-label Text Classification with Long-Tailed Class Distribution

Multi-label text classification is a challenging task because it requires capturing label dependencies. It becomes even more challenging when class distribution is long-tailed. Resampling and re-weighting are common approaches used for…

Computation and Language · Computer Science 2021-10-19 Yi Huang, Buse Giledereli, Abdullatif Köksal, Arzucan Özgür, Elif Ozkirimli

Overfitting Bayesian Mixture Models with an Unknown Number of Components

This paper proposes solutions to three issues pertaining to the estimation of finite mixture models with an unknown number of components: the non-identifiability induced by overfitting the number of components, the mixing limitations of…

Methodology · Statistics 2015-08-25 Zoe van Havre, Nicole White, Judith Rousseau, Kerrie Mengersen

mldr.resampling: Efficient Reference Implementations of Multilabel Resampling Algorithms

Resampling algorithms are a useful approach to deal with imbalanced learning in multilabel scenarios. These methods have to deal with singularities in the multilabel data, such as the occurrence of frequent and infrequent labels in the same…

Machine Learning · Computer Science 2025-01-22 Antonio J. Rivera, Miguel A. Dávila, David Elizondo, María J. del Jesus, Francisco Charte

Label Switching Problem in Bayesian Analysis for Gravitational Wave Astronomy

The label switching problem arises in the Bayesian analysis of models containing multiple indistinguishable parameters with arbitrary ordering. Any permutation of these parameters is equivalent, therefore models with many such parameters…

Instrumentation and Methods for Astrophysics · Physics 2019-10-30 Riccardo Buscicchio, Elinore Roebber, Janna M. Goldstein, Christopher J. Moore

A Non-Iterative Quantile Change Detection Method in Mixture Model with Heavy-Tailed Components

Estimating parameters of mixture model has wide applications ranging from classification problems to estimating of complex distributions. Most of the current literature on estimating the parameters of the mixture densities are based on…

Machine Learning · Statistics 2020-06-23 Yuantong Li, Qi Ma, Sujit K. Ghosh

Towards Label Imbalance in Multi-label Classification with Many Labels

In multi-label classification, an instance may be associated with a set of labels simultaneously. Recently, the research on multi-label classification has largely shifted its focus to the other end of the spectrum where the number of labels…

Machine Learning · Computer Science 2016-04-06 Li Li, Houfeng Wang

Regression with Label Permutation in Generalized Linear Model

The assumption that response and predictor belong to the same statistical unit may be violated in practice. Unbiased estimation and recovery of true label ordering based on unlabeled data are challenging tasks and have attracted increasing…

Methodology · Statistics 2022-06-24 Guanhua Fang, Ping Li

Posterior Re-calibration for Imbalanced Datasets

Neural Networks can perform poorly when the training label distribution is heavily imbalanced, as well as when the testing data differs from the training distribution. In order to deal with shift in the testing label distribution, which…

Machine Learning · Computer Science 2020-10-23 Junjiao Tian, Yen-Cheng Liu, Nathan Glaser, Yen-Chang Hsu, Zsolt Kira

Dealing with Difficult Minority Labels in Imbalanced Mutilabel Data Sets

Multilabel classification is an emergent data mining task with a broad range of real world applications. Learning from imbalanced multilabel data is being deeply studied latterly, and several resampling methods have been proposed in the…

Machine Learning · Computer Science 2018-02-15 Francisco Charte, Antonio J. Rivera, María J. del Jesus, Francisco Herrera

Tackling Multilabel Imbalance through Label Decoupling and Data Resampling Hybridization

The learning from imbalanced data is a deeply studied problem in standard classification and, in recent times, also in multilabel classification. A handful of multilabel resampling methods have been proposed in late years, aiming to balance…

Machine Learning · Computer Science 2018-02-15 Francisco Charte, Antonio J. Rivera, María J. del Jesus, Francisco Herrera

GenLabel: Mixup Relabeling using Generative Models

Mixup is a data augmentation method that generates new data points by mixing a pair of input data. While mixup generally improves the prediction performance, it sometimes degrades the performance. In this paper, we first identify the main…

Machine Learning · Computer Science 2022-01-10 Jy-yong Sohn, Liang Shang, Hongxu Chen, Jaekyun Moon, Dimitris Papailiopoulos, Kangwook Lee

Large Language Models Do Multi-Label Classification Differently

Multi-label classification is prevalent in real-world settings, but the behavior of Large Language Models (LLMs) in this setting is understudied. We investigate how autoregressive LLMs perform multi-label classification, focusing on…

Computation and Language · Computer Science 2025-11-12 Marcus Ma, Georgios Chochlakis, Niyantha Maruthu Pandiyan, Jesse Thomason, Shrikanth Narayanan

Linear Regression with Shuffled Labels

Is it possible to perform linear regression on datasets whose labels are shuffled with respect to the inputs? We explore this question by proposing several estimators that recover the weights of a noisy linear model from labels that are…

Machine Learning · Statistics 2017-05-05 Abubakar Abid, Ada Poon, James Zou