Related papers: A Statistical Approach to Increase Classification …

On Learning Prediction-Focused Mixtures

Probabilistic models help us encode latent structures that both model the data and are ideally also useful for specific downstream tasks. Among these, mixture models and their time-series counterparts, hidden Markov models, identify…

Machine Learning · Computer Science 2021-10-29 Abhishek Sharma, Catherine Zeng, Sanjana Narayanan, Sonali Parbhoo, Finale Doshi-Velez

Unsupervised Learning via Mixtures of Skewed Distributions with Hypercube Contours

Mixture models whose components have skewed hypercube contours are developed via a generalization of the multivariate shifted asymmetric Laplace density. Specifically, we develop mixtures of multiple scaled shifted asymmetric Laplace…

Methodology · Statistics 2023-03-28 Brian C. Franczak, Cristina Tortora, Ryan P. Browne, Paul D. McNicholas

Mixtures Closest to a Given Measure: A Semidefinite Programming Approach

Mixture models, such as Gaussian mixture models, are widely used in machine learning to represent complex data distributions. A key challenge, especially in high-dimensional settings, is to determine the mixture order and estimate the…

Optimization and Control · Mathematics 2025-09-30 Srećko Đurašinović, Jean-Bernard Lasserre, Victor Magron

Probabilistic Diagnostic Tests for Degradation Problems in Supervised Learning

Several studies point out different causes of performance degradation in supervised machine learning. Problems such as class imbalance, overlapping, small-disjuncts, noisy labels, and sparseness limit accuracy in classification algorithms.…

Machine Learning · Computer Science 2020-04-17 Gustavo A. Valencia-Zapata, Carolina Gonzalez-Canas, Michael G. Zentner, Okan Ersoy, Gerhard Klimeck

Optimization for Supervised Machine Learning: Randomized Algorithms for Data and Parameters

Many key problems in machine learning and data science are routinely modeled as optimization problems and solved via optimization algorithms. With the increase of the volume of data and the size and complexity of the statistical models used…

Optimization and Control · Mathematics 2020-08-28 Filip Hanzely

Semi-supervised Logistic Learning Based on Exponential Tilt Mixture Models

Consider semi-supervised learning for classification, where both labeled and unlabeled data are available for training. The goal is to exploit both datasets to achieve higher prediction accuracy than just using labeled data alone. We…

Machine Learning · Statistics 2019-06-20 Xinwei Zhang, Zhiqiang Tan

Mixup Regularization: A Probabilistic Perspective

In recent years, mixup regularization has gained popularity as an effective way to improve the generalization performance of deep learning models by training on convex combinations of training data. While many mixup variants have been…

Machine Learning · Computer Science 2025-06-16 Yousef El-Laham, Niccolò Dalmasso, Svitlana Vyetrenko, Vamsi K. Potluru, Manuela Veloso

Flexibly Regularized Mixture Models and Application to Image Segmentation

Probabilistic finite mixture models are widely used for unsupervised clustering. These models can often be improved by adapting them to the topology of the data. For instance, in order to classify spatially adjacent data points similarly,…

Computer Vision and Pattern Recognition · Computer Science 2022-02-09 Jonathan Vacher, Claire Launay, Ruben Coen-Cagli

Semi-supervised Deep Learning for Image Classification with Distribution Mismatch: A Survey

Deep learning methodologies have been employed in several different fields, with an outstanding success in image recognition applications, such as material quality control, medical imaging, autonomous driving, etc. Deep learning models rely…

Computer Vision and Pattern Recognition · Computer Science 2022-03-11 Saul Calderon-Ramirez, Shengxiang Yang, David Elizondo

Relabelling Algorithms for Large Dataset Mixture Models

Mixture models are flexible tools in density estimation and classification problems. Bayesian estimation of such models typically relies on sampling from the posterior distribution using Markov chain Monte Carlo. Label switching arises…

Applications · Statistics 2014-03-11 Wanchuang Zhu, Yanan Fan

Predictive Multiplicity in Probabilistic Classification

Machine learning models are often used to inform real world risk assessment tasks: predicting consumer default risk, predicting whether a person suffers from a serious illness, or predicting a person's risk to appear in court. Given…

Machine Learning · Computer Science 2023-06-27 Jamelle Watson-Daniels, David C. Parkes, Berk Ustun

Quantifying Intrinsic Uncertainty in Classification via Deep Dirichlet Mixture Networks

With the widespread success of deep neural networks in science and technology, it is becoming increasingly important to quantify the uncertainty of the predictions produced by deep learning. In this paper, we introduce a new method that…

Machine Learning · Computer Science 2019-08-15 Qingyang Wu, He Li, Lexin Li, Zhou Yu

Neural Clustering Processes

Probabilistic clustering models (or equivalently, mixture models) are basic building blocks in countless statistical models and involve latent random variables over discrete spaces. For these models, posterior inference methods can be…

Machine Learning · Statistics 2020-06-24 Ari Pakman, Yueqi Wang, Catalin Mitelut, JinHyung Lee, Liam Paninski

Distributed Parameter Estimation via Pseudo-likelihood

Estimating statistical models within sensor networks requires distributed algorithms, in which both data and computation are distributed across the nodes of the network. We propose a general approach for distributed learning based on…

Machine Learning · Computer Science 2012-07-03 Qiang Liu, Alexander Ihler

A Self-Adaptive Synthetic Over-Sampling Technique for Imbalanced Classification

Traditionally, in supervised machine learning, (a significant) part of the available data (usually 50% to 80%) is used for training and the rest for validation. In many problems, however, the data is highly imbalanced in regard to different…

Machine Learning · Computer Science 2020-04-21 Xiaowei Gu, Plamen P Angelov, Eduardo Almeida Soares

Selective Mixup Helps with Distribution Shifts, But Not (Only) because of Mixup

Mixup is a highly successful technique to improve generalization of neural networks by augmenting the training data with combinations of random pairs. Selective mixup is a family of methods that apply mixup to specific pairs, e.g. only…

Machine Learning · Computer Science 2023-06-06 Damien Teney, Jindong Wang, Ehsan Abbasnejad

Mixture Models and Networks -- Overview of Stochastic Blockmodelling

Mixture models are probabilistic models aimed at uncovering and representing latent subgroups within a population. In the realm of network data analysis, the latent subgroups of nodes are typically identified by their connectivity…

Methodology · Statistics 2020-05-27 Giacomo De Nicola, Benjamin Sischka, Göran Kauermann

Variable Selection for Clustering and Classification

As data sets continue to grow in size and complexity, effective and efficient techniques are needed to target important features in the variable space. Many of the variable selection techniques that are commonly used alongside clustering…

Computation · Statistics 2013-03-22 Jeffrey L. Andrews, Paul D. McNicholas

Learning with Clustering Structure

We study supervised learning problems using clustering constraints to impose structure on either features or samples, seeking to help both prediction and interpretation. The problem of clustering features arises naturally in text…

Machine Learning · Computer Science 2016-09-20 Vincent Roulet, Fajwel Fogel, Alexandre d'Aspremont, Francis Bach

Learning Mixtures of Separable Dictionaries for Tensor Data: Analysis and Algorithms

This work addresses the problem of learning sparse representations of tensor data using structured dictionary learning. It proposes learning a mixture of separable dictionaries to better capture the structure of tensor data by generalizing…

Machine Learning · Computer Science 2020-06-16 Mohsen Ghassemi, Zahra Shakeri, Anand D. Sarwate, Waheed U. Bajwa