Related papers: PriorBoost: An Adaptive Algorithm for Learning fro…

Learning from Aggregated Data: Curated Bags versus Random Bags

Protecting user privacy is a major concern for many machine learning systems that are deployed at scale and collect from a diverse set of population. One way to address this concern is by collecting and releasing data labels in an…

Machine Learning · Computer Science 2023-05-19 Lin Chen, Gang Fu, Amin Karbasi, Vahab Mirrokni

Learning from Aggregate responses: Instance Level versus Bag Level Loss Functions

Due to the rise of privacy concerns, in many practical applications the training data is aggregated before being shared with the learner, in order to protect privacy of users' sensitive responses. In an aggregate learning framework, the…

Machine Learning · Computer Science 2024-01-23 Adel Javanmard, Lin Chen, Vahab Mirrokni, Ashwinkumar Badanidiyuru, Gang Fu

Aggregating Data for Optimal and Private Learning

Multiple Instance Regression (MIR) and Learning from Label Proportions (LLP) are learning frameworks arising in many applications, where the training data is partitioned into disjoint sets or bags, and only an aggregate label i.e.,…

Machine Learning · Computer Science 2024-12-02 Sushant Agarwal, Yukti Makhija, Rishi Saket, Aravindan Raghuveer

LoBoost: Fast Model-Native Local Conformal Prediction for Gradient-Boosted Trees

Gradient-boosted decision trees are among the strongest off-the-shelf predictors for tabular regression, but point predictions alone do not quantify uncertainty. Conformal prediction provides distribution-free marginal coverage, yet split…

Machine Learning · Statistics 2026-02-27 Vagner Santos, Victor Coscrato, Luben Cabezas, Rafael Izbicki, Thiago Ramos

Evolutionary bagging for ensemble learning

Ensemble learning has gained success in machine learning with major advantages over other learning methods. Bagging is a prominent ensemble learning method that creates subgroups of data, known as bags, that are trained by individual…

Neural and Evolutionary Computing · Computer Science 2022-09-07 Giang Ngo, Rodney Beard, Rohitash Chandra

Gradient Boosting for Linear Mixed Models

Gradient boosting from the field of statistical learning is widely known as a powerful framework for estimation and selection of predictor effects in various regression models by adapting concepts from classification theory. Current…

Methodology · Statistics 2020-11-03 Colin Griesbach, Benjamin Säfken, Elisabeth Waldmann

PaloBoost: An Overfitting-robust TreeBoost with Out-of-Bag Sample Regularization Techniques

Stochastic Gradient TreeBoost is often found in many winning solutions in public data science challenges. Unfortunately, the best performance requires extensive parameter tuning and can be prone to overfitting. We propose PaloBoost, a…

Machine Learning · Statistics 2018-07-24 Yubin Park, Joyce C. Ho

An Aggregate and Iterative Disaggregate Algorithm with Proven Optimality in Machine Learning

We propose a clustering-based iterative algorithm to solve certain optimization problems in machine learning, where we start the algorithm by aggregating the original data, solving the problem on aggregated data, and then in subsequent…

Machine Learning · Statistics 2017-01-23 Young Woong Park, Diego Klabjan

MixBag: Bag-Level Data Augmentation for Learning from Label Proportions

Learning from label proportions (LLP) is a promising weakly supervised learning problem. In LLP, a set of instances (bag) has label proportions, but no instance-level labels are given. LLP aims to train an instance-level classifier by using…

Computer Vision and Pattern Recognition · Computer Science 2023-08-21 Takanori Asanomi, Shinnosuke Matsuo, Daiki Suehiro, Ryoma Bise

Getting Better from Worse: Augmented Bagging and a Cautionary Tale of Variable Importance

As the size, complexity, and availability of data continues to grow, scientists are increasingly relying upon black-box learning algorithms that can often provide accurate predictions with minimal a priori model specifications. Tools like…

Machine Learning · Statistics 2020-11-10 Lucas Mentch, Siyu Zhou

Fast learning from label proportions with small bags

In learning from label proportions (LLP), the instances are grouped into bags, and the task is to learn an instance classifier given relative class proportions in training bags. LLP is useful when obtaining individual instance labels is…

Machine Learning · Computer Science 2022-11-01 Denis Baručić, Jan Kybic

Nearly Optimal Sample Complexity for Learning with Label Proportions

We investigate Learning from Label Proportions (LLP), a partial information setting where examples in a training set are grouped into bags, and only aggregate label values in each bag are available. Despite the partial observability, the…

Machine Learning · Computer Science 2025-06-02 Robert Busa-Fekete, Travis Dick, Claudio Gentile, Haim Kaplan, Tomer Koren, Uri Stemmer

A Generative Bayesian Model for Aggregating Experts' Probabilities

In order to improve forecasts, a decisionmaker often combines probabilities given by various sources, such as human experts and machine learning classifiers. When few training data are available, aggregation can be improved by incorporating…

Machine Learning · Computer Science 2012-07-19 Joseph Kahn

Progressive Boosting for Class Imbalance

Pattern recognition applications often suffer from skewed data distributions between classes, which may vary during operations w.r.t. the design data. Two-class classification systems designed using skewed data tend to recognize the…

Machine Learning · Computer Science 2019-12-02 Roghayeh Soleymani, Eric Granger, Giorgio Fumera

A naive aggregation algorithm for improving generalization in a class of learning problems

In this brief paper, we present a naive aggregation algorithm for a typical learning problem with expert advice setting, in which the task of improving generalization, i.e., model validation, is embedded in the learning process as a…

Machine Learning · Computer Science 2024-09-09 Getachew K Befekadu

Easy Learning from Label Proportions

We consider the problem of Learning from Label Proportions (LLP), a weakly supervised classification setup where instances are grouped into "bags", and only the frequency of class labels at each bag is available. Albeit, the objective of…

Machine Learning · Computer Science 2023-02-15 Robert Istvan Busa-Fekete, Heejin Choi, Travis Dick, Claudio Gentile, Andres Munoz Medina

Towards Robust Continual Learning with Bayesian Adaptive Moment Regularization

The pursuit of long-term autonomy mandates that machine learning models must continuously adapt to their changing environments and learn to solve new tasks. Continual learning seeks to overcome the challenge of catastrophic forgetting,…

Machine Learning · Computer Science 2024-07-25 Jack Foster, Alexandra Brintrup

Sample-Efficient Optimization over Generative Priors via Coarse Learnability

We study zeroth-order optimization where solutions must minimize a cost $d(s)$ while maintaining high probability under a complex generative prior $L(s)$ (e.g., a parameterized model). This reduces to sampling from a target distribution…

Machine Learning · Computer Science 2026-05-06 Pranjal Awasthi, Sreenivas Gollapudi, Ravi Kumar, Kamesh Munagala

Self-Boost via Optimal Retraining: An Analysis via Approximate Message Passing

Retraining a model using its own predictions together with the original, potentially noisy labels is a well-known strategy for improving the model performance. While prior works have demonstrated the benefits of specific heuristic…

Machine Learning · Computer Science 2025-05-22 Adel Javanmard, Rudrajit Das, Alessandro Epasto, Vahab Mirrokni

A Corrective Training Algorithm for Adaptive Learning in Bag Generation

The sampling problem in training corpus is one of the major sources of errors in corpus-based applications. This paper proposes a corrective training algorithm to best-fit the run-time context domain in the application of bag generation. It…

cmp-lg · Computer Science 2008-02-03 Hsin-Hsi Chen, Yue-Shi Lee