Related papers: Approximability and Generalisation

Proper Learnability and the Role of Unlabeled Data

Proper learning refers to the setting in which learners must emit predictors in the underlying hypothesis class $H$, and often leads to learners with simple algorithmic forms (e.g. empirical risk minimization (ERM), structural risk…

Machine Learning · Computer Science 2025-12-10 Julian Asilis, Siddartha Devic, Shaddin Dughmi, Vatsal Sharan, Shang-Hua Teng

Conservative classifiers do consistently well with improving agents: characterizing statistical and online learning

Machine learning is now ubiquitous in societal decision-making, for example in evaluating job candidates or loan applications, and it is increasingly important to take into account how classified agents will react to the learning…

Machine Learning · Computer Science 2025-08-08 Dravyansh Sharma, Alec Sun

A PAC-Bayesian Link Between Generalisation and Flat Minima

Modern machine learning usually involves predictors in the overparameterised setting (number of trained parameters greater than dataset size), and their training yields not only good performance on training data, but also good…

Machine Learning · Statistics 2025-02-12 Maxime Haddouche, Paul Viallard, Umut Simsekli, Benjamin Guedj

Probably Approximately Precision and Recall Learning

Precision and Recall are fundamental metrics in machine learning tasks where both accurate predictions and comprehensive coverage are essential, such as in multi-label learning, language generation, medical studies, and recommender systems.…

Machine Learning · Computer Science 2025-10-27 Lee Cohen, Yishay Mansour, Shay Moran, Han Shao

Personalized Prediction By Learning Halfspace Reference Classes Under Well-Behaved Distribution

In machine learning applications, predictive models are trained to serve future queries across the entire data distribution. Real-world data often demands excessively complex models to achieve competitive performance, however, sacrificing…

Machine Learning · Computer Science 2025-09-22 Jizhou Huang, Brendan Juba

Principled Approximation Methods for Efficient and Scalable Deep Learning

Recent progress in deep learning has been driven by increasingly larger models. However, their computational and energy demands have grown proportionally, creating significant barriers to their deployment and to a wider adoption of deep…

Machine Learning · Computer Science 2025-09-16 Pedro Savarese

Probably Approximately Correct Constrained Learning

As learning solutions reach critical applications in social, industrial, and medical domains, the need to curtail their behavior has become paramount. There is now ample evidence that without explicit tailoring, learning can lead to biased,…

Machine Learning · Computer Science 2021-02-19 Luiz F. O. Chamon, Alejandro Ribeiro

Understanding Generalization in Deep Learning via Tensor Methods

Deep neural networks generalize well on unseen data though the number of parameters often far exceeds the number of training examples. Recently proposed complexity measures have provided insights to understanding the generalizability in…

Machine Learning · Computer Science 2020-05-12 Jingling Li, Yanchao Sun, Jiahao Su, Taiji Suzuki, Furong Huang

On statistical learning via the lens of compression

This work continues the study of the relationship between sample compression schemes and statistical learning, which has been mostly investigated within the framework of binary classification. The central theme of this work is establishing…

Machine Learning · Computer Science 2017-01-02 Ofir David, Shay Moran, Amir Yehudayoff

To understand deep learning we need to understand kernel learning

Generalization performance of classifiers in deep learning has recently become a subject of intense study. Deep models, typically over-parametrized, tend to fit the training data exactly. Despite this "overfitting", they perform well on…

Machine Learning · Statistics 2018-06-18 Mikhail Belkin, Siyuan Ma, Soumik Mandal

Learnability, Sample Complexity, and Hypothesis Class Complexity for Regression Models

The goal of a learning algorithm is to receive a training data set as input and provide a hypothesis that can generalize to all possible data points from a domain set. The hypothesis is chosen from hypothesis classes with potentially…

Machine Learning · Statistics 2023-03-29 Soosan Beheshti, Mahdi Shamsi

Approximation and generalization properties of the random projection classification method

The generalization gap of a classifier is related to the complexity of the set of functions among which the classifier is chosen. We study a family of low-complexity classifiers consisting of thresholding a random one-dimensional feature.…

Machine Learning · Computer Science 2024-09-12 Mireille Boutin, Evzenie Coupkova

Compressed and Penalized Linear Regression

Modern applications require methods that are computationally feasible on large datasets but also preserve statistical efficiency. Frequently, these two concerns are seen as contradictory: approximation methods that enable computation are…

Methodology · Statistics 2021-06-11 Darren Homrighausen, Daniel J. McDonald

Compression, Generalization and Learning

A compression function is a map that slims down an observational set into a subset of reduced size, while preserving its informational content. In multiple applications, the condition that one new observation makes the compressed set change…

Machine Learning · Computer Science 2024-01-09 Marco C. Campi, Simone Garatti

A Survey of Learning on Small Data: Generalization, Optimization, and Challenge

Learning on big data brings success for artificial intelligence (AI), but the annotation and training costs are expensive. In future, learning on small data that approximates the generalization ability of big data is one of the ultimate…

Machine Learning · Computer Science 2023-06-07 Xiaofeng Cao, Weixin Bu, Shengjun Huang, Minling Zhang, Ivor W. Tsang, Yew Soon Ong, James T. Kwok

Minimum Description Length and Generalization Guarantees for Representation Learning

A major challenge in designing efficient statistical supervised learning algorithms is finding representations that perform well not only on available training samples but also on unseen data. While the study of representation learning has…

Machine Learning · Statistics 2024-02-06 Milad Sefidgaran, Abdellatif Zaidi, Piotr Krasnowski

Novel and Efficient Approximations for Zero-One Loss of Linear Classifiers

The predictive quality of machine learning models is typically measured in terms of their (approximate) expected prediction accuracy or the so-called Area Under the Curve (AUC). Minimizing the reciprocals of these measures are the goals of…

Machine Learning · Statistics 2019-03-04 Hiva Ghanbari, Minhan Li, Katya Scheinberg

When Hardness of Approximation Meets Hardness of Learning

A supervised learning algorithm has access to a distribution of labeled examples, and needs to return a function (hypothesis) that correctly labels the examples. The hypothesis of the learner is taken from some fixed class of functions…

Machine Learning · Computer Science 2020-08-25 Eran Malach, Shai Shalev-Shwartz

Tighter Learning Guarantees on Digital Computers via Concentration of Measure on Finite Spaces

Machine learning models with inputs in a Euclidean space $\mathbb{R}^d$, when implemented on digital computers, generalize, and their generalization gap converges to $0$ at a rate of $c/N^{1/2}$ concerning the sample size $N$. However, the…

Machine Learning · Computer Science 2026-05-14 Anastasis Kratsios, A. Martina Neuman, Gudmund Pammer

PAC-Bayes Compression Bounds So Tight That They Can Explain Generalization

While there has been progress in developing non-vacuous generalization bounds for deep neural networks, these bounds tend to be uninformative about why deep learning works. In this paper, we develop a compression approach based on…

Machine Learning · Computer Science 2022-11-28 Sanae Lotfi, Marc Finzi, Sanyam Kapoor, Andres Potapczynski, Micah Goldblum, Andrew Gordon Wilson