Related papers: An Information-Theoretic Framework for Supervised …

Information-Theoretic Framework for Understanding Modern Machine-Learning

We introduce an information-theoretic framework that views learning as universal prediction under log loss, characterized through regret bounds. Central to the framework is an effective notion of architecture-based model complexity, defined…

Machine Learning · Computer Science 2025-11-04 Meir Feder, Ruediger Urbanke, Yaniv Fogel

Unveiling the Training Dynamics of ReLU Networks through a Linear Lens

Deep neural networks, particularly those employing Rectified Linear Units (ReLU), are often perceived as complex, high-dimensional, non-linear systems. This complexity poses a significant challenge to understanding their internal learning…

Machine Learning · Computer Science 2025-11-11 Longqing Ye

Sparsity-aware generalization theory for deep neural networks

Deep artificial neural networks achieve surprising generalization abilities that remain poorly understood. In this paper, we present a new approach to analyzing generalization for deep feed-forward ReLU networks that takes advantage of the…

Machine Learning · Computer Science 2023-07-06 Ramchandran Muthukumar, Jeremias Sulam

A Unified Information-Theoretic Framework for Meta-Learning Generalization

In recent years, information-theoretic generalization bounds have gained increasing attention for analyzing the generalization capabilities of meta-learning algorithms. However, existing results are confined to two-step bounds, failing to…

Machine Learning · Statistics 2025-10-14 Wen Wen, Tieliang Gong, Yuxin Dong, Zeyu Gao, Yong-Jin Liu

A theoretical framework for deep locally connected ReLU network

Understanding theoretical properties of deep and locally connected nonlinear networks, such as deep convolutional neural networks (DCNN), is still a hard problem despite their empirical success. In this paper, we propose a novel theoretical…

Machine Learning · Computer Science 2018-10-01 Yuandong Tian

Make Haste Slowly: A Theory of Emergent Structured Mixed Selectivity in Feature Learning ReLU Networks

In spite of finite dimension ReLU neural networks being a consistent factor behind recent deep learning successes, a theory of feature learning in these models remains elusive. Currently, insightful theories still rely on assumptions…

Machine Learning · Computer Science 2025-04-01 Devon Jarvis, Richard Klein, Benjamin Rosman, Andrew M. Saxe

A Study of the Mathematics of Deep Learning

"Deep Learning"/"Deep Neural Nets" is a technological marvel that is now increasingly deployed at the cutting-edge of artificial intelligence tasks. This dramatic success of deep learning in the last few years has been hinged on an enormous…

Machine Learning · Computer Science 2021-04-30 Anirbit Mukherjee

Theoretical analysis of deep neural networks for temporally dependent observations

Deep neural networks are powerful tools to model observations over time with non-linear patterns. Despite the widespread use of neural networks in such settings, most theoretical developments of deep neural networks are under the assumption…

Machine Learning · Statistics 2022-10-24 Mingliang Ma, Abolfazl Safikhani

On the Global Convergence of Fitted Q-Iteration with Two-layer Neural Network Parametrization

Deep Q-learning based algorithms have been applied successfully in many decision making problems, while their theoretical foundations are not as well understood. In this paper, we study a Fitted Q-Iteration with two-layer ReLU neural…

Machine Learning · Computer Science 2023-02-01 Mudit Gaur, Vaneet Aggarwal, Mridul Agarwal

Injectivity of ReLU-layers: Tools from Frame Theory

Injectivity is the defining property of a mapping that ensures no information is lost and any input can be perfectly reconstructed from its output. By performing hard thresholding, the ReLU function naturally interferes with this property,…

Machine Learning · Computer Science 2024-12-02 Daniel Haider, Martin Ehler, Peter Balazs

Computational Complexity of Learning Neural Networks: Smoothness and Degeneracy

Understanding when neural networks can be learned efficiently is a fundamental question in learning theory. Existing hardness results suggest that assumptions on both the input distribution and the network's weights are necessary for…

Machine Learning · Computer Science 2023-10-05 Amit Daniely, Nathan Srebro, Gal Vardi

Diversified Ensemble of Independent Sub-Networks for Robust Self-Supervised Representation Learning

Ensembling a neural network is a widely recognized approach to enhance model performance, estimate uncertainty, and improve robustness in deep supervised learning. However, deep ensembles often come with high computational costs and memory…

Machine Learning · Statistics 2023-09-04 Amirhossein Vahidi, Lisa Wimmer, Hüseyin Anil Gündüz, Bernd Bischl, Eyke Hüllermeier, Mina Rezaei

Information-Theoretic Foundations for Machine Learning

The progress of machine learning over the past decade is undeniable. In retrospect, it is both remarkable and unsettling that this progress was achievable with little to no rigorous theory to guide experimentation. Despite this fact,…

Machine Learning · Statistics 2025-05-23 Hong Jun Jeon, Benjamin Van Roy

Deep Learning Through the Lens of Example Difficulty

Existing work on understanding deep learning often employs measures that compress all data-dependent information into a few numbers. In this work, we adopt a perspective based on the role of individual examples. We introduce a measure of…

Machine Learning · Computer Science 2021-06-21 Robert J. N. Baldock, Hartmut Maennel, Behnam Neyshabur

Generalization analysis with deep ReLU networks for metric and similarity learning

While metric and similarity learning has been extensively studied from several theoretical perspectives, a rigorous understanding of its generalization performance is still lacking. In this paper, we investigate the generalization behavior…

Machine Learning · Statistics 2026-05-19 Junyu Zhou, Puyu Wang, Ding-Xuan Zhou

Sparse Deep Learning: A New Framework Immune to Local Traps and Miscalibration

Deep learning has powered recent successes of artificial intelligence (AI). However, the deep neural network, as the basic model of deep learning, has suffered from issues such as local traps and miscalibration. In this paper, we provide a…

Machine Learning · Statistics 2021-12-03 Yan Sun, Wenjun Xiong, Faming Liang

Realization of spatial sparseness by deep ReLU nets with massive data

The great success of deep learning poses urgent challenges for understanding its working mechanism and rationality. The depth, structure, and massive size of the data are recognized to be three key ingredients for deep learning. Most of the…

Machine Learning · Computer Science 2019-12-17 Charles K. Chui, Shao-Bo Lin, Bo Zhang, Ding-Xuan Zhou

Theoretical Issues in Deep Networks: Approximation, Optimization and Generalization

While deep learning is successful in a number of applications, it is not yet well understood theoretically. A satisfactory theoretical characterization of deep learning, however, is beginning to emerge. It covers the following questions: 1)…

Machine Learning · Computer Science 2019-08-27 Tomaso Poggio, Andrzej Banburski, Qianli Liao

ReLU Neural Networks with Linear Layers are Biased Towards Single- and Multi-Index Models

Neural networks often operate in the overparameterized regime, in which there are far more parameters than training samples, allowing the training data to be fit perfectly. That is, training the network effectively learns an interpolating…

Machine Learning · Computer Science 2025-03-19 Suzanna Parkinson, Greg Ongie, Rebecca Willett

A Convergence Theory Towards Practical Over-parameterized Deep Neural Networks

Deep neural networks' remarkable ability to correctly fit training data when optimized by gradient-based algorithms is yet to be fully understood. Recent theoretical results explain the convergence for ReLU networks that are wider than…

Machine Learning · Computer Science 2021-02-09 Asaf Noy, Yi Xu, Yonathan Aflalo, Lihi Zelnik-Manor, Rong Jin