Related papers: On Infinite-Width Hypernetworks

200 papers

Deep neural networks' remarkable ability to correctly fit training data when optimized by gradient-based algorithms is yet to be fully understood. Recent theoretical results explain the convergence for ReLU networks that are wider than…

Machine Learning · Computer Science 2021-02-09 Asaf Noy, Yi Xu, Yonathan Aflalo, Lihi Zelnik-Manor, Rong Jin

This work explores hypernetworks: an approach of using one network, also known as a hypernetwork, to generate the weights for another network. Hypernetworks provide an abstraction that is similar to what is found in nature: the…

Machine Learning · Computer Science 2016-12-02 David Ha, Andrew Dai, Quoc V. Le

While deep learning is successful in a number of applications, it is not yet well understood theoretically. A satisfactory theoretical characterization of deep learning, however, is beginning to emerge. It covers the following questions: 1)…

Machine Learning · Computer Science 2019-08-27 Tomaso Poggio, Andrzej Banburski, Qianli Liao

This theoretical paper is devoted to developing a rigorous theory for demystifying the global convergence phenomenon in a challenging scenario: learning over-parameterized Rectified Linear Unit (ReLU) nets for very high dimensional dataset…

Machine Learning · Computer Science 2022-06-08 Peng He

We study the realization map of deep ReLU networks, focusing on when a function determines its parameters up to scaling and permutation. To analyze hidden redundancies beyond these standard symmetries, we introduce a framework based on…

Machine Learning · Computer Science 2026-05-21 Moritz Grillo, Guido Montúfar

Overparametrization is a key factor in the absence of convexity to explain global convergence of gradient descent (GD) for neural networks. Besides the well-studied lazy regime, infinite width (mean field) analysis has been developed for…

Neural and Evolutionary Computing · Computer Science 2023-02-07 Raphaël Barboni, Gabriel Peyré, François-Xavier Vialard

We develop a convex analytic approach to analyze finite width two-layer ReLU networks. We first prove that an optimal solution to the regularized training problem can be characterized as extreme points of a convex set, where simple…

Machine Learning · Computer Science 2021-09-01 Tolga Ergen, Mert Pilanci

We draw connections between simple neural networks and under-determined linear systems to comprehensively explore several interesting theoretical questions in the study of neural networks. First, we emphatically show that it is unsurprising…

Numerical Analysis · Mathematics 2020-11-02 Austin R. Benson, Anil Damle, Alex Townsend

Hypernetworks, or hypernets for short, are neural networks that generate weights for another neural network, known as the target network. They have emerged as a powerful deep learning technique that allows for greater flexibility,…

Machine Learning · Computer Science 2025-01-03 Vinod Kumar Chauhan, Jiandong Zhou, Ping Lu, Soheila Molaei, David A. Clifton

Neural networks often operate in the overparameterized regime, in which there are far more parameters than training samples, allowing the training data to be fit perfectly. That is, training the network effectively learns an interpolating…

Machine Learning · Computer Science 2025-03-19 Suzanna Parkinson, Greg Ongie, Rebecca Willett

Many modern neural network architectures are trained in an overparameterized regime where the parameters of the model exceed the size of the training dataset. Sufficiently overparameterized neural network architectures in principle have the…

Machine Learning · Computer Science 2019-02-14 Samet Oymak, Mahdi Soltanolkotabi

We present a framework to define a large class of neural networks for which, by construction, training by gradient flow provably reaches arbitrarily low loss when the number of parameters grows. Distinct from the fixed-space global…

Optimization and Control · Mathematics 2025-01-13 David A. R. Robin, Kevin Scaman, Marc Lelarge

Recently, deep learning approaches with various network architectures have achieved significant performance improvement over existing iterative reconstruction methods in various imaging problems. However, it is still unclear why these deep…

Machine Learning · Statistics 2018-01-26 Jong Chul Ye, Yoseob Han, Eunju Cha

In practice, multi-task learning (through learning features shared among tasks) is an essential property of deep neural networks (NNs). While infinite-width limits of NNs can provide good intuition for their generalization behavior, the…

Machine Learning · Computer Science 2022-10-21 Jakob Heiss, Josef Teichmann, Hanna Wutte

Overparameterized fully-connected neural networks have been shown to behave like kernel models when trained with gradient descent, under mild conditions on the width, the learning rate, and the parameter initialization. In the limit of…

Machine Learning · Computer Science 2025-11-11 William St-Arnaud, Margarida Carvalho, Golnoosh Farnadi

This extended abstract describes a framework for analyzing the expressiveness, learning, and (structural) generalization of hypergraph neural networks (HyperGNNs). Specifically, we focus on how HyperGNNs can learn from finite datasets and…

Machine Learning · Computer Science 2023-03-10 Zhezheng Luo, Jiayuan Mao, Joshua B. Tenenbaum, Leslie Pack Kaelbling

Large pre-trained models, or foundation models, have shown impressive performance when adapted to a variety of downstream tasks, often out-performing specialized models. Hypernetworks, neural networks that generate some or all of the…

Machine Learning · Computer Science 2025-03-04 Jeffrey Gu, Serena Yeung-Levy

This work focuses on the analysis of fully connected feed forward ReLU neural networks as they approximate a given, smooth function. In contrast to conventionally studied universal approximation properties under increasing architectures,…

Machine Learning · Computer Science 2024-06-24 Erion Morina, Martin Holler

Metanetworks are neural architectures designed to operate directly on pretrained weights to perform downstream tasks. However, the parameter space serves only as a proxy for the underlying function class, and the parameter-function mapping…

Machine Learning · Computer Science 2026-04-28 Viet-Hoang Tran, An Nguyen, Benoît Guérand, Thieu N. Vo, Tan M. Nguyen

Implicit neural networks have become increasingly attractive in the machine learning community since they can achieve competitive performance but use much less computational resources. Recently, a line of theoretical works established the…

Machine Learning · Computer Science 2022-10-03 Tianxiang Gao, Hongyang Gao