English
Related papers

Related papers: Every Model Learned by Gradient Descent Is Approxi…

200 papers

Significant theoretical work has established that in specific regimes, neural networks trained by gradient descent behave like kernel methods. However, in practice, it is known that neural networks strongly outperform their associated…

Machine Learning · Computer Science 2022-07-01 Alex Damian , Jason D. Lee , Mahdi Soltanolkotabi

Deep networks are often considered to be more expressive than shallow ones in terms of approximation. Indeed, certain functions can be approximated by deep networks provably more efficiently than by shallow ones, however, no tractable…

Machine Learning · Statistics 2021-08-27 Alberto Bietti , Francis Bach

Deep kernel learning provides an elegant and principled framework for combining the structural properties of deep learning algorithms with the flexibility of kernel methods. By means of a deep neural network, we learn a parametrized kernel…

Machine Learning · Computer Science 2020-12-14 Prudencio Tossou , Basile Dura , Francois Laviolette , Mario Marchand , Alexandre Lacoste

Deep learning has shown promising results in many machine learning applications. The hierarchical feature representation built by deep networks enable compact and precise encoding of the data. A kernel analysis of the trained deep networks…

Machine Learning · Computer Science 2017-03-22 Mandar Kulkarni , Shirish Karande

An open question in the Deep Learning community is why neural networks trained with Gradient Descent generalize well on real datasets even though they are capable of fitting random data. We propose an approach to answering this question…

Machine Learning · Computer Science 2020-02-26 Satrajit Chatterjee

In recent years neural networks have achieved impressive results on many technological and scientific tasks. Yet, the mechanism through which these models automatically select features, or patterns in data, for prediction remains unclear.…

Machine Learning · Computer Science 2023-05-11 Adityanarayanan Radhakrishnan , Daniel Beaglehole , Parthe Pandit , Mikhail Belkin

We study the relative power of learning with gradient descent on differentiable models, such as neural networks, versus using the corresponding tangent kernels. We show that under certain conditions, gradient descent achieves small error…

Machine Learning · Computer Science 2021-03-02 Eran Malach , Pritish Kamath , Emmanuel Abbe , Nathan Srebro

The ability of learning useful features is one of the major advantages of neural networks. Although recent works show that neural network can operate in a neural tangent kernel (NTK) regime that does not allow feature learning, many works…

Machine Learning · Computer Science 2024-11-06 Mo Zhou , Rong Ge

In this article a surprising result is demonstrated using the neural tangent kernel. This kernel is defined as the inner product of the vector of the gradient of an underlying model evaluated at training points. This kernel is used to…

Artificial Intelligence · Computer Science 2021-04-14 Matt Calder

It is well understood that neural networks with carefully hand-picked weights provide powerful function approximation and that they can be successfully trained in over-parametrized regimes. Since over-parametrization ensures zero training…

Machine Learning · Computer Science 2024-05-21 G. Welper

Several recent works have shown separation results between deep neural networks, and hypothesis classes with inferior approximation capacity such as shallow networks or kernel classes. On the other hand, the fact that deep networks can…

Machine Learning · Computer Science 2021-07-20 Eran Malach , Gilad Yehudai , Shai Shalev-Shwartz , Ohad Shamir

The impressive practical performance of neural networks is often attributed to their ability to learn low-dimensional data representations and hierarchical structure directly from data. In this work, we argue that these two phenomena are…

Machine Learning · Statistics 2025-10-06 Libin Zhu , Damek Davis , Dmitriy Drusvyatskiy , Maryam Fazel

One of the most important parts of Artificial Neural Networks is minimizing the loss functions which tells us how good or bad our model is. To minimize these losses we need to tune the weights and biases. Also to calculate the minimum value…

Machine Learning · Computer Science 2021-01-08 Kaustubh Yadav

Deep kernel learning aims at designing nonlinear combinations of multiple standard elementary kernels by training deep networks. This scheme has proven to be effective, but intractable when handling large-scale datasets especially when the…

Computer Vision and Pattern Recognition · Computer Science 2018-05-01 Mingyuan Jiu , Hichem Sahbi

Recurrent neural network is a powerful model that learns temporal patterns in sequential data. For a long time, it was believed that recurrent networks are difficult to train using simple optimizers, such as stochastic gradient descent, due…

Neural and Evolutionary Computing · Computer Science 2015-04-20 Tomas Mikolov , Armand Joulin , Sumit Chopra , Michael Mathieu , Marc'Aurelio Ranzato

Deep learning algorithms demonstrate a surprising ability to learn high-dimensional tasks from limited examples. This is commonly attributed to the depth of neural networks, enabling them to build a hierarchy of abstract, low-dimensional…

Machine Learning · Computer Science 2024-07-04 Francesco Cagnetta , Leonardo Petrini , Umberto M. Tomasini , Alessandro Favero , Matthieu Wyart

While gradient descent has proven highly successful in learning connection weights for neural networks, the actual structure of these networks is usually determined by hand, or by other optimization algorithms. Here we describe a simple…

Neural and Evolutionary Computing · Computer Science 2016-08-09 Thomas Miconi

Deep learning methods have predominantly been applied to large artificial neural networks. Despite their state-of-the-art performance, these large networks typically do not generalize well to datasets with limited sample sizes. In this…

Machine Learning · Statistics 2016-11-17 Eric Strobl , Shyam Visweswaran

We address the challenging problem of deep representation learning--the efficient adaption of a pre-trained deep network to different tasks. Specifically, we propose to explore gradient-based features. These features are gradients of the…

Machine Learning · Computer Science 2020-04-14 Fangzhou Mu , Yingyu Liang , Yin Li

It has been observed that design choices of neural networks are often crucial for their successful optimization. In this article, we therefore discuss the question if it is always possible to redesign a neural network so that it trains well…

Machine Learning · Computer Science 2020-07-28 G. Welper
‹ Prev 1 2 3 10 Next ›