Related papers: Function-Space Learning Rates

200 papers

Sequential learning paradigms pose challenges for gradient-based deep learning due to difficulties incorporating new data and retaining prior knowledge. While Gaussian processes elegantly tackle these problems, they struggle with…

Machine Learning · Statistics 2024-03-19 Aidan Scannell, Riccardo Mereu, Paul Chang, Ella Tamir, Joni Pajarinen, Arno Solin

Reinforcement learning (RL) is well known for requiring large amounts of data in order for RL agents to learn to perform complex tasks. Recent progress in model-based RL allows agents to be much more data-efficient, as it enables them to…

Machine Learning · Computer Science 2021-08-17 Remo Sasso, Matthia Sabatelli, Marco A. Wiering

Trainable layers such as convolutional building blocks are the standard network design choices by learning parameters to capture the global context through successive spatial operations. When designing an efficient network, trainable layers…

Computer Vision and Pattern Recognition · Computer Science 2022-03-22 Dongyoon Han, YoungJoon Yoo, Beomyoung Kim, Byeongho Heo

Feature learning is widely regarded as the key mechanism distinguishing neural networks from fixed-kernel methods, yet its impact on the induced function space remains poorly understood. In this work, we precisely characterize how the…

Machine Learning · Statistics 2026-05-19 João Lobo, Bruno Loureiro, Long Tran-Than, Fanghui Liu

Low-rank metric learning aims to learn better discrimination of data subject to low-rank constraints. It keeps the intrinsic low-rank structure of datasets and reduces the time cost and memory usage in metric learning. However, it is still…

Machine Learning · Computer Science 2019-09-16 Han Liu, Zhizhong Han, Yu-Shen Liu, Ming Gu

Learning from small amounts of labeled data is a challenge in the area of deep learning. This is currently addressed by Transfer Learning where one learns the small data set as a transfer task from a larger source dataset. Transfer Learning…

Computer Vision and Pattern Recognition · Computer Science 2018-07-31 Parijat Dube, Bishwaranjan Bhattacharjee, Elisabeth Petit-Bois, Matthew Hill

Meta-learning consists in learning learning algorithms. We use a Long Short Term Memory (LSTM) based network to learn to compute on-line updates of the parameters of another neural network. These parameters are stored in the cell state of…

Machine Learning · Computer Science 2016-10-20 Tom Bosc

Trans-dimensional random field language models (TRF LMs), where sentences are modeled as a collection of random fields, have shown close performance with LSTM LMs in speech recognition and are computationally more efficient in inference.…

Computation and Language · Computer Science 2017-10-31 Bin Wang, Zhijian Ou

Large language models (LLMs) are trained for downstream tasks by updating their parameters (e.g., via RL). However, updating parameters forces them to absorb task-specific information, which can result in catastrophic forgetting and loss of…

We introduce the Fourier Learning Machine (FLM), a neural network (NN) architecture designed to represent a multidimensional nonharmonic Fourier series. The FLM uses a simple feedforward structure with cosine activation functions to learn…

Machine Learning · Computer Science 2026-03-20 Mominul Rubel, Adam Meyers, Gabriel Nicolosi

In recent years, Large Language Models (LLMs) through Transformer structures have dominated many machine learning tasks, especially text processing. However, these models require massive amounts of data for training and induce high resource…

Machine Learning · Computer Science 2025-04-17 Kilian Pfeiffer, Mohamed Aboelenien Ahmed, Ramin Khalili, Jörg Henkel

State-of-the-art rehearsal-free continual learning methods exploit the peculiarities of Vision Transformers to learn task-specific prompts, drastically reducing catastrophic forgetting. However, there is a tradeoff between the number of…

Computer Vision and Pattern Recognition · Computer Science 2023-08-21 Thomas De Min, Massimiliano Mancini, Karteek Alahari, Xavier Alameda-Pineda, Elisa Ricci

We propose a simple architecture for deep reinforcement learning by embedding inputs into a learned Fourier basis and show that it improves the sample efficiency of both state-based and image-based RL. We perform infinite-width analysis of…

Machine Learning · Computer Science 2021-12-07 Alexander C. Li, Deepak Pathak

We propose layer saturation - a simple, online-computable method for analyzing the information processing in neural networks. First, we show that a layer's output can be restricted to the eigenspace of its variance matrix without…

Machine Learning · Computer Science 2021-11-23 Mats L. Richter, Justin Shenk, Wolf Byttner, Anders Arpteg, Mikael Huss

The use of transfer learning (TL) techniques has become common practice in fields such as computer vision (CV) and natural language processing (NLP). Leveraging prior knowledge gained from data with different distributions, TL offers higher…

Signal Processing · Electrical Eng. & Systems 2022-06-17 Lauren J. Wong, Sean McPherson, Alan J. Michaels

The function space of deep-learning machines is investigated by studying growth in the entropy of functions of a given error with respect to a reference function, realized by a deep-learning machine. Using physics-inspired methods we study…

Disordered Systems and Neural Networks · Physics 2018-08-10 Bo Li, David Saad

The push to train ever larger neural networks has motivated the study of initialization and training at large network width. A key challenge is to scale training so that a network's internal representations evolve nontrivially at all…

Machine Learning · Computer Science 2024-05-15 Greg Yang, James B. Simon, Jeremy Bernstein

We study the relationship between the frequency of a function and the speed at which a neural network learns it. We build on recent results that show that the dynamics of overparameterized neural networks trained with gradient descent can…

Machine Learning · Computer Science 2019-12-03 Ronen Basri, David Jacobs, Yoni Kasten, Shira Kritchman

Many approximations were suggested to circumvent the cubic complexity of kernel-based algorithms, allowing their application to large-scale datasets. One strategy is to consider the primal formulation of the learning problem by mapping the…

Machine Learning · Computer Science 2025-12-03 Albert Saiapin, Kim Batselier

In continual learning, networks confront a trade-off between stability and plasticity when trained on a sequence of tasks. To bolster plasticity without sacrificing stability, we propose a novel training algorithm called LRFR. This approach…

Machine Learning · Computer Science 2023-12-15 Zhenrong Liu, Yang Li, Yi Gong, Yik-Chung Wu