Related papers: Memory-based Parameter Adaptation

Accelerated Training via Incrementally Growing Neural Networks using Variance Transfer and Learning Rate Adaptation

We develop an approach to efficiently grow neural networks, within which parameterization and optimization strategies are designed by considering their effects on the training dynamics. Unlike existing growing methods, which follow simple…

Machine Learning · Computer Science 2023-06-23 Xin Yuan, Pedro Savarese, Michael Maire

DecisiveNets: Training Deep Associative Memories to Solve Complex Machine Learning Problems

Learning deep representations to solve complex machine learning tasks has become the prominent trend in the past few years. Indeed, Deep Neural Networks are now the gold standard in domains as various as computer vision, natural language…

Machine Learning · Computer Science 2020-12-04 Vincent Gripon, Carlos Lassance, Ghouthi Boukli Hacene

Learning Neural Models for Natural Language Processing in the Face of Distributional Shift

The dominating NLP paradigm of training a strong neural predictor to perform one task on a specific dataset has led to state-of-the-art performance in a variety of applications (e.g., sentiment classification, span-prediction based question…

Computation and Language · Computer Science 2021-09-06 Paul Michel

Sparse Meta Networks for Sequential Adaptation and its Application to Adaptive Language Modelling

Training a deep neural network requires a large amount of single-task data and involves a long time-consuming optimization phase. This is not scalable to complex, realistic environments with new unexpected changes. Humans can perform fast…

Neural and Evolutionary Computing · Computer Science 2020-09-04 Tsendsuren Munkhdalai

Neural networks with image recognition by pairs

Neural networks based on metric recognition methods have a strictly determined architecture. Number of neurons, connections, as well as weights and thresholds values are calculated analytically, based on the initial conditions of tasks:…

Neural and Evolutionary Computing · Computer Science 2025-06-10 Polad Geidarov

Mildly Overparametrized Neural Nets can Memorize Training Data Efficiently

It has been observed (Zhang et al., 2016) that deep neural networks can memorize: they achieve 100% accuracy on training data. Recent theoretical results explained such behavior in highly overparametrized regimes, where the…

Machine Learning · Computer Science 2019-09-27 Rong Ge, Runzhe Wang, Haoyu Zhao

Continual Learning of Context-dependent Processing in Neural Networks

Deep neural networks (DNNs) are powerful tools in learning sophisticated but fixed mapping rules between inputs and outputs, thereby limiting their application in more complex and dynamic situations in which the mapping rules are not kept…

Machine Learning · Computer Science 2021-06-29 Guanxiong Zeng, Yang Chen, Bo Cui, Shan Yu

Concept Learning through Deep Reinforcement Learning with Memory-Augmented Neural Networks

Deep neural networks have shown superior performance in many regimes to remember familiar patterns with large amounts of data. However, the standard supervised deep learning paradigm is still limited when facing the need to learn new…

Machine Learning · Computer Science 2018-11-16 Jing Shi, Jiaming Xu, Yiqun Yao, Bo Xu

Neural Priming for Sample-Efficient Adaptation

We propose Neural Priming, a technique for adapting large pretrained models to distribution shifts and downstream tasks given few or no labeled examples. Presented with class names or unlabeled test samples, Neural Priming enables the model…

Machine Learning · Computer Science 2023-12-06 Matthew Wallingford, Vivek Ramanujan, Alex Fang, Aditya Kusupati, Roozbeh Mottaghi, Aniruddha Kembhavi, Ludwig Schmidt, Ali Farhadi

Continual Learning with Pretrained Backbones by Tuning in the Input Space

The intrinsic difficulty in adapting deep learning models to non-stationary environments limits the applicability of neural networks to real-world tasks. This issue is critical in practical supervised learning settings, such as the ones in…

Machine Learning · Computer Science 2023-06-09 Simone Marullo, Matteo Tiezzi, Marco Gori, Stefano Melacci, Tinne Tuytelaars

Domain Adaptive Transfer Learning with Specialist Models

Transfer learning is a widely used method to build high performing computer vision models. In this paper, we study the efficacy of transfer learning by examining how the choice of data impacts performance. We find that more pre-training…

Computer Vision and Pattern Recognition · Computer Science 2018-12-13 Jiquan Ngiam, Daiyi Peng, Vijay Vasudevan, Simon Kornblith, Quoc V. Le, Ruoming Pang

Neural Networks for Parameter Estimation in Intractable Models

We propose to use deep learning to estimate parameters in statistical models when standard likelihood estimation methods are computationally infeasible. We show how to estimate parameters from max-stable processes, where inference is…

Methodology · Statistics 2021-08-02 Amanda Lenzi, Julie Bessac, Johann Rudi, Michael L. Stein

Meta-learning of Sequential Strategies

In this report we review memory-based meta-learning as a tool for building sample-efficient strategies that learn from past experience to adapt to any task within a target class. Our goal is to equip the reader with the conceptual…

Machine Learning · Computer Science 2019-07-22 Pedro A. Ortega, Jane X. Wang, Mark Rowland, Tim Genewein, Zeb Kurth-Nelson, Razvan Pascanu, Nicolas Heess, Joel Veness, Alex Pritzel, Pablo Sprechmann, Siddhant M. Jayakumar, Tom McGrath, Kevin Miller, Mohammad Azar, Ian Osband, Neil Rabinowitz, András György, Silvia Chiappa, Simon Osindero, Yee Whye Teh, Hado van Hasselt, Nando de Freitas, Matthew Botvinick, Shane Legg

AdapterNet - learning input transformation for domain adaptation

Deep neural networks have demonstrated impressive performance in various machine learning tasks. However, they are notoriously sensitive to changes in data distribution. Often, even a slight change in the distribution can lead to drastic…

Computer Vision and Pattern Recognition · Computer Science 2018-11-16 Alon Hazan, Yoel Shoshan, Daniel Khapun, Roy Aladjem, Vadim Ratner

A disciplined approach to neural network hyper-parameters: Part 1 -- learning rate, batch size, momentum, and weight decay

Although deep learning has produced dazzling successes for applications of image, speech, and video processing in the past few years, most trainings are with suboptimal hyper-parameters, requiring unnecessarily long training times. Setting…

Machine Learning · Computer Science 2018-04-25 Leslie N. Smith

Self-adaptive weights based on balanced residual decay rate for physics-informed neural networks and deep operator networks

Physics-informed deep learning has emerged as a promising alternative for solving partial differential equations. However, for complex problems, training these networks can still be challenging, often resulting in unsatisfactory accuracy…

Machine Learning · Computer Science 2025-09-18 Wenqian Chen, Amanda A. Howard, Panos Stinis

Deep Learning in Target Space

Deep learning uses neural networks which are parameterised by their weights. The neural networks are usually trained by tuning the weights to directly minimise a given loss function. In this paper we propose to re-parameterise the weights…

Neural and Evolutionary Computing · Computer Science 2022-03-14 Michael Fairbank, Spyridon Samothrakis, Luca Citi

Introspection: Accelerating Neural Network Training By Learning Weight Evolution

Neural Networks are function approximators that have achieved state-of-the-art accuracy in numerous machine learning tasks. In spite of their great success in terms of accuracy, their large training time makes it difficult to use them for…

Machine Learning · Computer Science 2017-04-18 Abhishek Sinha, Mausoom Sarkar, Aahitagni Mukherjee, Balaji Krishnamurthy

Deep supervised learning using local errors

Error backpropagation is a highly effective mechanism for learning high-quality hierarchical features in deep networks. Updating the features or weights in one layer, however, requires waiting for the propagation of error signals from…

Neural and Evolutionary Computing · Computer Science 2017-11-21 Hesham Mostafa, Vishwajith Ramesh, Gert Cauwenberghs

Adapting by Pruning: A Case Study on BERT

Adapting pre-trained neural models to downstream tasks has become the standard practice for obtaining high-quality models. In this work, we propose a novel model adaptation paradigm, adapting by pruning, which prunes neural connections in…

Machine Learning · Computer Science 2021-05-10 Yang Gao, Nicolo Colombo, Wei Wang