Related papers: Layerwise Optimization by Gradient Decomposition f…

200 papers

In neural networks, continual learning results in gradient interference among sequential tasks, leading to catastrophic forgetting of old tasks while learning new ones. This issue is addressed in recent methods by storing the important…

Machine Learning · Computer Science 2023-02-06 Gobinda Saha, Kaushik Roy

Neural networks are achieving state of the art and sometimes super-human performance on learning tasks across a variety of domains. Whenever these problems require learning in a continual or sequential manner, however, neural networks…

Machine Learning · Computer Science 2019-10-17 Mehrdad Farajtabar, Navid Azizan, Alex Mott, Ang Li

The ability to learn continually without forgetting the past tasks is a desired attribute for artificial learning systems. Existing approaches to enable such learning in artificial neural networks usually rely on network growth, importance…

Machine Learning · Computer Science 2021-03-18 Gobinda Saha, Isha Garg, Kaushik Roy

Multi-task learning (MTL) aims at solving multiple related tasks simultaneously and has experienced rapid growth in recent years. However, MTL models often suffer from performance degeneration with negative transfer due to learning several…

Machine Learning · Computer Science 2023-02-01 Xin Dong, Ruize Wu, Chao Xiong, Hai Li, Lei Cheng, Yong He, Shiyou Qian, Jian Cao, Linjian Mo

The current deep learning model is of a single-grade, that is, it learns a deep neural network by solving a single nonconvex optimization problem. When the layer number of the neural network is large, it is computationally challenging to…

Machine Learning · Computer Science 2023-02-02 Yuesheng Xu

Learning an efficient update rule from data that promotes rapid learning of new tasks from the same distribution remains an open problem in meta-learning. Typically, previous works have approached this issue either by attempting to train a…

Machine Learning · Computer Science 2020-02-19 Sebastian Flennerhag, Andrei A. Rusu, Razvan Pascanu, Francesco Visin, Hujun Yin, Raia Hadsell

Multitask learning is a methodology to boost generalization performance and also reduce computational intensity and memory usage. However, learning multiple tasks simultaneously can be more difficult than learning a single task because it…

Machine Learning · Computer Science 2020-06-03 Sungjae Lee, Youngdoo Son

Training deep neural networks is a highly nontrivial task, involving carefully selecting appropriate training algorithms, scheduling step sizes and tuning other hyperparameters. Trying different combinations can be quite labor-intensive and…

Machine Learning · Computer Science 2017-06-13 Kaifeng Lv, Shunhua Jiang, Jian Li

Intermediate features at different layers of a deep neural network are known to be discriminative for visual patterns of different complexities. However, most existing works ignore such cross-layer heterogeneities when classifying samples…

Computer Vision and Pattern Recognition · Computer Science 2016-07-20 Xiaojie Jin, Yunpeng Chen, Jian Dong, Jiashi Feng, Shuicheng Yan

We propose a new technique that boosts the convergence of training generative adversarial networks. Generally, the rate of training deep models reduces severely after multiple iterations. A key reason for this phenomenon is that a deep…

Machine Learning · Statistics 2018-06-15 Atsushi Nitanda, Taiji Suzuki

Deep neural networks are a promising approach towards multi-task learning because of their capability to leverage knowledge across domains and learn general purpose representations. Nevertheless, they can fail to live up to these promises…

Machine Learning · Computer Science 2019-12-17 Mihai Suteu, Yike Guo

As deep learning models and datasets rapidly scale up, network training is extremely time-consuming and resource-costly. Instead of training on the entire dataset, learning with a small synthetic dataset becomes an efficient solution.…

Machine Learning · Computer Science 2022-08-02 Zixuan Jiang, Jiaqi Gu, Mingjie Liu, David Z. Pan

In this paper we introduce a novel method of gradient normalization and decay with respect to depth. Our method leverages the simple concept of normalizing all gradients in a deep neural network, and then decaying said gradients with…

Machine Learning · Computer Science 2018-03-01 Robert Kwiatkowski, Oscar Chang

Continual learning in deep neural networks often suffers from catastrophic forgetting, where representations for previous tasks are overwritten during subsequent training. We propose a novel sample retrieval strategy from the memory buffer…

Machine Learning · Computer Science 2024-12-20 Hongye Xu, Jan Wasilewski, Bartosz Krawczyk

Continual learning (CL) presents a fundamental challenge in training neural networks on sequential tasks without experiencing catastrophic forgetting. Traditionally, the dominant approach in CL has been gradient-based optimization, where…

Machine Learning · Computer Science 2025-04-03 Grzegorz Rypeść

Multilevel optimization has gained renewed interest in machine learning due to its promise in applications such as hyperparameter tuning and continual learning. However, existing methods struggle with the inherent difficulty of efficiently…

Machine Learning · Computer Science 2024-10-16 Yuntian Gu, Xuzheng Chen

Topological learning is a wide research area aiming at uncovering the mutual spatial relationships between the elements of a set. Some of the most common and oldest approaches involve the use of unsupervised competitive neural networks.…

Machine Learning · Statistics 2021-11-03 Pietro Barbiero, Gabriele Ciravegna, Vincenzo Randazzo, Giansalvo Cirrincione

In deep multi-task learning, weights of task-specific networks are shared between tasks to improve performance on each single one. Since the question, which weights to share between layers, is difficult to answer, human-designed…

Machine Learning · Computer Science 2020-03-24 Jonas Prellberg, Oliver Kramer

Multilingual models jointly pretrained on multiple languages have achieved remarkable performance on various multilingual downstream tasks. Moreover, models finetuned on a single monolingual downstream task have shown to generalize to…

Computation and Language · Computer Science 2022-03-01 Seanie Lee, Hae Beom Lee, Juho Lee, Sung Ju Hwang

Training deep neural networks on large datasets containing high-dimensional data requires a large amount of computation. A solution to this problem is data-parallel distributed training, where a model is replicated into several…

Machine Learning · Computer Science 2021-03-18 Lusine Abrahamyan, Yiming Chen, Giannis Bekoulis, Nikos Deligiannis