English
Related papers

Related papers: Growing Neural Network with Shared Parameter

200 papers

We develop an approach to efficiently grow neural networks, within which parameterization and optimization strategies are designed by considering their effects on the training dynamics. Unlike existing growing methods, which follow simple…

Machine Learning · Computer Science 2023-06-23 Xin Yuan , Pedro Savarese , Michael Maire

State-of-the-art parameter-efficient fine-tuning methods rely on introducing adapter modules between the layers of a pretrained language model. However, such modules are trained separately for each task and thus do not enable sharing…

Computation and Language · Computer Science 2021-06-09 Rabeeh Karimi Mahabadi , Sebastian Ruder , Mostafa Dehghani , James Henderson

This paper explores learned-context neural networks. It is a multi-task learning architecture based on a fully shared neural network and an augmented input vector containing trainable task parameters. The architecture is interesting due to…

Machine Learning · Computer Science 2025-08-07 Anders T. Sandnes , Bjarne Grimstad , Odd Kolbjørnsen

Parameter sharing, as an important technique in multi-agent systems, can effectively solve the scalability issue in large-scale agent problems. However, the effectiveness of parameter sharing largely depends on the environment setting. When…

Artificial Intelligence · Computer Science 2025-03-04 Dapeng Li , Na Lou , Bin Zhang , Zhiwei Xu , Guoliang Fan

Most deep neural networks are trained under fixed network architectures and require retraining when the architecture changes. If expanding the network's size is needed, it is necessary to retrain from scratch, which is expensive. To avoid…

Machine Learning · Computer Science 2023-11-09 Chau Pham , Piotr Teterwak , Soren Nelson , Bryan A. Plummer

The width of a neural network matters since increasing the width will necessarily increase the model capacity. However, the performance of a network does not improve linearly with the width and soon gets saturated. In this case, we argue…

Computer Vision and Pattern Recognition · Computer Science 2022-09-07 Shuai Zhao , Liguang Zhou , Wenxiao Wang , Deng Cai , Tin Lun Lam , Yangsheng Xu

This work demonstrates how increasing the number of neurons in a network without increasing its number of non-zero parameters improves performance. We show that this gain corresponds with a decrease in interference between multiple features…

Machine Learning · Computer Science 2025-10-07 Linghao Kong , Inimai Subramanian , Yonadav Shavit , Micah Adler , Dan Alistarh , Nir Shavit

Neural network based methods have obtained great progress on a variety of natural language processing tasks. However, in most previous works, the models are learned based on single-task supervised objectives, which often suffer from…

Computation and Language · Computer Science 2016-05-18 Pengfei Liu , Xipeng Qiu , Xuanjing Huang

In this paper, we introduce a novel concept for learning of the parameters in a neural network. Our idea is grounded on modeling a learning problem that addresses a trade-off between (i) satisfying local objectives at each node and (ii)…

Machine Learning · Computer Science 2019-02-04 Dimche Kostadinov , Behrooz Razdehi , Slava Voloshynovskiy

We introduce a parameter sharing scheme, in which different layers of a convolutional neural network (CNN) are defined by a learned linear combination of parameter tensors from a global bank of templates. Restricting the number of templates…

Machine Learning · Computer Science 2019-03-15 Pedro Savarese , Michael Maire

Training neural networks requires increasing amounts of memory. Parameter sharing can reduce memory and communication costs, but existing methods assume networks have many identical layers and utilize hand-crafted sharing strategies that…

Machine Learning · Computer Science 2022-03-17 Bryan A. Plummer , Nikoli Dryden , Julius Frost , Torsten Hoefler , Kate Saenko

We consider the problem of training a machine learning model over a network of nodes in a fully decentralized framework. The nodes take a Bayesian-like approach via the introduction of a belief over the model parameter space. We propose a…

Machine Learning · Computer Science 2019-02-01 Anusha Lalitha , Osman Cihan Kilinc , Tara Javidi , Farinaz Koushanfar

Parameter sharing has proven to be a parameter-efficient approach. Previous work on Transformers has focused on sharing parameters in different layers, which can improve the performance of models with limited parameters by increasing model…

Machine Learning · Computer Science 2023-06-19 Ye Lin , Mingxuan Wang , Zhexi Zhang , Xiaohui Wang , Tong Xiao , Jingbo Zhu

Task-incremental learning involves the challenging problem of learning new tasks continually, without forgetting past knowledge. Many approaches address the problem by expanding the structure of a shared neural network as tasks arrive, but…

Machine Learning · Computer Science 2020-11-24 Azhar Shaikh , Nishant Sinha

When fine-tuning Deep Neural Networks (DNNs) to new data, DNNs are prone to overwriting network parameters required for task-specific functionality on previously learned tasks, resulting in a loss of performance on those tasks. We propose…

Machine Learning · Computer Science 2025-01-22 Christopher Angelini , Nidhal Bouaynaya

In most applications of utilizing neural networks for mathematical optimization, a dedicated model is trained for each specific optimization objective. However, in many scenarios, several distinct yet correlated objectives or tasks often…

Machine Learning · Computer Science 2024-04-15 Wei Cui , Wei Yu

Most existing deep multi-task learning models are based on parameter sharing, such as hard sharing, hierarchical sharing, and soft sharing. How choosing a suitable sharing mechanism depends on the relations among the tasks, which is not…

Computation and Language · Computer Science 2019-11-19 Tianxiang Sun , Yunfan Shao , Xiaonan Li , Pengfei Liu , Hang Yan , Xipeng Qiu , Xuanjing Huang

In this paper we present an alternative strategy for fine-tuning the parameters of a network. We named the technique Gradual Tuning. Once trained on a first task, the network is fine-tuned on a second task by modifying a progressively…

Artificial Intelligence · Computer Science 2017-11-29 Guglielmo Montone , J. Kevin O'Regan , Alexander V. Terekhov

This paper proposes a novel learning method for multi-task applications. Multi-task neural networks can learn to transfer knowledge across different tasks by using parameter sharing. However, sharing parameters between unrelated tasks can…

Machine Learning · Computer Science 2020-07-21 Krzysztof Maziarz , Efi Kokiopoulou , Andrea Gesmundo , Luciano Sbaiz , Gabor Bartok , Jesse Berent

In multilingual neural machine translation, it has been shown that sharing a single translation model between multiple languages can achieve competitive performance, sometimes even leading to performance gains over bilingually trained…

Computation and Language · Computer Science 2018-09-14 Devendra Singh Sachan , Graham Neubig
‹ Prev 1 2 3 10 Next ›