Related papers: Rethinking Parameter Sharing as Graph Coloring for…

Rethinking Hard-Parameter Sharing in Multi-Domain Learning

Hard parameter sharing in multi-domain learning (MDL) allows domains to share some of the model parameters to reduce storage cost while improving prediction accuracy. One common sharing practice is to share the bottom layers of a deep…

Machine Learning · Computer Science 2022-03-22 Lijun Zhang, Qizheng Yang, Xiao Liu, Hui Guan

Understanding Parameter Sharing in Transformers

Parameter sharing has proven to be a parameter-efficient approach. Previous work on Transformers has focused on sharing parameters in different layers, which can improve the performance of models with limited parameters by increasing model…

Machine Learning · Computer Science 2023-06-19 Ye Lin, Mingxuan Wang, Zhexi Zhang, Xiaohui Wang, Tong Xiao, Jingbo Zhu

Learning Implicitly Recurrent CNNs Through Parameter Sharing

We introduce a parameter sharing scheme, in which different layers of a convolutional neural network (CNN) are defined by a learned linear combination of parameter tensors from a global bank of templates. Restricting the number of templates…

Machine Learning · Computer Science 2019-03-15 Pedro Savarese, Michael Maire

DynaShare: Task and Instance Conditioned Parameter Sharing for Multi-Task Learning

Multi-task networks rely on effective parameter sharing to achieve robust generalization across tasks. In this paper, we present a novel parameter sharing method for multi-task learning that conditions parameter sharing on both the task and…

Computer Vision and Pattern Recognition · Computer Science 2023-05-30 Elahe Rahimian, Golara Javadi, Frederick Tung, Gabriel Oliveira

Learning Sparse Sharing Architectures for Multiple Tasks

Most existing deep multi-task learning models are based on parameter sharing, such as hard sharing, hierarchical sharing, and soft sharing. Choosing a suitable sharing mechanism depends on the relations among the tasks, which is not…

Computation and Language · Computer Science 2019-11-19 Tianxiang Sun, Yunfan Shao, Xiaonan Li, Pengfei Liu, Hang Yan, Xipeng Qiu, Xuanjing Huang

On the Descriptive Complexity of Color Coding

Color coding is an algorithmic technique used in parameterized complexity theory to detect "small" structures inside graphs. The idea is to derandomize algorithms that first randomly color a graph and then search for an easily-detectable,…

Computational Complexity · Computer Science 2019-01-14 Max Bannach, Till Tantau

Deeply Shared Filter Bases for Parameter-Efficient Convolutional Neural Networks

Modern convolutional neural networks (CNNs) have massive identical convolution blocks, and, hence, recursive sharing of parameters across these blocks has been proposed to reduce the amount of parameters. However, naive sharing of…

Computer Vision and Pattern Recognition · Computer Science 2021-11-23 Woochul Kang, Daeyeon Kim

Drastically Reducing the Number of Trainable Parameters in Deep CNNs by Inter-layer Kernel-sharing

Deep convolutional neural networks (DCNNs) have become the state-of-the-art (SOTA) approach for many computer vision tasks: image classification, object detection, semantic segmentation, etc. However, most SOTA networks are too large for…

Computer Vision and Pattern Recognition · Computer Science 2022-10-26 Alireza Azadbakht, Saeed Reza Kheradpisheh, Ismail Khalfaoui-Hassani, Timothée Masquelier

Symbolic Regression for Shared Expressions: Introducing Partial Parameter Sharing

Symbolic regression (SR) aims to find symbolic expressions that describe datasets. Due to its inherent interpretability, it is a powerful paradigm for scientific discovery. Recent advances have expanded SR to describe related phenomena using a…

Machine Learning · Computer Science 2026-03-31 Viktor Martinek, Roland Herzog

Lessons on Parameter Sharing across Layers in Transformers

We propose a parameter sharing method for Transformers (Vaswani et al., 2017). The proposed approach relaxes a widely used technique, which shares parameters for one layer with all layers such as Universal Transformers (Dehghani et al.,…

Computation and Language · Computer Science 2023-06-05 Sho Takase, Shun Kiyono

Fractional Graph Coloring for Functional Compression with Side Information

We describe a rational approach to reduce the computational and communication complexities of lossless point-to-point compression for computation with side information. The traditional method relies on building a characteristic graph with…

Information Theory · Computer Science 2022-06-07 Derya Malak

Compression versus Accuracy: A Hierarchy of Lifted Models

Probabilistic graphical models that encode indistinguishable objects and relations among them use first-order logic constructs to compress a propositional factorised model for more efficient (lifted) inference. To obtain a lifted…

Artificial Intelligence · Computer Science 2025-09-01 Jan Speller, Malte Luttermann, Marcel Gehrke, Tanya Braun

Subformer: Exploring Weight Sharing for Parameter Efficiency in Generative Transformers

Transformers have shown improved performance when compared to previous architectures for sequence processing such as RNNs. Despite their sizeable performance gains, as recently suggested, the model is computationally expensive to train and…

Computation and Language · Computer Science 2021-09-09 Machel Reid, Edison Marrese-Taylor, Yutaka Matsuo

On Information Geometry and Iterative Optimization in Model Compression: Operator Factorization

The ever-increasing parameter counts of deep learning models necessitate effective compression techniques for deployment on resource-constrained devices. This paper explores the application of information geometry, the study of…

Machine Learning · Computer Science 2025-07-15 Zakhar Shumaylov, Vasileios Tsiaras, Yannis Stylianou

Sharing Parameter by Conjugation for Knowledge Graph Embeddings in Complex Space

A Knowledge Graph (KG) is the directed graphical representation of entities and relations in the real world. KG can be applied in diverse Natural Language Processing (NLP) tasks where knowledge is required. The need to scale up and complete…

Computation and Language · Computer Science 2024-04-19 Xincan Feng, Zhi Qu, Yuchang Cheng, Taro Watanabe, Nobuhiro Yugami

Learning Fine-grained Parameter Sharing via Sparse Tensor Decomposition

Large neural networks achieve state-of-the-art performance on many tasks, yet their sheer size hinders deployment on resource-constrained devices. Among existing compression approaches, cross-layer parameter sharing remains relatively…

Machine Learning · Computer Science 2026-05-26 Cem Üyük, Mike Lasby, Mohamed Yassin, Utku Evci, Yani Ioannou

Chromatic Learning for Sparse Datasets

Learning over sparse, high-dimensional data frequently necessitates the use of specialized methods such as the hashing trick. In this work, we design a highly scalable alternative approach that leverages the low degree of feature…

Machine Learning · Statistics 2020-06-09 Vladimir Feinberg, Peter Bailis

Color: A Framework for Applying Graph Coloring to Subgraph Cardinality Estimation

Graph workloads pose a particularly challenging problem for query optimizers. They typically feature large queries made up of entirely many-to-many joins with complex correlations. This puts significant stress on traditional cardinality…

Databases · Computer Science 2025-05-01 Kyle Deeds, Diandre Sabale, Moe Kayali, Dan Suciu

Geometric compression for progressive transmission

The compression of geometric structures is a relatively new field of data compression. Since about 1995, several articles have dealt with the coding of meshes, using for most of them the following approach: the vertices of the mesh are…

Computational Geometry · Computer Science 2007-05-23 Olivier Devillers, Pierre-Marie Gandoin

Parameter-wise co-clustering for high-dimensional data

In recent years, data dimensionality has increasingly become a concern, leading to many parameter and dimension reduction techniques being proposed in the literature. A parameter-wise co-clustering model, for data modelled via continuous…

Machine Learning · Statistics 2020-10-01 M. P. B. Gallaugher, C. Biernacki, P. D. McNicholas