English
Related papers

Related papers: Modular Duality in Deep Learning

200 papers

Scaling model capacity has been vital in the success of deep learning. For a typical network, necessary compute resources and training time grow dramatically with model size. Conditional computation is a promising way to increase the number…

Machine Learning · Computer Science 2018-11-14 Louis Kirsch , Julius Kunze , David Barber

Traditional deep network training methods optimize a monolithic objective function jointly for all the components. This can lead to various inefficiencies in terms of potential parallelization. Local learning is an approach to…

Machine Learning · Computer Science 2023-01-19 Adeetya Patel , Michael Eickenberg , Eugene Belilovsky

Training deep neural networks is a challenging non-convex optimization problem. Recent work has proven that the strong duality holds (which means zero duality gap) for regularized finite-width two-layer ReLU networks and consequently…

Machine Learning · Computer Science 2023-03-08 Yifei Wang , Tolga Ergen , Mert Pilanci

Modularity is a general principle present in many fields. It offers attractive advantages, including, among others, ease of conceptualization, interpretability, scalability, module combinability, and module reusability. The deep learning…

Machine Learning · Computer Science 2023-10-03 Haozhe Sun , Isabelle Guyon

Modular neural networks outperform nonmodular neural networks on tasks ranging from visual question answering to robotics. These performance improvements are thought to be due to modular networks' superior ability to model the compositional…

Machine Learning · Computer Science 2025-03-12 Akhilan Boopathy , Sunshine Jiang , William Yue , Jaedong Hwang , Abhiram Iyer , Ila Fiete

The learned weights of a neural network are often considered devoid of scrutable internal structure. To discern structure in these weights, we introduce a measurable notion of modularity for multi-layer perceptrons (MLPs), and investigate…

Neural and Evolutionary Computing · Computer Science 2022-02-09 Daniel Filan , Shlomi Hod , Cody Wild , Andrew Critch , Stuart Russell

The success of deep neural networks is in part due to the use of normalization layers. Normalization layers like Batch Normalization, Layer Normalization and Weight Normalization are ubiquitous in practice, as they improve generalization…

Machine Learning · Computer Science 2020-06-15 Yonatan Dukler , Quanquan Gu , Guido Montúfar

Training a Neural Network (NN) with lots of parameters or intricate architectures creates undesired phenomena that complicate the optimization process. To address this issue we propose a first modular approach to NN design, wherein the NN…

Machine Learning · Computer Science 2019-02-26 David Castillo-Bolado , Cayetano Guerra-Artal , Mario Hernandez-Tejera

Transfer learning has recently become the dominant paradigm of machine learning. Pre-trained models fine-tuned for downstream tasks achieve better performance with fewer labelled examples. Nonetheless, it remains unclear how to develop…

Machine Learning · Computer Science 2024-01-30 Jonas Pfeiffer , Sebastian Ruder , Ivan Vulić , Edoardo Maria Ponti

For most deep learning algorithms training is notoriously time consuming. Since most of the computation in training neural networks is typically spent on floating point multiplications, we investigate an approach to training that eliminates…

Machine Learning · Computer Science 2016-02-29 Zhouhan Lin , Matthieu Courbariaux , Roland Memisevic , Yoshua Bengio

Deep learning has been widely used for supervised learning and classification/regression problems. Recently, a novel area of research has applied this paradigm to unsupervised tasks; indeed, a gradient-based approach extracts, efficiently…

Machine Learning · Statistics 2020-09-08 Giansalvo Cirrincione , Pietro Barbiero , Gabriele Ciravegna , Vincenzo Randazzo

We investigate deep morphological neural networks (DMNNs). We demonstrate that despite their inherent non-linearity, "linear" activations are essential for DMNNs. To preserve their inherent sparsity, we propose architectures that constraint…

Machine Learning · Computer Science 2025-12-24 Konstantinos Fotopoulos , Petros Maragos

Linear layers in neural networks (NNs) trained by gradient descent can be expressed as a key-value memory system which stores all training datapoints and the initial weights, and produces outputs using unnormalised dot attention over the…

Machine Learning · Computer Science 2022-06-20 Kazuki Irie , Róbert Csordás , Jürgen Schmidhuber

Modularity has been widely studied as a mechanism to improve the capabilities of neural networks through various techniques such as hand-crafted modular architectures and automatic approaches. While these methods have sometimes shown…

Neural and Evolutionary Computing · Computer Science 2024-10-28 Humphrey Munn , Marcus Gallagher

To improve performance in contemporary deep learning, one is interested in scaling up the neural network in terms of both the number and the size of the layers. When ramping up the width of a single layer, graceful scaling of training has…

Machine Learning · Computer Science 2024-05-24 Tim Large , Yang Liu , Minyoung Huh , Hyojin Bahng , Phillip Isola , Jeremy Bernstein

Generalization of deep neural networks remains one of the main open problems in machine learning. Previous theoretical works focused on deriving tight bounds of model complexity, while empirical works revealed that neural networks exhibit…

Machine Learning · Computer Science 2022-01-31 James Wang , Cheng-Lin Yang

We propose multirate training of neural networks: partitioning neural network parameters into "fast" and "slow" parts which are trained on different time scales, where slow parts are updated less frequently. By choosing appropriate…

Machine Learning · Computer Science 2022-11-02 Tiffany Vlaar , Benedict Leimkuhler

We present a novel regularization approach to train neural networks that enjoys better generalization and test error than standard stochastic gradient descent. Our approach is based on the principles of cross-validation, where a validation…

Computer Vision and Pattern Recognition · Computer Science 2018-09-06 Simon Jenni , Paolo Favaro

With the accumulation of resources in the era of big data and the rise of pre-trained models in deep learning, optimizing neural networks for various tasks often involves different strategies for fine-tuning pre-trained models versus…

Computer Vision and Pattern Recognition · Computer Science 2026-04-28 Xin Ning , Qiankun Li , Xiaolong Huang , Qiupu Chen , Feng He , Weijun Li , Prayag Tiwari , Xinwang Liu

In low-latency or mobile applications, lower computation complexity, lower memory footprint and better energy efficiency are desired. Many prior works address this need by removing redundant parameters. Parameter quantization replaces…

Machine Learning · Computer Science 2021-11-16 Cheng-Chou Lan
‹ Prev 1 2 3 10 Next ›