Related papers: Structured Transforms for Small-Footprint Deep Lea…

200 papers

The low displacement rank (LDR) framework for structured matrices represents a matrix through two displacement operators and a low-rank residual. Existing use of LDR matrices in deep learning has applied fixed displacement operators…

Machine Learning · Computer Science 2019-01-03 Anna T. Thomas, Albert Gu, Tri Dao, Atri Rudra, Christopher Ré

Structured distributions, i.e. distributions over combinatorial spaces, are commonly used to learn latent probabilistic representations from observed data. However, scaling these models is bottlenecked by the high computational and memory…

Computation and Language · Computer Science 2022-01-11 Justin T. Chiu, Yuntian Deng, Alexander M. Rush

The increasing size of neural networks has led to a growing demand for methods of efficient fine-tuning. Recently, an orthogonal fine-tuning paradigm was introduced that uses orthogonal matrices for adapting the weights of a pretrained…

Machine Learning · Computer Science 2024-06-17 Mikhail Gorbunov, Nikolay Yudin, Vera Soboleva, Aibek Alanov, Alexey Naumov, Maxim Rakhuba

With the proliferation of mobile devices and the Internet of Things, deep learning models are increasingly deployed on devices with limited computing resources and memory, and are exposed to the threat of adversarial noise. Learning deep…

Computer Vision and Pattern Recognition · Computer Science 2021-12-30 Xian Wei, Yanhui Huang, Yangyu Xu, Mingsong Chen, Hai Lan, Yuanxiang Li, Zhongfeng Wang, Xuan Tang

State-of-the-art LLMs often rely on scale with high computational costs, which has sparked a research agenda to reduce parameter counts and costs without significantly impacting performance. Our study focuses on Transformer-based LLMs,…

Computation and Language · Computer Science 2024-07-25 Xiuying Wei, Skander Moalla, Razvan Pascanu, Caglar Gulcehre

State-of-the-art results in large language models (LLMs) often rely on scale, which becomes computationally expensive. This has sparked a research agenda to reduce these models' parameter counts and computational costs without significantly…

Computation and Language · Computer Science 2024-11-07 Xiuying Wei, Skander Moalla, Razvan Pascanu, Caglar Gulcehre

Recent efforts to scale Transformer models have demonstrated rapid progress across a wide range of tasks (Wei et al., 2022). However, fine-tuning these models for downstream tasks is expensive due to their large parameter counts.…

This study addresses the challenge of accurately identifying multi-task contention types in high-dimensional system environments and proposes a unified contention classification framework that integrates representation transformation,…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-01-29 Xiao Yang, Yinan Ni, Yuqi Tang, Zhimin Qiu, Chen Wang, Tingzhou Yuan

Despite their impressive performance, contemporary neural networks often lack structural safeguards that promote stable learning and interpretable behavior. In this work, we introduce a reformulation of layer-level transformations that…

Machine Learning · Computer Science 2025-08-04 Saleh Nikooroo, Thomas Engel

While deep learning has revolutionized research and applications in NLP and computer vision, this has not yet been the case for behavioral modeling and behavioral health applications. This is because the domain's datasets are smaller, have…

Machine Learning · Computer Science 2021-07-14 Mike A. Merrill, Tim Althoff

For sequence models with large vocabularies, a majority of network parameters lie in the input and output layers. In this work, we describe a new method, DeFINE, for learning deep token representations efficiently. Our architecture uses a…

Computation and Language · Computer Science 2020-02-07 Sachin Mehta, Rik Koncel-Kedziorski, Mohammad Rastegari, Hannaneh Hajishirzi

Many recent efforts have been devoted to designing sophisticated deep learning structures, obtaining revolutionary results on benchmark datasets. The success of these deep learning methods mostly relies on an enormous volume of labeled…

Computer Vision and Pattern Recognition · Computer Science 2015-10-20 Jiaji Huang, Qiang Qiu, Robert Calderbank, Guillermo Sapiro

In this paper we propose the Structured Deep Neural Network (structured DNN) as a structured and deep learning framework. This approach can learn to find the best structured object (such as a label sequence) given a structured input (such…

Computation and Language · Computer Science 2015-11-10 Yi-Hsiu Liao, Hung-yi Lee, Lin-shan Lee

Many problems in real-world applications involve predicting several random variables which are statistically related. Markov random fields (MRFs) are a great mathematical tool to encode such relationships. The goal of this paper is to…

Machine Learning · Computer Science 2015-04-29 Liang-Chieh Chen, Alexander G. Schwing, Alan L. Yuille, Raquel Urtasun

Our goal is to provide a review of deep learning methods which provide insight into structured high-dimensional data. Rather than using shallow additive architectures common to most statistical models, deep learning uses layers of…

Machine Learning · Statistics 2023-10-11 Nick Polson, Vadim Sokolov

Capturing contextual dependencies has proven useful to improve the representational power of deep neural networks. Recent approaches that focus on modeling global context, such as self-attention and non-local operation, achieve this goal by…

Computer Vision and Pattern Recognition · Computer Science 2021-06-15 Shenao Zhang, Li Shen, Zhifeng Li, Wei Liu

Despite the success of deep neural networks (DNNs), state-of-the-art models are too large to deploy on low-resource devices or common server configurations in which multiple models are held in memory. Model compression methods address this…

The objective of this work is to explore how to effectively and efficiently adapt pre-trained visual foundation models to various downstream tasks of semantic segmentation. Previous methods usually fine-tuned the entire networks for each…

Computer Vision and Pattern Recognition · Computer Science 2022-11-22 Lingbo Liu, Jianlong Chang, Bruce X. B. Yu, Liang Lin, Qi Tian, Chang-Wen Chen

In this work, we present Slimmable Neural Networks applied to the problem of small-footprint keyword spotting. We show that slimmable neural networks allow us to create super-nets from Convolutional Neural Networks and Transformers, from…

Sound · Computer Science 2023-04-25 Zuhaib Akhtar, Mohammad Omar Khursheed, Dongsu Du, Yuzong Liu

With the rapid development of online advertising and recommendation systems, click-through rate prediction is expected to play an increasingly important role. Recently, many DNN-based models which follow a similar Embedding&MLP paradigm have…

Machine Learning · Statistics 2019-05-01 Chenglei Niu, Guojing Zhong, Ying Liu, Yandong Zhang, Yongsheng Sun, Ailong He, Zhaoji Chen