Related papers: Structured Transforms for Small-Footprint Deep Lea…

Learning Compressed Transforms with Low Displacement Rank

The low displacement rank (LDR) framework for structured matrices represents a matrix through two displacement operators and a low-rank residual. Existing use of LDR matrices in deep learning has applied fixed displacement operators…

Machine Learning · Computer Science 2019-01-03 Anna T. Thomas , Albert Gu , Tri Dao , Atri Rudra , Christopher Ré

Low-Rank Constraints for Fast Inference in Structured Models

Structured distributions, i.e. distributions over combinatorial spaces, are commonly used to learn latent probabilistic representations from observed data. However, scaling these models is bottlenecked by the high computational and memory…

Computation and Language · Computer Science 2022-01-11 Justin T. Chiu , Yuntian Deng , Alexander M. Rush

Group and Shuffle: Efficient Structured Orthogonal Parametrization

The increasing size of neural networks has led to a growing demand for methods of efficient fine-tuning. Recently, an orthogonal fine-tuning paradigm was introduced that uses orthogonal matrices for adapting the weights of a pretrained…

Machine Learning · Computer Science 2024-06-17 Mikhail Gorbunov , Nikolay Yudin , Vera Soboleva , Aibek Alanov , Alexey Naumov , Maxim Rakhuba

Learning Robust and Lightweight Model through Separable Structured Transformations

With the proliferation of mobile devices and the Internet of Things, deep learning models are increasingly deployed on devices with limited computing resources and memory, and are exposed to the threat of adversarial noise. Learning deep…

Computer Vision and Pattern Recognition · Computer Science 2021-12-30 Xian Wei , Yanhui Huang , Yangyu Xu , Mingsong Chen , Hai Lan , Yuanxiang Li , Zhongfeng Wang , Xuan Tang

Investigating Low-Rank Training in Transformer Language Models: Efficiency and Scaling Analysis

State-of-the-art LLMs often rely on scale with high computational costs, which has sparked a research agenda to reduce parameter counts and costs without significantly impacting performance. Our study focuses on Transformer-based LLMs,…

Computation and Language · Computer Science 2024-07-25 Xiuying Wei , Skander Moalla , Razvan Pascanu , Caglar Gulcehre

Building on Efficient Foundations: Effectively Training LLMs with Structured Feedforward Layers

State-of-the-art results in large language models (LLMs) often rely on scale, which becomes computationally expensive. This has sparked a research agenda to reduce these models' parameter counts and computational costs without significantly…

Computation and Language · Computer Science 2024-11-07 Xiuying Wei , Skander Moalla , Razvan Pascanu , Caglar Gulcehre

Structured Unrestricted-Rank Matrices for Parameter Efficient Fine-tuning

Recent efforts to scale Transformer models have demonstrated rapid progress across a wide range of tasks (Wei et al., 2022). However, fine-tuning these models for downstream tasks is expensive due to their large parameter counts.…

Machine Learning · Computer Science 2024-12-19 Arijit Sehanobish , Avinava Dubey , Krzysztof Choromanski , Somnath Basu Roy Chowdhury , Deepali Jain , Vikas Sindhwani , Snigdha Chaturvedi

Graph-Structured Deep Learning Framework for Multi-task Contention Identification with High-dimensional Metrics

This study addresses the challenge of accurately identifying multi-task contention types in high-dimensional system environments and proposes a unified contention classification framework that integrates representation transformation,…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-01-29 Xiao Yang , Yinan Ni , Yuqi Tang , Zhimin Qiu , Chen Wang , Tingzhou Yuan

Structured Transformations for Stable and Interpretable Neural Computation

Despite their impressive performance, contemporary neural networks often lack structural safeguards that promote stable learning and interpretable behavior. In this work, we introduce a reformulation of layer-level transformations that…

Machine Learning · Computer Science 2025-08-04 Saleh Nikooroo , Thomas Engel

Transformer-Based Behavioral Representation Learning Enables Transfer Learning for Mobile Sensing in Small Datasets

While deep learning has revolutionized research and applications in NLP and computer vision, this has not yet been the case for behavioral modeling and behavioral health applications. This is because the domain's datasets are smaller, have…

Machine Learning · Computer Science 2021-07-14 Mike A. Merrill , Tim Althoff

DeFINE: DEep Factorized INput Token Embeddings for Neural Sequence Modeling

For sequence models with large vocabularies, a majority of network parameters lie in the input and output layers. In this work, we describe a new method, DeFINE, for learning deep token representations efficiently. Our architecture uses a…

Computation and Language · Computer Science 2020-02-07 Sachin Mehta , Rik Koncel-Kedziorski , Mohammad Rastegari , Hannaneh Hajishirzi

Geometry-aware Deep Transform

Many recent efforts have been devoted to designing sophisticated deep learning structures, obtaining revolutionary results on benchmark datasets. The success of these deep learning methods mostly relies on an enormous volume of labeled…

Computer Vision and Pattern Recognition · Computer Science 2015-10-20 Jiaji Huang , Qiang Qiu , Robert Calderbank , Guillermo Sapiro

Towards Structured Deep Neural Network for Automatic Speech Recognition

In this paper we propose the Structured Deep Neural Network (structured DNN) as a structured and deep learning framework. This approach can learn to find the best structured object (such as a label sequence) given a structured input (such…

Computation and Language · Computer Science 2015-11-10 Yi-Hsiu Liao , Hung-yi Lee , Lin-shan Lee

Learning Deep Structured Models

Many problems in real-world applications involve predicting several random variables which are statistically related. Markov random fields (MRFs) are a great mathematical tool to encode such relationships. The goal of this paper is to…

Machine Learning · Computer Science 2015-04-29 Liang-Chieh Chen , Alexander G. Schwing , Alan L. Yuille , Raquel Urtasun

Deep Learning: A Tutorial

Our goal is to provide a review of deep learning methods which provide insight into structured high-dimensional data. Rather than using shallow additive architectures common to most statistical models, deep learning uses layers of…

Machine Learning · Statistics 2023-10-11 Nick Polson , Vadim Sokolov

Structure-Regularized Attention for Deformable Object Representation

Capturing contextual dependencies has proven useful to improve the representational power of deep neural networks. Recent approaches that focus on modeling global context, such as self-attention and non-local operation, achieve this goal by…

Computer Vision and Pattern Recognition · Computer Science 2021-06-15 Shenao Zhang , Li Shen , Zhifeng Li , Wei Liu

Structured Multi-Hashing for Model Compression

Despite the success of deep neural networks (DNNs), state-of-the-art models are too large to deploy on low-resource devices or common server configurations in which multiple models are held in memory. Model compression methods address this…

Machine Learning · Computer Science 2019-11-27 Elad Eban , Yair Movshovitz-Attias , Hao Wu , Mark Sandler , Andrew Poon , Yerlan Idelbayev , Miguel A. Carreira-Perpinan

Prompt-Matched Semantic Segmentation

The objective of this work is to explore how to effectively and efficiently adapt pre-trained visual foundation models to various downstream tasks of semantic segmentation. Previous methods usually fine-tuned the entire networks for each…

Computer Vision and Pattern Recognition · Computer Science 2022-11-22 Lingbo Liu , Jianlong Chang , Bruce X. B. Yu , Liang Lin , Qi Tian , Chang-Wen Chen

Small-footprint slimmable networks for keyword spotting

In this work, we present Slimmable Neural Networks applied to the problem of small-footprint keyword spotting. We show that slimmable neural networks allow us to create super-nets from Convolutional Neural Networks and Transformers, from…

Sound · Computer Science 2023-04-25 Zuhaib Akhtar , Mohammad Omar Khursheed , Dongsu Du , Yuzong Liu

Structured Semantic Model supported Deep Neural Network for Click-Through Rate Prediction

With the rapid development of online advertising and recommendation systems, click-through rate prediction is expected to play an increasingly important role. Recently many DNN-based models which follow a similar Embedding&MLP paradigm have…

Machine Learning · Statistics 2019-05-01 Chenglei Niu , Guojing Zhong , Ying Liu , Yandong Zhang , Yongsheng Sun , Ailong He , Zhaoji Chen