Related papers: Laconic Deep Learning Computing

Accelerating Deep Convolutional Networks using low-precision and sparsity

We explore techniques to significantly improve the compute efficiency and performance of Deep Convolution Networks without impacting their accuracy. To improve the compute efficiency, we focus on achieving high accuracy with extremely…

Machine Learning · Computer Science 2016-10-04 Ganesh Venkatesh , Eriko Nurvitadhi , Debbie Marr

Latency-aware Spatial-wise Dynamic Networks

Spatial-wise dynamic convolution has become a promising approach to improving the inference efficiency of deep networks. By allocating more computation to the most informative pixels, such an adaptive inference paradigm reduces the spatial…

Computer Vision and Pattern Recognition · Computer Science 2022-10-13 Yizeng Han , Zhihang Yuan , Yifan Pu , Chenhao Xue , Shiji Song , Guangyu Sun , Gao Huang

SONIC: A Sparse Neural Network Inference Accelerator with Silicon Photonics for Energy-Efficient Deep Learning

Sparse neural networks can greatly facilitate the deployment of neural networks on resource-constrained platforms as they offer compact model sizes while retaining inference accuracy. Because of the sparsity in parameter matrices, sparse…

Machine Learning · Computer Science 2021-09-10 Febin Sunny , Mahdi Nikdast , Sudeep Pasricha

DecoHD: Decomposed Hyperdimensional Classification under Extreme Memory Budgets

Decomposition is a proven way to shrink deep networks without changing input-output dimensionality or interface semantics. We bring this idea to hyperdimensional computing (HDC), where footprint cuts usually shrink the feature axis and…

Machine Learning · Computer Science 2026-02-04 Sanggeon Yun , Hyunwoo Oh , Ryozo Masukawa , Mohsen Imani

Laconic: Streamlined Load Balancers for SmartNICs

Load balancers are pervasively used inside today's clouds to scalably distribute network requests across data center servers. Given the extensive use of load balancers and their associated operating costs, several efforts have focused on…

Networking and Internet Architecture · Computer Science 2024-03-19 Tianyi Cui , Chenxingyu Zhao , Wei Zhang , Kaiyuan Zhang , Arvind Krishnamurthy

LACONIC: Dense-Level Effectiveness for Scalable Sparse Retrieval via a Two-Phase Training Curriculum

While dense retrieval models have become the standard for state-of-the-art information retrieval, their deployment is often constrained by high memory requirements and reliance on GPU accelerators for vector similarity search. Learned…

Information Retrieval · Computer Science 2026-01-06 Zhichao Xu , Shengyao Zhuang , Crystina Zhang , Xueguang Ma , Yijun Tian , Maitrey Mehta , Jimmy Lin , Vivek Srikumar

Compressed Meta-Optical Encoder for Image Classification

Optical and hybrid convolutional neural networks (CNNs) recently have become of increasing interest to achieve low-latency, low-power image classification and computer vision tasks. However, implementing optical nonlinearity is challenging,…

Computer Vision and Pattern Recognition · Computer Science 2024-06-17 Anna Wirth-Singh , Jinlin Xiang , Minho Choi , Johannes E. Fröch , Luocheng Huang , Shane Colburn , Eli Shlizerman , Arka Majumdar

DARC: Differentiable ARchitecture Compression

In many learning situations, resources at inference time are significantly more constrained than resources at training time. This paper studies a general paradigm, called Differentiable ARchitecture Compression (DARC), that combines model…

Machine Learning · Computer Science 2019-05-21 Shashank Singh , Ashish Khetan , Zohar Karnin

LCNN: Lookup-based Convolutional Neural Network

Porting state of the art deep learning algorithms to resource constrained compute platforms (e.g. VR, AR, wearables) is extremely challenging. We propose a fast, compact, and accurate model for convolutional neural networks that enables…

Computer Vision and Pattern Recognition · Computer Science 2017-06-14 Hessam Bagherinezhad , Mohammad Rastegari , Ali Farhadi

Binary-decomposed DCNN for accelerating computation and compressing model without retraining

Recent trends show recognition accuracy increasing even more profoundly. Inference process of Deep Convolutional Neural Networks (DCNN) has a large number of parameters, requires a large amount of computation, and can be very slow. The…

Computer Vision and Pattern Recognition · Computer Science 2017-09-15 Ryuji Kamiya , Takayoshi Yamashita , Mitsuru Ambai , Ikuro Sato , Yuji Yamauchi , Hironobu Fujiyoshi

Coding for Computation: Efficient Compression of Neural Networks for Reconfigurable Hardware

As state of the art neural networks (NNs) continue to grow in size, their resource-efficient implementation becomes ever more important. In this paper, we introduce a compression scheme that reduces the number of computations required for…

Machine Learning · Computer Science 2025-04-25 Hans Rosenberger , Rodrigo Fischer , Johanna S. Fröhlich , Ali Bereyhi , Ralf R. Müller

LiteCON: An All-Photonic Neuromorphic Accelerator for Energy-efficient Deep Learning (Preprint)

Deep learning is highly pervasive in today's data-intensive era. In particular, convolutional neural networks (CNNs) are being widely adopted in a variety of fields for superior accuracy. However, computing deep CNNs on traditional CPUs and…

Emerging Technologies · Computer Science 2022-06-29 Dharanidhar Dang , Bill Lin , Debashis Sahoo

Convolutional Networks for Fast, Energy-Efficient Neuromorphic Computing

Deep networks are now able to achieve human-level performance on a broad spectrum of recognition tasks. Independently, neuromorphic computing has now demonstrated unprecedented energy-efficiency through a new chip architecture based on…

Neural and Evolutionary Computing · Computer Science 2016-10-13 Steven K. Esser , Paul A. Merolla , John V. Arthur , Andrew S. Cassidy , Rathinakumar Appuswamy , Alexander Andreopoulos , David J. Berg , Jeffrey L. McKinstry , Timothy Melano , Davis R. Barch , Carmelo di Nolfo , Pallab Datta , Arnon Amir , Brian Taba , Myron D. Flickner , Dharmendra S. Modha

D-com: Accelerating Iterative Processing to Enable Low-rank Decomposition of Activations

The computation and memory costs of large language models kept increasing over last decade, which reached over the scale of 1T parameters. To address the challenges from the large scale models, model compression techniques such as low-rank…

Hardware Architecture · Computer Science 2025-10-16 Faraz Tahmasebi , Michael Pellauer , Hyoukjun Kwon

Systolic Array-based Architecture for Low-Bit Integerized Vision Transformers

Transformer-based models are becoming more and more intelligent and are revolutionizing a wide range of human tasks. To support their deployment, AI labs offer inference services that consume hundreds of GWh of energy annually and charge…

Systems and Control · Electrical Eng. & Systems 2025-08-29 Ching-Yi Lin , Sahil Shah

Digital Neuron: A Hardware Inference Accelerator for Convolutional Deep Neural Networks

We propose a Digital Neuron, a hardware inference accelerator for convolutional deep neural networks with integer inputs and integer weights for embedded systems. The main idea to reduce circuit area and power consumption is manipulating…

Signal Processing · Electrical Eng. & Systems 2019-02-08 Hyunbin Park , Dohyun Kim , Shiho Kim

High Utilization Energy-Aware Real-Time Inference Deep Convolutional Neural Network Accelerator

Deep convolution Neural Network (DCNN) has been widely used in computer vision tasks. However, for edge devices even inference has too large computational complexity and data access amount. The inference latency of state-of-the-art models…

Hardware Architecture · Computer Science 2025-09-09 Kuan-Ting Lin , Ching-Te Chiu , Jheng-Yi Chang , Shi-Zong Huang , Yu-Ting Li

Tetris: Re-architecting Convolutional Neural Network Computation for Machine Learning Accelerators

Inference efficiency is the predominant consideration in designing deep learning accelerators. Previous work mainly focuses on skipping zero values to deal with remarkable ineffectual computation, while zero bits in non-zero values, as…

Machine Learning · Computer Science 2018-11-19 Hang Lu , Xin Wei , Ning Lin , Guihai Yan , Xiaowei Li

DAC: Data-free Automatic Acceleration of Convolutional Networks

Deploying a deep learning model on mobile/IoT devices is a challenging task. The difficulty lies in the trade-off between computation speed and accuracy. A complex deep learning model with high accuracy runs slowly on resource-limited…

Computer Vision and Pattern Recognition · Computer Science 2018-12-31 Xin Li , Shuai Zhang , Bolan Jiang , Yingyong Qi , Mooi Choo Chuah , Ning Bi

Large-Scale Optical Neural Networks based on Photoelectric Multiplication

Recent success in deep neural networks has generated strong interest in hardware accelerators to improve speed and energy consumption. This paper presents a new type of photonic accelerator based on coherent detection that is scalable to…

Emerging Technologies · Computer Science 2019-05-21 Ryan Hamerly , Liane Bernstein , Alexander Sludds , Marin Soljačić , Dirk Englund