Related papers: Laconic Deep Learning Computing

200 papers

We explore techniques to significantly improve the compute efficiency and performance of Deep Convolution Networks without impacting their accuracy. To improve the compute efficiency, we focus on achieving high accuracy with extremely…

Machine Learning · Computer Science 2016-10-04 Ganesh Venkatesh , Eriko Nurvitadhi , Debbie Marr

Spatial-wise dynamic convolution has become a promising approach to improving the inference efficiency of deep networks. By allocating more computation to the most informative pixels, such an adaptive inference paradigm reduces the spatial…

Computer Vision and Pattern Recognition · Computer Science 2022-10-13 Yizeng Han , Zhihang Yuan , Yifan Pu , Chenhao Xue , Shiji Song , Guangyu Sun , Gao Huang

Sparse neural networks can greatly facilitate the deployment of neural networks on resource-constrained platforms as they offer compact model sizes while retaining inference accuracy. Because of the sparsity in parameter matrices, sparse…

Machine Learning · Computer Science 2021-09-10 Febin Sunny , Mahdi Nikdast , Sudeep Pasricha

Decomposition is a proven way to shrink deep networks without changing input-output dimensionality or interface semantics. We bring this idea to hyperdimensional computing (HDC), where footprint cuts usually shrink the feature axis and…

Machine Learning · Computer Science 2026-02-04 Sanggeon Yun , Hyunwoo Oh , Ryozo Masukawa , Mohsen Imani

Load balancers are pervasively used inside today's clouds to scalably distribute network requests across data center servers. Given the extensive use of load balancers and their associated operating costs, several efforts have focused on…

Networking and Internet Architecture · Computer Science 2024-03-19 Tianyi Cui , Chenxingyu Zhao , Wei Zhang , Kaiyuan Zhang , Arvind Krishnamurthy

While dense retrieval models have become the standard for state-of-the-art information retrieval, their deployment is often constrained by high memory requirements and reliance on GPU accelerators for vector similarity search. Learned…

Information Retrieval · Computer Science 2026-01-06 Zhichao Xu , Shengyao Zhuang , Crystina Zhang , Xueguang Ma , Yijun Tian , Maitrey Mehta , Jimmy Lin , Vivek Srikumar

Optical and hybrid convolutional neural networks (CNNs) recently have become of increasing interest to achieve low-latency, low-power image classification and computer vision tasks. However, implementing optical nonlinearity is challenging,…

Computer Vision and Pattern Recognition · Computer Science 2024-06-17 Anna Wirth-Singh , Jinlin Xiang , Minho Choi , Johannes E. Fröch , Luocheng Huang , Shane Colburn , Eli Shlizerman , Arka Majumdar

In many learning situations, resources at inference time are significantly more constrained than resources at training time. This paper studies a general paradigm, called Differentiable ARchitecture Compression (DARC), that combines model…

Machine Learning · Computer Science 2019-05-21 Shashank Singh , Ashish Khetan , Zohar Karnin

Porting state of the art deep learning algorithms to resource constrained compute platforms (e.g. VR, AR, wearables) is extremely challenging. We propose a fast, compact, and accurate model for convolutional neural networks that enables…

Computer Vision and Pattern Recognition · Computer Science 2017-06-14 Hessam Bagherinezhad , Mohammad Rastegari , Ali Farhadi

Recent trends show recognition accuracy increasing even more profoundly. Inference with Deep Convolutional Neural Networks (DCNNs) involves a large number of parameters, requires a large amount of computation, and can be very slow. The…

Computer Vision and Pattern Recognition · Computer Science 2017-09-15 Ryuji Kamiya , Takayoshi Yamashita , Mitsuru Ambai , Ikuro Sato , Yuji Yamauchi , Hironobu Fujiyoshi

As state of the art neural networks (NNs) continue to grow in size, their resource-efficient implementation becomes ever more important. In this paper, we introduce a compression scheme that reduces the number of computations required for…

Machine Learning · Computer Science 2025-04-25 Hans Rosenberger , Rodrigo Fischer , Johanna S. Fröhlich , Ali Bereyhi , Ralf R. Müller

Deep learning is highly pervasive in today's data-intensive era. In particular, convolutional neural networks (CNNs) are being widely adopted in a variety of fields for superior accuracy. However, computing deep CNNs on traditional CPUs and…

Emerging Technologies · Computer Science 2022-06-29 Dharanidhar Dang , Bill Lin , Debashis Sahoo

Deep networks are now able to achieve human-level performance on a broad spectrum of recognition tasks. Independently, neuromorphic computing has now demonstrated unprecedented energy-efficiency through a new chip architecture based on…

The computation and memory costs of large language models have kept increasing over the last decade, with models now exceeding 1T parameters. To address the challenges posed by such large-scale models, model compression techniques such as low-rank…

Hardware Architecture · Computer Science 2025-10-16 Faraz Tahmasebi , Michael Pelluer , Hyoukjun Kwon

Transformer-based models are becoming more and more intelligent and are revolutionizing a wide range of human tasks. To support their deployment, AI labs offer inference services that consume hundreds of GWh of energy annually and charge…

Systems and Control · Electrical Eng. & Systems 2025-08-29 Ching-Yi Lin , Sahil Shah

We propose a Digital Neuron, a hardware inference accelerator for convolutional deep neural networks with integer inputs and integer weights for embedded systems. The main idea to reduce circuit area and power consumption is manipulating…

Signal Processing · Electrical Eng. & Systems 2019-02-08 Hyunbin Park , Dohyun Kim , Shiho Kim

Deep convolutional neural networks (DCNNs) have been widely used in computer vision tasks. However, for edge devices even inference demands too much computation and data access. The inference latency of state-of-the-art models…

Hardware Architecture · Computer Science 2025-09-09 Kuan-Ting Lin , Ching-Te Chiu , Jheng-Yi Chang , Shi-Zong Huang , Yu-Ting Li

Inference efficiency is the predominant consideration in designing deep learning accelerators. Previous work mainly focuses on skipping zero values to deal with remarkable ineffectual computation, while zero bits in non-zero values, as…

Machine Learning · Computer Science 2018-11-19 Hang Lu , Xin Wei , Ning Lin , Guihai Yan , Xiaowei Li

Deploying a deep learning model on mobile/IoT devices is a challenging task. The difficulty lies in the trade-off between computation speed and accuracy. A complex deep learning model with high accuracy runs slowly on resource-limited…

Computer Vision and Pattern Recognition · Computer Science 2018-12-31 Xin Li , Shuai Zhang , Bolan Jiang , Yingyong Qi , Mooi Choo Chuah , Ning Bi

Recent success in deep neural networks has generated strong interest in hardware accelerators to improve speed and energy consumption. This paper presents a new type of photonic accelerator based on coherent detection that is scalable to…

Emerging Technologies · Computer Science 2019-05-21 Ryan Hamerly , Liane Bernstein , Alexander Sludds , Marin Soljačić , Dirk Englund