English
Related papers

Related papers: Differentiable Product Quantization for End-to-End…

200 papers

Product Quantization (PQ) has long been a mainstream for generating an exponentially large codebook at very low memory/time cost. Despite its success, PQ is still tricky for the decomposition of high-dimensional vector space, and the…

Computer Vision and Pattern Recognition · Computer Science 2020-12-08 Lianli Gao , Xiaosu Zhu , Jingkuan Song , Zhou Zhao , Heng Tao Shen

Product Quantization, a dictionary based hashing method, is one of the leading unsupervised hashing techniques. While it ignores the labels, it harnesses the features to construct look up tables that can approximate the feature space. In…

Computer Vision and Pattern Recognition · Computer Science 2020-01-22 Benjamin Klein , Lior Wolf

We propose DiffQ a differentiable method for model compression for quantizing model parameters without gradient approximations (e.g., Straight Through Estimator). We suggest adding independent pseudo quantization noise to model parameters…

Machine Learning · Statistics 2022-10-18 Alexandre Défossez , Yossi Adi , Gabriel Synnaeve

Continuous representation of words is a standard component in deep learning-based NLP models. However, representing a large vocabulary requires significant memory, which can cause problems, particularly on resource-constrained platforms.…

Computation and Language · Computer Science 2020-01-24 Siyu Liao , Jie Chen , Yanzhi Wang , Qinru Qiu , Bo Yuan

Model quantization is challenging due to many tedious hyper-parameters such as precision (bitwidth), dynamic range (minimum and maximum discrete values) and stepsize (interval between discrete values). Unlike prior arts that carefully tune…

Machine Learning · Computer Science 2021-07-08 Zhang Zhaoyang , Shao Wenqi , Gu Jinwei , Wang Xiaogang , Luo Ping

Thanks to their state-of-the-art performance, deep neural networks are increasingly used for object recognition. To achieve these results, they use millions of parameters to be trained. However, when targeting embedded applications the size…

Machine Learning · Computer Science 2016-03-21 Guillaume Soulié , Vincent Gripon , Maëlys Robert

Large-scale image datasets are fundamental to deep learning, but their high storage demands pose challenges for deployment in resource-constrained environments. While existing approaches reduce dataset size by discarding samples, they often…

Computer Vision and Pattern Recognition · Computer Science 2026-03-03 Chenyue Yu , Lingao Xiao , Jinhong Deng , Ivor W. Tsang , Yang He

Data-driven decision-making processes increasingly utilize end-to-end learnable deep neural networks to render final decisions. Sometimes, the output of the forward functions in certain layers is determined by the solutions to mathematical…

Machine Learning · Computer Science 2024-12-31 Jianming Pan , Zeqi Ye , Xiao Yang , Xu Yang , Weiqing Liu , Lewen Wang , Jiang Bian

We present a new approach to learn compressible representations in deep architectures with an end-to-end training strategy. Our method is based on a soft (continuous) relaxation of quantization and entropy, which we anneal to their discrete…

Machine Learning · Computer Science 2017-06-09 Eirikur Agustsson , Fabian Mentzer , Michael Tschannen , Lukas Cavigelli , Radu Timofte , Luca Benini , Luc Van Gool

Hardware-friendly network quantization (e.g., binary/uniform quantization) can efficiently accelerate the inference and meanwhile reduce memory consumption of the deep neural networks, which is crucial for model deployment on…

Computer Vision and Pattern Recognition · Computer Science 2019-08-15 Ruihao Gong , Xianglong Liu , Shenghu Jiang , Tianxiang Li , Peng Hu , Jiazhen Lin , Fengwei Yu , Junjie Yan

Quantization is a key method for deploying deep neural networks on edge devices with limited memory and computation resources. Recent improvements in Post-Training Quantization (PTQ) methods were achieved by an additional local optimization…

Computer Vision and Pattern Recognition · Computer Science 2024-09-27 Ofir Gordon , Elad Cohen , Hai Victor Habi , Arnon Netzer

In order to deploy deep models in a computationally efficient manner, model quantization approaches have been frequently used. In addition, as new hardware that supports mixed bitwidth arithmetic operations, recent research on mixed…

Machine Learning · Computer Science 2022-07-12 Xijie Huang , Zhiqiang Shen , Shichao Li , Zechun Liu , Xianghong Hu , Jeffry Wicaksana , Eric Xing , Kwang-Ting Cheng

In this paper, we propose a novel end-to-end feature compression scheme by leveraging the representation and learning capability of deep neural networks, towards intelligent front-end equipped analysis with promising accuracy and…

Computer Vision and Pattern Recognition · Computer Science 2020-02-11 Shurun Wang , Wenhan Yang , Shiqi Wang

Quantization has been applied to multiple domains in Deep Neural Networks (DNNs). We propose Depthwise Quantization (DQ) where $\textit{quantization}$ is applied to a decomposed sub-tensor along the $\textit{feature axis}$ of weak…

Computer Vision and Pattern Recognition · Computer Science 2022-03-30 Iordanis Fostiropoulos , Barry Boehm

Contemporary deep learning, characterized by the training of cumbersome neural networks on massive datasets, confronts substantial computational hurdles. To alleviate heavy data storage burdens on limited hardware resources, numerous…

Computer Vision and Pattern Recognition · Computer Science 2024-12-24 Muquan Li , Dongyang Zhang , Qiang Dong , Xiurui Xie , Ke Qin

The recent advances in deep neural networks (DNNs) make them attractive for embedded systems. However, it can take a long time for DNNs to make an inference on resource-constrained computing devices. Model compression techniques can address…

Machine Learning · Computer Science 2018-10-23 Qing Qin , Jie Ren , Jialong Yu , Ling Gao , Hai Wang , Jie Zheng , Yansong Feng , Jianbin Fang , Zheng Wang

Deep neural networks have achieved state-of-the-art results in a wide range of applications, from natural language processing and computer vision to speech recognition. However, as tasks become increasingly complex, model sizes continue to…

Computer Vision and Pattern Recognition · Computer Science 2025-05-21 Tomer Gafni , Asaf Karnieli , Yair Hanani

Deep Learning Architectures employ heavy computations and bulk of the computational energy is taken up by the convolution operations in the Convolutional Neural Networks. The objective of our proposed work is to reduce the energy…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-07-17 Salman Abdul Khaliq , Rehan Hafiz

Despite its improvements in coding performance compared to traditional codecs, Learned Image Compression (LIC) suffers from large computational costs for storage and deployment. Model quantization offers an effective solution to reduce the…

Image and Video Processing · Electrical Eng. & Systems 2025-06-03 Md Adnan Faisal Hossain , Zhihao Duan , Fengqing Zhu

With applications ranging from metabolomics to histopathology, quantitative phase microscopy (QPM) is a powerful label-free imaging modality. Despite significant advances in fast multiplexed imaging sensors and deep-learning-based inverse…

‹ Prev 1 2 3 10 Next ›