Related papers: Regularized Residual Quantization: a multi-layer s…
A learning-based framework for representation of domain-specific images is proposed where joint compression and denoising can be done using a VQ-based multi-layer network. While it learns to compress the images from a training set, the…
The rapid progress of Large Language Models (LLMs) has brought substantial computational and memory demands, spurring the adoption of low-bit quantization. While 8-bit and 4-bit formats have become prevalent, extending quantization to 2…
Product Quantization (PQ) has long been a mainstream for generating an exponentially large codebook at very low memory/time cost. Despite its success, PQ is still tricky for the decomposition of high-dimensional vector space, and the…
Existing neural networks are memory-consuming and computationally intensive, making deploying them challenging in resource-constrained environments. However, there are various methods to improve their efficiency. Two such methods are…
This paper seeks to combine dictionary learning and hierarchical image representation in a principled way. To make dictionary atoms capturing additional information from extended receptive fields and attain improved descriptive capacity, we…
Vector quantization is a fundamental operation for data compression and vector search. To obtain high accuracy, multi-codebook methods represent each vector using codewords across several codebooks. Residual quantization (RQ) is one such…
In this article, we propose a novel regularization method for a class of nonlinear inverse problems that is inspired by an application in quantitative magnetic resonance imaging (qMRI). The latter is a special instance of a general…
In reinforcement learning (RL), Q-learning is a fundamental algorithm whose convergence is guaranteed in the tabular setting. However, this convergence guarantee does not hold under linear function approximation. To overcome this…
Continuous value prediction plays a crucial role in industrial-scale recommendation systems, including tasks such as predicting users' watch-time and estimating the gross merchandise value (GMV) in e-commerce transactions. However, it…
This paper accelerates video perception, such as semantic segmentation and human pose estimation, by levering cross-frame redundancies. Unlike the existing approaches, which avoid redundant computations by warping the past features using…
The success of product quantization (PQ) for fast nearest neighbor search depends on the exponentially reduced complexities of both storage and computation with respect to the codebook size. Recent efforts have been focused on employing…
Quantizing images into discrete representations has been a fundamental problem in unified generative modeling. Predominant approaches learn the discrete representation either in a deterministic manner by selecting the best-matching token or…
Finite Scalar Quantization (FSQ) offers simplified training but suffers from residual magnitude decay in multi-stage settings, where subsequent stages receive exponentially weaker signals. We propose Robust Residual Finite Scalar…
Rotation-based Post-Training Quantization (PTQ) has emerged as a promising solution for mitigating activation outliers in the quantization of Large Language Models (LLMs). Global rotation methods achieve inference efficiency by fusing…
Quantization has been an effective technology in ANN (approximate nearest neighbour) search due to its high accuracy and fast search speed. To meet the requirement of different applications, there is always a trade-off between retrieval…
Recently the sparse representation based classification (SRC) has been proposed for robust face recognition (FR). In SRC, the testing image is coded as a sparse linear combination of the training samples, and the representation fidelity is…
Quantization methods have been introduced to perform large scale approximate nearest search tasks. Residual Vector Quantization (RVQ) is one of the effective quantization methods. RVQ uses a multi-stage codebook learning scheme to lower the…
Vector quantization is an essential tool for tasks involving large scale data, for example, large scale similarity search, which is crucial for content-based information retrieval and analysis. In this paper, we propose a novel vector…
Recent research in the domain of multimodal unified representations predominantly employs codebook as representation forms, utilizing Vector Quantization(VQ) for quantization, yet there has been insufficient exploration of other…
Operating deep neural networks on devices with limited resources requires the reduction of their memory footprints and computational requirements. In this paper we introduce a training method, called look-up table quantization, LUT-Q, which…