Related papers: RATQ: A Universal Fixed-Length Quantizer for Stoch…

Rate-Adaptive Quantization: A Multi-Rate Codebook Adaptation for Vector Quantization-based Generative Models

Learning discrete representations with vector quantization (VQ) has emerged as a powerful approach in various generative models. However, most VQ-based models rely on a single, fixed-rate codebook, requiring extensive retraining for new…

Machine Learning · Computer Science 2025-02-03 Jiwan Seo, Joonhyuk Kang

Noise tailoring for Robust Amplitude Estimation

A universal fault-tolerant quantum computer holds the promise to speed up computational problems that are otherwise intractable on classical computers; however, for the next decade or so, our access is restricted to noisy intermediate-scale…

Quantum Physics · Physics 2023-02-22 Archismita Dalal, Amara Katabarwa

Adaptive Training of Random Mapping for Data Quantization

Data quantization learns encoding results of data with certain requirements, and provides a broad perspective of many real-world applications to data handling. Nevertheless, the results of the encoder are usually limited to multivariate inputs…

Machine Learning · Computer Science 2017-05-29 Miao Cheng, Ah Chung Tsoi

QuACK: Accelerating Gradient-Based Quantum Optimization with Koopman Operator Learning

Quantum optimization, a key application of quantum computing, has traditionally been stymied by the linearly increasing complexity of gradient calculations with an increasing number of parameters. This work bridges the gap between Koopman…

Quantum Physics · Physics 2024-05-07 Di Luo, Jiayu Shen, Rumen Dangovski, Marin Soljačić

A scalable sequential adaptive cubic regularization algorithm for optimization with general equality constraints

The scalable adaptive cubic regularization method ($\mathrm{ARC_{q}K}$: Dussault et al. in Math. Program. Ser. A 207(1-2): 191-225, 2024) has been recently proposed for unconstrained optimization. It has excellent convergence properties,…

Optimization and Control · Mathematics 2026-03-17 Yonggang Pei, Yubing Lin, Shuai Shao, Mauricio Silva Louzeiro, Detong Zhu

Generalized Radius and Integrated Codebook Transforms for Differentiable Vector Quantization

Vector quantization (VQ) underpins modern generative and representation models by turning continuous latents into discrete tokens. Yet hard nearest-neighbor assignments are non-differentiable and are typically optimized with heuristic…

Machine Learning · Computer Science 2026-02-03 Haochen You, Heng Zhang, Hongyang He, Yuqi Li, Baojing Liu

ELUTQ: Optimizing Quantization Accuracy under LUT-Based Computation for Edge LLMs

Weight quantization effectively reduces memory consumption and enables the deployment of Large Language Models on edge devices, yet existing hardware-friendly methods often rely on uniform quantization, which suffers from poor…

Machine Learning · Computer Science 2026-02-03 Xin Nie, Liang Dong, Haicheng Zhang, Jiawang Xiao, G. Sun

Decentralized Optimization on Compact Submanifolds by Quantized Riemannian Gradient Tracking

This paper considers the problem of decentralized optimization on compact submanifolds, where a finite sum of smooth (possibly non-convex) local functions is minimized by $n$ agents forming an undirected and connected graph. However, the…

Optimization and Control · Mathematics 2025-06-10 Jun Chen, Lina Liu, Tianyi Zhu, Yong Liu, Guang Dai, Yunliang Jiang, Ivor W. Tsang

Quadratic Gradient: A Unified Framework Bridging Gradient Descent and Newton-Type Methods by Synthesizing Hessians and Gradients

Accelerating the convergence of second-order optimization, particularly Newton-type methods, remains a pivotal challenge in algorithmic research. In this paper, we extend previous work on the Quadratic Gradient (QG) and rigorously…

Optimization and Control · Mathematics 2026-04-01 John Chiang

RUQuant: Towards Refining Uniform Quantization for Large Language Models

The increasing size and complexity of large language models (LLMs) have raised significant challenges in deployment efficiency, particularly under resource constraints. Post-training quantization (PTQ) has emerged as a practical solution by…

Computation and Language · Computer Science 2026-04-07 Han Liu, Haotian Gao, Changya Li, Feng Zhang, Xiaotong Zhang, Wei Wang, Hong Yu

GPTAQ: Efficient Finetuning-Free Quantization for Asymmetric Calibration

We introduce GPTAQ, a novel finetuning-free quantization method for compressing large-scale transformer architectures. Unlike the previous GPTQ method, which independently calibrates each layer, we always match the quantized layer's output…

Machine Learning · Computer Science 2025-05-15 Yuhang Li, Ruokai Yin, Donghyun Lee, Shiting Xiao, Priyadarshini Panda

Model-Preserving Adaptive Rounding

The goal of quantization is to produce a compressed model whose output distribution is as close to the original model's as possible. To do this tractably, most quantization algorithms minimize the immediate activation error of each layer as…

Machine Learning · Computer Science 2025-09-29 Albert Tseng, Zhaofeng Sun, Christopher De Sa

TTAQ: Towards Stable Post-training Quantization in Continuous Domain Adaptation

Post-training quantization (PTQ) reduces excessive hardware cost by quantizing full-precision models into lower bit representations on a tiny calibration set, without retraining. Despite the remarkable progress made through recent efforts,…

Machine Learning · Computer Science 2024-12-16 Junrui Xiao, Zhikai Li, Lianwei Yang, Yiduo Mei, Qingyi Gu

Efficient Quantum Gradient and Higher-order Derivative Estimation via Generalized Hadamard Test

In the context of Noisy Intermediate-Scale Quantum (NISQ) computing, parameterized quantum circuits (PQCs) represent a promising paradigm for tackling challenges in quantum sensing, optimal control, optimization, and machine learning on…

Quantum Physics · Physics 2024-08-13 Dantong Li, Dikshant Dulal, Mykhailo Ohorodnikov, Hanrui Wang, Yongshan Ding

LVQAC: Lattice Vector Quantization Coupled with Spatially Adaptive Companding for Efficient Learned Image Compression

Recently, numerous end-to-end optimized image compression neural networks have been developed and proved themselves as leaders in rate-distortion performance. The main strength of these learnt compression methods is in powerful nonlinear…

Image and Video Processing · Electrical Eng. & Systems 2023-04-26 Xi Zhang, Xiaolin Wu

A Novel Fast Exact Subproblem Solver for Stochastic Quasi-Newton Cubic Regularized Optimization

In this work we describe an Adaptive Regularization using Cubics (ARC) method for large-scale nonconvex unconstrained optimization using Limited-memory Quasi-Newton (LQN) matrices. ARC methods are a relatively new family of optimization…

Optimization and Control · Mathematics 2022-04-21 Jarad Forristal, Joshua Griffin, Wenwen Zhou, Seyedalireza Yektamaram

EPTQ: Enhanced Post-Training Quantization via Hessian-guided Network-wise Optimization

Quantization is a key method for deploying deep neural networks on edge devices with limited memory and computation resources. Recent improvements in Post-Training Quantization (PTQ) methods were achieved by an additional local optimization…

Computer Vision and Pattern Recognition · Computer Science 2024-09-27 Ofir Gordon, Elad Cohen, Hai Victor Habi, Arnon Netzer

Model-Free Learning for the Linear Quadratic Regulator over Rate-Limited Channels

Consider a linear quadratic regulator (LQR) problem being solved in a model-free manner using the policy gradient approach. If the gradient of the quadratic cost is being transmitted across a rate-limited channel, both the convergence and…

Optimization and Control · Mathematics 2024-09-20 Lintao Ye, Aritra Mitra, Vijay Gupta

HAQ: Hardware-Aware Automated Quantization with Mixed Precision

Model quantization is a widely used technique to compress and accelerate deep neural network (DNN) inference. Emergent DNN hardware accelerators begin to support mixed precision (1-8 bits) to further improve the computation efficiency,…

Computer Vision and Pattern Recognition · Computer Science 2019-04-09 Kuan Wang, Zhijian Liu, Yujun Lin, Ji Lin, Song Han

A gradient-enhanced univariate dimension reduction method for uncertainty propagation

The univariate dimension reduction (UDR) method stands as a way to estimate the statistical moments of the output that is effective in a large class of uncertainty quantification (UQ) problems. UDR's fundamental strategy is to approximate…

Computational Engineering, Finance, and Science · Computer Science 2024-10-17 Bingran Wang, Nicholas C. Orndorff, Mark Sperry, John T. Hwang