Related papers: RATQ: A Universal Fixed-Length Quantizer for Stoch…

200 papers

Learning discrete representations with vector quantization (VQ) has emerged as a powerful approach in various generative models. However, most VQ-based models rely on a single, fixed-rate codebook, requiring extensive retraining for new…

Machine Learning · Computer Science 2025-02-03 Jiwan Seo, Joonhyuk Kang

A universal fault-tolerant quantum computer holds the promise to speed up computational problems that are otherwise intractable on classical computers; however, for the next decade or so, our access is restricted to noisy intermediate-scale…

Quantum Physics · Physics 2023-02-22 Archismita Dalal, Amara Katabarwa

Data quantization learns encoding results of data with certain requirements, and provides a broad perspective on many real-world applications of data handling. Nevertheless, the results of the encoder are usually limited to multivariate inputs…

Machine Learning · Computer Science 2017-05-29 Miao Cheng, Ah Chung Tsoi

Quantum optimization, a key application of quantum computing, has traditionally been stymied by the linearly increasing complexity of gradient calculations with an increasing number of parameters. This work bridges the gap between Koopman…

Quantum Physics · Physics 2024-05-07 Di Luo, Jiayu Shen, Rumen Dangovski, Marin Soljačić

The scalable adaptive cubic regularization method ($\mathrm{ARC_{q}K}$: Dussault et al. in Math. Program. Ser. A 207(1-2): 191-225, 2024) has been recently proposed for unconstrained optimization. It has excellent convergence properties,…

Optimization and Control · Mathematics 2026-03-17 Yonggang Pei, Yubing Lin, Shuai Shao, Mauricio Silva Louzeiro, Detong Zhu

Vector quantization (VQ) underpins modern generative and representation models by turning continuous latents into discrete tokens. Yet hard nearest-neighbor assignments are non-differentiable and are typically optimized with heuristic…

Machine Learning · Computer Science 2026-02-03 Haochen You, Heng Zhang, Hongyang He, Yuqi Li, Baojing Liu

Weight quantization effectively reduces memory consumption and enables the deployment of Large Language Models on edge devices, yet existing hardware-friendly methods often rely on uniform quantization, which suffers from poor…

Machine Learning · Computer Science 2026-02-03 Xin Nie, Liang Dong, Haicheng Zhang, Jiawang Xiao, G. Sun

This paper considers the problem of decentralized optimization on compact submanifolds, where a finite sum of smooth (possibly non-convex) local functions is minimized by $n$ agents forming an undirected and connected graph. However, the…

Optimization and Control · Mathematics 2025-06-10 Jun Chen, Lina Liu, Tianyi Zhu, Yong Liu, Guang Dai, Yunliang Jiang, Ivor W. Tsang

Accelerating the convergence of second-order optimization, particularly Newton-type methods, remains a pivotal challenge in algorithmic research. In this paper, we extend previous work on the Quadratic Gradient (QG) and rigorously…

Optimization and Control · Mathematics 2026-04-01 John Chiang

The increasing size and complexity of large language models (LLMs) have raised significant challenges in deployment efficiency, particularly under resource constraints. Post-training quantization (PTQ) has emerged as a practical solution by…

Computation and Language · Computer Science 2026-04-07 Han Liu, Haotian Gao, Changya Li, Feng Zhang, Xiaotong Zhang, Wei Wang, Hong Yu

We introduce GPTAQ, a novel finetuning-free quantization method for compressing large-scale transformer architectures. Unlike the previous GPTQ method, which independently calibrates each layer, we always match the quantized layer's output…

Machine Learning · Computer Science 2025-05-15 Yuhang Li, Ruokai Yin, Donghyun Lee, Shiting Xiao, Priyadarshini Panda

The goal of quantization is to produce a compressed model whose output distribution is as close to the original model's as possible. To do this tractably, most quantization algorithms minimize the immediate activation error of each layer as…

Machine Learning · Computer Science 2025-09-29 Albert Tseng, Zhaofeng Sun, Christopher De Sa

Post-training quantization (PTQ) reduces excessive hardware cost by quantizing full-precision models into lower bit representations on a tiny calibration set, without retraining. Despite the remarkable progress made through recent efforts,…

Machine Learning · Computer Science 2024-12-16 Junrui Xiao, Zhikai Li, Lianwei Yang, Yiduo Mei, Qingyi Gu

In the context of Noisy Intermediate-Scale Quantum (NISQ) computing, parameterized quantum circuits (PQCs) represent a promising paradigm for tackling challenges in quantum sensing, optimal control, optimization, and machine learning on…

Quantum Physics · Physics 2024-08-13 Dantong Li, Dikshant Dulal, Mykhailo Ohorodnikov, Hanrui Wang, Yongshan Ding

Recently, numerous end-to-end optimized image compression neural networks have been developed and proved themselves as leaders in rate-distortion performance. The main strength of these learnt compression methods is in powerful nonlinear…

Image and Video Processing · Electrical Eng. & Systems 2023-04-26 Xi Zhang, Xiaolin Wu

In this work we describe an Adaptive Regularization using Cubics (ARC) method for large-scale nonconvex unconstrained optimization using Limited-memory Quasi-Newton (LQN) matrices. ARC methods are a relatively new family of optimization…

Optimization and Control · Mathematics 2022-04-21 Jarad Forristal, Joshua Griffin, Wenwen Zhou, Seyedalireza Yektamaram

Quantization is a key method for deploying deep neural networks on edge devices with limited memory and computation resources. Recent improvements in Post-Training Quantization (PTQ) methods were achieved by an additional local optimization…

Computer Vision and Pattern Recognition · Computer Science 2024-09-27 Ofir Gordon, Elad Cohen, Hai Victor Habi, Arnon Netzer

Consider a linear quadratic regulator (LQR) problem being solved in a model-free manner using the policy gradient approach. If the gradient of the quadratic cost is being transmitted across a rate-limited channel, both the convergence and…

Optimization and Control · Mathematics 2024-09-20 Lintao Ye, Aritra Mitra, Vijay Gupta

Model quantization is a widely used technique to compress and accelerate deep neural network (DNN) inference. Emergent DNN hardware accelerators begin to support mixed precision (1-8 bits) to further improve the computation efficiency,…

Computer Vision and Pattern Recognition · Computer Science 2019-04-09 Kuan Wang, Zhijian Liu, Yujun Lin, Ji Lin, Song Han

The univariate dimension reduction (UDR) method stands as a way to estimate the statistical moments of the output that is effective in a large class of uncertainty quantification (UQ) problems. UDR's fundamental strategy is to approximate…

Computational Engineering, Finance, and Science · Computer Science 2024-10-17 Bingran Wang, Nicholas C. Orndorff, Mark Sperry, John T. Hwang