Related papers: Approximation by Quantization

Improving variational methods via pairwise linear response identities

Inference methods are often formulated as variational approximations: these approximations allow easy evaluation of statistics by marginalization or linear response, but these estimates can be inconsistent. We show that by introducing…

Machine Learning · Statistics 2017-04-27 Jack Raymond , Federico Ricci-Tersenghi

Quantizing Convolutional Neural Networks for Low-Power High-Throughput Inference Engines

Deep learning as a means to inferencing has proliferated thanks to its versatility and ability to approach or exceed human-level accuracy. These computational models have seemingly insatiable appetites for computational resources not only…

Machine Learning · Computer Science 2018-05-22 Sean O. Settle , Manasa Bollavaram , Paolo D'Alberto , Elliott Delaye , Oscar Fernandez , Nicholas Fraser , Aaron Ng , Ashish Sirasao , Michael Wu

Accelerating Large-Scale Inference with Anisotropic Vector Quantization

Quantization based techniques are the current state-of-the-art for scaling maximum inner product search to massive databases. Traditional approaches to quantization aim to minimize the reconstruction error of the database points. Based on…

Machine Learning · Computer Science 2020-12-08 Ruiqi Guo , Philip Sun , Erik Lindgren , Quan Geng , David Simcha , Felix Chern , Sanjiv Kumar

Qu-ANTI-zation: Exploiting Quantization Artifacts for Achieving Adversarial Outcomes

Quantization is a popular technique that transforms the parameter representation of a neural network from floating-point numbers into lower-precision ones (e.g., 8-bit integers). It reduces the memory footprint and the computational…

Machine Learning · Computer Science 2021-11-12 Sanghyun Hong , Michael-Andrei Panaitescu-Liess , Yiğitcan Kaya , Tudor Dumitraş

Accelerating Neural Network Inference by Overflow Aware Quantization

The inherent heavy computation of deep neural networks prevents their widespread applications. A widely used method for accelerating model inference is quantization, by replacing the input operands of a network using fixed-point values.…

Computer Vision and Pattern Recognition · Computer Science 2020-05-28 Hongwei Xie , Shuo Zhang , Huanghao Ding , Yafei Song , Baitao Shao , Conggang Hu , Ling Cai , Mingyang Li

Adaptive Variational Inference in Probabilistic Graphical Models: Beyond Bethe, Tree-Reweighted, and Convex Free Energies

Variational inference in probabilistic graphical models aims to approximate fundamental quantities such as marginal distributions and the partition function. Popular approaches are the Bethe approximation, tree-reweighted, and other types…

Machine Learning · Statistics 2025-02-06 Harald Leisenberger , Franz Pernkopf

Exact and approximate inference in graphical models: variable elimination and beyond

Probabilistic graphical models offer a powerful framework to account for the dependence structure between variables, which is represented as a graph. However, the dependence between variables may render inference tasks intractable. In this…

Machine Learning · Statistics 2018-03-13 Nathalie Peyrard , Marie-Josée Cros , Simon de Givry , Alain Franc , Stéphane Robin , Régis Sabbadin , Thomas Schiex , Matthieu Vignes

Factor graph fragmentization of expectation propagation

Expectation propagation is a general approach to fast approximate inference for graphical models. The existing literature treats models separately when it comes to deriving and coding expectation propagation inference algorithms. This comes…

Methodology · Statistics 2018-01-17 Wilson Y. Chen , Matt P. Wand

Integer Quantization for Deep Learning Inference: Principles and Empirical Evaluation

Quantization techniques can reduce the size of Deep Neural Networks and improve inference latency and throughput by taking advantage of high throughput integer instructions. In this paper we review the mathematical aspects of quantization…

Machine Learning · Computer Science 2020-04-22 Hao Wu , Patrick Judd , Xiaojie Zhang , Mikhail Isaev , Paulius Micikevicius

A Genetic Algorithm Approach for Image Representation Learning through Color Quantization

Over the last decades, hand-crafted feature extractors have been used to encode image visual properties into feature vectors. Recently, data-driven feature learning approaches have been successfully explored as alternatives for producing…

Computer Vision and Pattern Recognition · Computer Science 2020-11-25 Érico M. Pereira , Ricardo da S. Torres , Jefersson A. dos Santos

Loss Bounds for Approximate Influence-Based Abstraction

Sequential decision making techniques hold great promise to improve the performance of many real-world systems, but computational complexity hampers their principled application. Influence-based abstraction aims to gain leverage by modeling…

Artificial Intelligence · Computer Science 2021-02-24 Elena Congeduti , Alexander Mey , Frans A. Oliehoek

Inference for Multiplicative Models

The paper introduces a generalization for known probabilistic models such as log-linear and graphical models, called here multiplicative models. These models, that express probabilities via product of parameters are shown to capture…

Artificial Intelligence · Computer Science 2012-06-18 Ydo Wexler , Christopher Meek

Improving the Efficiency of Approximate Inference for Probabilistic Logical Models by means of Program Specialization

We consider the task of performing probabilistic inference with probabilistic logical models. Many algorithms for approximate inference with such models are based on sampling. From a logic programming perspective, sampling boils down to…

Artificial Intelligence · Computer Science 2015-03-19 Daan Fierens

Variational Inference with Normalizing Flows

The choice of approximate posterior distribution is one of the core problems in variational inference. Most applications of variational inference employ simple families of posterior approximations in order to allow for efficient inference,…

Machine Learning · Statistics 2016-06-15 Danilo Jimenez Rezende , Shakir Mohamed

Active Inference is a Subtype of Variational Inference

Automated decision-making under uncertainty requires balancing exploitation and exploration. Classical methods treat these separately using heuristics, while Active Inference unifies them through Expected Free Energy (EFE) minimization.…

Artificial Intelligence · Computer Science 2025-11-25 Wouter W. L. Nuijten , Mykola Lukashchuk

Same, Same But Different - Recovering Neural Network Quantization Error Through Weight Factorization

Quantization of neural networks has become common practice, driven by the need for efficient implementations of deep neural networks on embedded devices. In this paper, we exploit an oft-overlooked degree of freedom in most networks - for a…

Machine Learning · Computer Science 2019-02-07 Eldad Meller , Alexander Finkelstein , Uri Almog , Mark Grobman

A Survey of Quantization Methods for Efficient Neural Network Inference

As soon as abstract mathematical computations were adapted to computation on digital computers, the problem of efficient representation, manipulation, and communication of the numerical values in those computations arose. Strongly related…

Computer Vision and Pattern Recognition · Computer Science 2021-06-23 Amir Gholami , Sehoon Kim , Zhen Dong , Zhewei Yao , Michael W. Mahoney , Kurt Keutzer

A Unified View of Algorithms for Path Planning Using Probabilistic Inference on Factor Graphs

Even if path planning can be solved using standard techniques from dynamic programming and control, the problem can also be approached using probabilistic inference. The algorithms that emerge using the latter framework bear some appealing…

Machine Learning · Computer Science 2021-06-22 Francesco A. N. Palmieri , Krishna R. Pattipati , Giovanni Di Gennaro , Giovanni Fioretti , Francesco Verolla , Amedeo Buonanno

Quantifying and Optimizing Simplicity via Polynomial Representations

Deep networks often exhibit a preference for "simple" solutions, and such a simplicity bias is widely believed to play a key role in generalization. Yet a broadly applicable, quantitative measure of simplicity remains elusive. We introduce…

Artificial Intelligence · Computer Science 2026-05-29 Tianren Zhang , Xiangxin Li , Minghao Xiao , Guanyu Chen , Feng Chen

Decentralized Data Reduction with Quantization Constraints

A guiding principle for data reduction in statistical inference is the sufficiency principle. This paper extends the classical sufficiency principle to decentralized inference, i.e., data reduction needs to be achieved in a decentralized…

Information Theory · Computer Science 2015-06-16 Ge Xu , Shengyu Zhu , Biao Chen