Related papers: Automatic Quantization for Physics-Based Simulatio…
Quantization is essential for reducing the computational cost and memory usage of deep neural networks, enabling efficient inference on low-precision hardware. Despite the growing adoption of uniform and floating-point quantization schemes,…
Quantum simulation, the simulation of quantum processes on quantum computers, suggests a path forward for the efficient simulation of problems in condensed-matter physics, quantum chemistry, and materials science. While the majority of…
In this paper we present a simple and computationally efficient quantization scheme that enables us to reduce the resolution of the parameters of a neural network from 32-bit floating point values to 8-bit integer values. The proposed…
Quantization-aware training (QAT) is a leading technique for improving the accuracy of quantized neural networks. Previous work has shown that decomposing training into a full-precision (FP) phase followed by a QAT phase yields superior…
Quantization is a popular technique that $transforms$ the parameter representation of a neural network from floating-point numbers into lower-precision ones ($e.g.$, 8-bit integers). It reduces the memory footprint and the computational…
In this paper, we examine the optimal quantization of signals for system identification. We deal with memoryless quantization for the output signals and derive the optimal quantization schemes. The objective functions are the errors of…
Quantum phase estimation is one of the key algorithms in the field of quantum computing, but up until now, only approximate expressions have been derived for the probability of error. We revisit these derivations, and find that by ensuring…
Quantization is an effective way to reduce the memory cost of large-scale model training. However, most existing methods adopt fixed-precision policies, which ignore the fact that optimizer-state distributions vary significantly across…
A reliable method for characterizing quantum operations that is suitable for improving and validating their accuracies is indispensable for realizing a practical quantum computer. Known methods are still not sufficient because they lack…
We consider the problem of solving a distributed optimization problem using a distributed computing platform, where the communication in the network is limited: each node can only communicate with its neighbours and the channel has a…
In recent years Deep Neural Networks (DNNs) have been rapidly developed in various applications, together with increasingly complex architectures. The performance gain of these DNNs generally comes with high computational costs and large…
In this paper, we introduce a quantum-enhanced algorithm for simulation-based optimization. Simulation-based optimization seeks to optimize an objective function that is computationally expensive to evaluate exactly, and thus, is…
Simulating the stochastic evolution of real quantities on a digital computer requires a trade-off between the precision to which these quantities are approximated, and the memory required to store them. The statistical accuracy of the…
Diffusion models have gained popularity for generating images from textual descriptions. Nonetheless, the substantial need for computational resources continues to present a noteworthy challenge, contributing to time-consuming processes.…
Quantization is a fundamental optimization for many machine-learning use cases, including compressing gradients, model weights and activations, and datasets. The most accurate form of quantization is \emph{adaptive}, where the error is…
Quantization is a widely used compression method that effectively reduces redundancies in over-parameterized neural networks. However, existing quantization techniques for deep neural networks often lack a comprehensive error analysis due…
Variational algorithms may enable classically intractable simulations on near-future quantum computers. However, their potential is limited by hardware errors. It is therefore crucial to develop efficient ways to mitigate these errors.…
Approximation errors must be taken into account when compiling quantum programs into a low-level gate set. We present a methodology that tracks such errors automatically and then optimizes accuracy parameters to guarantee a specified…
We introduce a quantization-aware training algorithm that guarantees avoiding numerical overflow when reducing the precision of accumulators during inference. We leverage weight normalization as a means of constraining parameters during…
Continuous-time stochastic processes pervade everyday experience, and the simulation of models of these processes is of great utility. Classical models of systems operating in continuous-time must typically track an unbounded amount of…