Related papers: Accelerating Scientific Computations with Mixed Pr…

Acceleration of multi-component multiple-precision arithmetic with branch-free algorithms and SIMD vectorization

Multiple-precision floating-point branch-free algorithms can significantly accelerate multi-component arithmetic implemented by combining hardware-based binary64 and binary32, particularly for triple- and quadruple-precision computations.…

Mathematical Software · Computer Science 2026-05-08 Tomonori Kouya

Rethinking Floating Point Overheads for Mixed Precision DNN Accelerators

In this paper, we propose a mixed-precision convolution unit architecture which supports different integer and floating point (FP) precisions. The proposed architecture is based on low-bit inner product units and realizes higher precision…

Hardware Architecture · Computer Science 2021-01-29 Hamzah Abdel-Aziz, Ali Shafiee, Jong Hoon Shin, Ardavan Pedram, Joseph H. Hassoun

The Accuracy and Efficiency of Posit Arithmetic

Motivated by the increasing interest in the posit numeric format, in this paper we evaluate the accuracy and efficiency of posit arithmetic in contrast to the traditional IEEE 754 32-bit floating-point (FP32) arithmetic. We first design and…

Hardware Architecture · Computer Science 2021-09-20 Stefan Dan Ciocirlan, Dumitrel Loghin, Lavanya Ramapantulu, Nicolae Tapus, Yong Meng Teo

Revisiting 16-bit Neural Network Training: A Practical Approach for Resource-Limited Learning

With the increasing complexity of machine learning models, managing computational resources like memory and processing power has become a critical concern. Mixed precision techniques, which leverage different numerical precisions during…

Machine Learning · Computer Science 2026-04-20 Juyoung Yun, Sol Choi, Francois Rameau, Byungkon Kang, Zhoulai Fu

Design and accuracy trade-offs in Computational Statistics

Statistical computations are becoming increasingly important. These computations often need to be performed in log-space because probabilities become extremely small due to repeated multiplications. While using logarithms effectively…

Numerical Analysis · Mathematics 2025-09-16 Tiancheng Xu, Alan L. Cox, Scott Rixner

FlexiBit: Fully Flexible Precision Bit-parallel Accelerator Architecture for Arbitrary Mixed Precision AI

Recent research has shown that large language models (LLMs) can utilize low-precision floating point (FP) quantization to deliver high efficiency while maintaining original model accuracy. In particular, recent works have shown the…

Hardware Architecture · Computer Science 2025-06-05 Faraz Tahmasebi, Yian Wang, Benji Y. H. Huang, Hyoukjun Kwon

A Mixed Precision, Multi-GPU Design for Large-scale Top-K Sparse Eigenproblems

Graph analytics techniques based on spectral methods process extremely large sparse matrices with millions or even billions of non-zero values. Behind these algorithms lies the Top-K sparse eigenproblem, the computation of the largest…

Hardware Architecture · Computer Science 2022-01-20 Francesco Sgherzi, Alberto Parravicini, Marco Domenico Santambrogio

A mixed precision semi-Lagrangian algorithm and its performance on accelerators

In this paper we propose a mixed precision algorithm in the context of the semi-Lagrangian discontinuous Galerkin method. The performance of this approach is evaluated on a traditional dual socket workstation as well as on a Xeon Phi and an…

Mathematical Software · Computer Science 2018-08-14 Lukas Einkemmer

Recycled Error Bits: Energy-Efficient Architectural Support for Higher Precision Floating Point

In this work, we provide energy-efficient architectural support for floating point accuracy. Our goal is to provide accuracy that is far greater than that provided by the processor's hardware floating point unit (FPU). Specifically, for…

Hardware Architecture · Computer Science 2013-09-30 Ralph Nathan, Bryan Anthonio, Shih-Lien Lu, Helia Naeimi, Daniel J. Sorin, Xiaobai Sun

Accelerating 128-bit Floating-Point Matrix Multiplication on FPGAs

General Matrix Multiplication (GEMM) is a fundamental operation widely used in scientific computations. Its performance and accuracy significantly impact the performance and accuracy of applications that depend on it. One such application…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-06-12 Fumiya Kono, Naohito Nakasato, Maho Nakata

Big-PERCIVAL: Exploring the Native Use of 64-Bit Posit Arithmetic in Scientific Computing

The accuracy requirements in many scientific computing workloads result in the use of double-precision floating-point arithmetic in the execution kernels. Nevertheless, emerging real-number representations, such as posit arithmetic, show…

Hardware Architecture · Computer Science 2024-03-15 David Mallasén, Alberto A. Del Barrio, Manuel Prieto-Matias

Fast and Scalable Mixed Precision Euclidean Distance Calculations Using GPU Tensor Cores

Modern GPUs are equipped with tensor cores (TCs) that are commonly used for matrix multiplication in artificial intelligence workloads. However, because they have high computational throughput, they can lead to significant performance gains…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-09-01 Brian Curless, Michael Gowanlock

Scaling the memory wall using mixed-precision -- HPG-MxP on an exascale machine

Mixed-precision algorithms have been proposed as a way for scientific computing to benefit from some of the gains seen for artificial intelligence (AI) on recent high performance computing (HPC) platforms. A few applications dominated by…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-07-16 Aditya Kashi, Nicholson Koukpaizan, Hao Lu, Michael Matheson, Sarp Oral, Feiyi Wang

Mixed precision in Graphics Processing Unit

Modern graphics computing units (GPUs) are designed and optimized to perform highly parallel numerical calculations. This parallelism has enabled (and promises) significant advantages, both in terms of energy performance and calculation. In…

Hardware Architecture · Computer Science 2021-10-26 Quentin Gallouédec

DHFP-PE: Dual-Precision Hybrid Floating Point Processing Element for AI Acceleration

The rapid adoption of low-precision arithmetic in artificial intelligence and edge computing has created a strong demand for energy-efficient and flexible floating-point multiply-accumulate (MAC) units. This paper presents a dual-precision…

Hardware Architecture · Computer Science 2026-04-10 Shubham Kumar, Vijay Pratap Sharma, Vaibhav Neema, Santosh Kumar Vishvakarma

Sequential and Parallel Algorithms for the Addition of Big-Integer Numbers

Today's PCs can directly manipulate numbers not longer than 64 bits because the size of the CPU registers and the data-path are limited. Consequently, arithmetic operations such as addition, can only be performed on numbers of that length.…

Data Structures and Algorithms · Computer Science 2012-04-03 Youssef Bassil, Aziz Barbar

Performance and Numerical Aspects of Decompositional Factorizations with FP64 Floating-Point Emulation in INT8

Mixing precisions for performance has been an ongoing trend as the modern hardware accelerators started including new, and mostly lower-precision, data formats. The advantage of using them is a great potential of performance gain and energy…

Numerical Analysis · Mathematics 2025-09-30 Piotr Luszczek, Vijay Gadepally, LaToya Anderson, William Arcand, David Bestor, William Bergeron, Alex Bonn, Daniel J. Burrill, Chansup Byun, Michael Houle, Matthew Hubbell, Hayden Jananthan, Michael Jones, Peter Michaleas, Guillermo Morales, Julia Mullen, Andrew Prout, Albert Reuther, Antonio Rosa, Charles Yee, Jeremy Kepner

Revisiting BFloat16 Training

State-of-the-art generic low-precision training algorithms use a mix of 16-bit and 32-bit precision, creating the folklore that 16-bit hardware compute units alone are not enough to maximize model accuracy. As a result, deep learning…

Machine Learning · Computer Science 2021-03-09 Pedram Zamirai, Jian Zhang, Christopher R. Aberger, Christopher De Sa

Memory Efficient Mixed-Precision Optimizers

Traditional optimization methods rely on the use of single-precision floating point arithmetic, which can be costly in terms of memory size and computing power. However, mixed precision optimization techniques leverage the use of both…

Machine Learning · Computer Science 2023-09-25 Basile Lewandowski, Atli Kosson

MPCR: Multi-Precision Computations Package in R

In the early days of computing, severe memory constraints made it necessary to use lower floating-point precision. As hardware capabilities have advanced, modern systems, particularly in computational statistics and scientific computing,…

Computation · Statistics 2026-03-03 Mary Lai O. Salvana, Sameh Abdulah, Minwoo Kim, David Helmy, Ying Sun, Marc G. Genton