English
Related papers

Related papers: Memory Efficient Mixed-Precision Optimizers

200 papers

Deep neural networks have enabled progress in a wide variety of applications. Growing the size of the neural network typically results in improved accuracy. As model sizes grow, the memory and compute requirements for training these models…

The widespread adoption of machine learning algorithms necessitates hardware acceleration to ensure efficient performance. This acceleration relies on custom matrix engines that operate on full or reduced-precision floating-point…

Hardware Architecture · Computer Science 2024-08-23 Kosmas Alexandridis , Christodoulos Peltekis , Dionysios Filippas , Giorgos Dimitrakopoulos

Standard mixed-precision training of neural networks requires many bytes of accelerator memory for each model parameter. These bytes reflect not just the parameter itself, but also its gradient and one or more optimizer state variables.…

Machine Learning · Computer Science 2026-03-13 Jose Javier Gonzalez Ortiz , Abhay Gupta , Christopher Rinard , Davis Blalock

The growing demands of the worldwide IT infrastructure stress the need for reduced power consumption, which is addressed in so-called transprecision computing by improving energy efficiency at the expense of precision. For example, reducing…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-07-30 Andrea Borghesi , Giuseppe Tagliavini , Michele Lombardi , Luca Benini , Michela Milano

Iterative solvers are frequently used in scientific applications and engineering computations. However, the memory-bound Sparse Matrix-Vector (SpMV) kernel computation hinders the efficiency of iterative algorithms. As modern hardware…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-11-08 Jianhua Gao , Jiayuan Shen , Yuxiang Zhang , Weixing Ji , Hua Huang

Finite-precision arithmetic computations face an inherent tradeoff between accuracy and efficiency. The points in this tradeoff space are determined, among other factors, by different data types but also evaluation orders. To put it simply,…

Programming Languages · Computer Science 2017-07-10 Eva Darulova , Einar Horn , Saksham Sharma

When training deep neural networks, keeping all tensors in high precision (e.g., 32-bit or even 16-bit floats) is often wasteful. However, keeping all tensors in low precision (e.g., 8-bit floats) can lead to unacceptable accuracy loss.…

Machine Learning · Computer Science 2023-06-26 Wonyeol Lee , Rahul Sharma , Alex Aiken

Large models training is plagued by the intense compute cost and limited hardware memory. A practical solution is low-precision representation but is troubled by loss in numerical accuracy and unstable training rendering the model less…

With the increasing complexity of machine learning models, managing computational resources like memory and processing power has become a critical concern. Mixed precision techniques, which leverage different numerical precisions during…

Machine Learning · Computer Science 2026-04-20 Juyoung Yun , Sol Choi , Francois Rameau , Byungkon Kang , Zhoulai Fu

The use of reduced and mixed precision computing has gained increasing attention in high-performance computing (HPC) as a means to improve computational efficiency, particularly on modern hardware architectures like GPUs. In this work, we…

Computational Engineering, Finance, and Science · Computer Science 2025-05-28 Bálint Siklósi , Pushpender K. Sharma , David J. Lusher , István Z. Reguly , Neil D. Sandham

The use of low-precision fixed-point arithmetic along with stochastic rounding has been proposed as a promising alternative to the commonly used 32-bit floating point arithmetic to enhance training neural networks training in terms of…

Machine Learning · Computer Science 2018-04-17 Marc Ortiz , Adrián Cristal , Eduard Ayguadé , Marc Casas

Reduced precision computation for deep neural networks is one of the key areas addressing the widening compute gap driven by an exponential growth in model size. In recent years, deep learning training has largely migrated to 16-bit…

Machine Learning · Computer Science 2019-05-30 Naveen Mellempudi , Sudarshan Srinivasan , Dipankar Das , Bharat Kaul

In recent years, half precision floating-point arithmetic has gained wide support in hardware and software stack thanks to the advance of artificial intelligence and machine learning applications. Operating at half precision can…

Numerical Analysis · Mathematics 2024-09-19 Longfei Gao , Kevin Harms

Scientific machine learning (SciML) has emerged as a versatile approach to address complex computational science and engineering problems. Within this field, physics-informed neural networks (PINNs) and deep operator networks (DeepONets)…

Machine Learning · Computer Science 2024-01-31 Joel Hayford , Jacob Goldman-Wetzler , Eric Wang , Lu Lu

Mixed-precision algorithms have been proposed as a way for scientific computing to benefit from some of the gains seen for artificial intelligence (AI) on recent high performance computing (HPC) platforms. A few applications dominated by…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-07-16 Aditya Kashi , Nicholson Koukpaizan , Hao Lu , Michael Matheson , Sarp Oral , Feiyi Wang

Precision tuning or customized precision number representations is emerging, in these recent years, as one of the most promising techniques that has a positive impact on the footprint of programs concerning energy consumption, bandwidth…

Software Engineering · Computer Science 2022-03-16 Dorra Ben Khalifa , Matthieu Martel

The recently introduced Gradient Methods with Memory use a subset of the past oracle information to create an accurate model of the objective function that enables them to surpass the Gradient Method in practical performance. The model…

Optimization and Control · Mathematics 2024-01-30 Mihai I. Florea

Renewed interest in mixed-precision algorithms has emerged due to growing data capacity and bandwidth concerns, as well as the advancement of GPUs, which enable significant speedup for low precision arithmetic. In light of this, we propose…

Numerical Analysis · Mathematics 2020-12-14 Alec Michael Dunton , Alyson Fox

Neural network quantization is frequently used to optimize model size, latency and power consumption for on-device deployment of neural networks. In many cases, a target bit-width is set for an entire network, meaning every layer get…

Machine Learning · Computer Science 2023-02-13 Nilesh Prasad Pandey , Markus Nagel , Mart van Baalen , Yin Huang , Chirag Patel , Tijmen Blankevoort

Low precision arithmetic, in particular half precision floating point arithmetic, is now available in commercial hardware. Using lower precision can offer significant savings in computation and communication costs with proportional savings…

Numerical Analysis · Mathematics 2021-11-16 Eda Oktay , Erin Carson
‹ Prev 1 2 3 10 Next ›