Related papers: High-Accuracy Low-Precision Training

200 papers

Low precision operations can provide scalability, memory savings, portability, and energy efficiency. This paper proposes SWALP, an approach to low precision training that averages low-precision SGD iterates with a modified learning rate…

Machine Learning · Computer Science 2019-05-21 Guandao Yang, Tianyi Zhang, Polina Kirichenko, Junwen Bai, Andrew Gordon Wilson, Christopher De Sa

Low-precision arithmetic has had a transformative effect on the training of neural networks, reducing computation, memory and energy requirements. However, despite its promise, low-precision arithmetic has received little attention for…

Machine Learning · Computer Science 2022-07-15 Wesley J. Maddox, Andres Potapczynski, Andrew Gordon Wilson

In recent years, the fervent demand for computational power across various domains has prompted hardware manufacturers to introduce specialized computing hardware aimed at enhancing computational capabilities. Particularly, the utilization…

Numerical Analysis · Mathematics 2024-03-12 Hongyaoxing Gu

Structural pruning can simplify network architecture and improve inference speed. We propose Hardware-Aware Latency Pruning (HALP) that formulates structural pruning as a global resource allocation optimization problem, aiming at maximizing…

Computer Vision and Pattern Recognition · Computer Science 2021-10-22 Maying Shen, Hongxu Yin, Pavlo Molchanov, Lei Mao, Jianna Liu, Jose M. Alvarez

Low-precision training has become crucial for reducing the computational and memory costs of large-scale deep learning. However, quantizing gradients introduces magnitude shrinkage, which can change how stochastic gradient descent (SGD)…

Machine Learning · Computer Science 2026-01-09 Vincent-Daniel Yun

While low-precision optimization has been widely used to accelerate deep learning, low-precision sampling remains largely unexplored. As a consequence, sampling is simply infeasible in many large-scale scenarios, despite providing…

Machine Learning · Computer Science 2022-06-22 Ruqi Zhang, Andrew Gordon Wilson, Christopher De Sa

Despite impressive performance, deep neural networks require significant memory and computation costs, prohibiting their application in resource-constrained scenarios. Sparse training is one of the most common techniques to reduce these…

Machine Learning · Computer Science 2023-12-06 Bowen Lei, Dongkuan Xu, Ruqi Zhang, Shuren He, Bani K. Mallick

Recent trends in lower precision, e.g. half-precision floating point, training have shown improved system performance and reduced memory usage for Deep Learning while maintaining accuracy. However, current GNN systems cannot achieve such…

Machine Learning · Computer Science 2025-09-17 Arnab Kanti Tarafder, Yidong Gong, Pradeep Kumar

Tensor decomposition has been widely used in machine learning and high-volume data analysis. However, large-scale tensor factorization often consumes huge memory and computing cost. Meanwhile, modernized computing hardware such as tensor…

Optimization and Control · Mathematics 2022-09-12 Zi Yang, Junnan Shan, Zheng Zhang

Quantized training of Large Language Models (LLMs) remains an open challenge, as maintaining accuracy while performing all matrix multiplications in low precision has proven difficult. This is particularly the case when fine-tuning…

Machine Learning · Computer Science 2025-11-06 Saleh Ashkboos, Mahdi Nikdan, Soroush Tabesh, Roberto L. Castro, Torsten Hoefler, Dan Alistarh

Large-scale non-convex sparsity-constrained problems have recently gained extensive attention. Most existing deterministic optimization methods (e.g., GraSP) are not suitable for large-scale and high-dimensional problems, and thus…

Machine Learning · Computer Science 2019-12-03 Fanhua Shang, Bingkun Wei, Hongying Liu, Yuanyuan Liu, Jiacheng Zhuo

Learning rate adaptation is a popular topic in machine learning. Gradient descent trains a neural network with a fixed learning rate. Learning rate adaptation is proposed to accelerate the training process through adjusting the step size in…

Machine Learning · Computer Science 2022-10-20 Bozhou Chen, Hongzhi Wang, Chenmin Ba

Stochastic optimization algorithms are widely used for machine learning with large-scale data. However, their convergence often suffers from non-vanishing variance. Variance Reduction (VR) methods, such as SVRG and SARAH, address this issue…

Machine Learning · Computer Science 2026-01-12 Daniil Medyakov, Gleb Molodtsov, Savelii Chezhegov, Alexey Rebrikov, Aleksandr Beznosikov

State-of-the-art training algorithms for deep learning models are based on stochastic gradient descent (SGD). Recently, many variations have been explored: perturbing parameters for better accuracy (such as in Extragradient), limiting SGD…

Machine Learning · Computer Science 2022-03-23 Amirkeivan Mohtashami, Martin Jaggi, Sebastian U. Stich

This paper presents the impact of using quantization on the efficiency of multi-class text classification in the training process of a support vector machine (SVM). This work is focused on comparing the efficiency of SVM model trained using…

Machine Learning · Computer Science 2020-07-20 Dominik Żurek, Marcin Pietroń

Quantization reduces computation costs of neural networks but suffers from performance degeneration. Is this accuracy drop due to the reduced capacity, or inefficient training during the quantization procedure? After looking into the…

Computer Vision and Pattern Recognition · Computer Science 2019-12-24 Qing Jin, Linjie Yang, Zhenyu Liao

Structural pruning can simplify network architecture and improve inference speed. We propose Hardware-Aware Latency Pruning (HALP) that formulates structural pruning as a global resource allocation optimization problem, aiming at maximizing…

Computer Vision and Pattern Recognition · Computer Science 2022-10-20 Maying Shen, Hongxu Yin, Pavlo Molchanov, Lei Mao, Jianna Liu, Jose M. Alvarez

Embedding tables are usually huge in click-through rate (CTR) prediction models. To train and deploy the CTR models efficiently and economically, it is necessary to compress their embedding tables at the training stage. To this end, we…

Machine Learning · Computer Science 2024-08-07 Shiwei Li, Huifeng Guo, Lu Hou, Wei Zhang, Xing Tang, Ruiming Tang, Rui Zhang, Ruixuan Li

LLM training is resource-intensive. Quantized training improves computational and memory efficiency but introduces quantization noise, which can hinder convergence and degrade model accuracy. Stochastic Rounding (SR) has emerged as a…

Machine Learning · Computer Science 2025-11-04 Taowen Liu, Marta Andronic, Deniz Gündüz, George A. Constantinides

For time-critical IoT applications using deep learning, inference acceleration through distributed computing is a promising approach to meet a stringent deadline. In this paper, we implement a working prototype of a new distributed…

Computer Vision and Pattern Recognition · Computer Science 2022-11-29 Zhongtian Dong, Nan Li, Alexandros Iosifidis, Qi Zhang