Related papers: Dynamic Tensor Rematerialization

Optimal checkpointing for heterogeneous chains: how to train deep neural networks with limited memory

This paper introduces a new activation checkpointing method which allows to significantly decrease memory usage when training Deep Neural Networks with the back-propagation algorithm. Similarly to checkpointing techniques coming from the…

Machine Learning · Computer Science 2019-12-02 Julien Herrmann , Olivier Beaumont , Lionel Eyraud-Dubois , Julien Hermann , Alexis Joly , Alena Shilova

Checkmate: Breaking the Memory Wall with Optimal Tensor Rematerialization

We formalize the problem of trading-off DNN training time and memory requirements as the tensor rematerialization optimization problem, a generalization of prior checkpointing strategies. We introduce Checkmate, a system that solves for…

Machine Learning · Computer Science 2020-05-15 Paras Jain , Ajay Jain , Aniruddha Nrusimha , Amir Gholami , Pieter Abbeel , Kurt Keutzer , Ion Stoica , Joseph E. Gonzalez

A Study of Checkpointing in Large Scale Training of Deep Neural Networks

Deep learning (DL) applications are increasingly being deployed on HPC systems, to leverage the massive parallelism and computing power of those systems for DL model training. While significant effort has been put to facilitate distributed…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-03-30 Elvis Rojas , Albert Njoroge Kahira , Esteban Meneses , Leonardo Bautista Gomez , Rosa M Badia

Coop: Memory is not a Commodity

Tensor rematerialization allows the training of deep neural networks (DNNs) under limited memory budgets by checkpointing the models and recomputing the evicted tensors as needed. However, the existing tensor rematerialization techniques…

Machine Learning · Computer Science 2023-11-02 Jianhao Zhang , Shihan Ma , Peihong Liu , Jinhui Yuan

BlendTorch: A Real-Time, Adaptive Domain Randomization Library

Solving complex computer vision tasks by deep learning techniques relies on large amounts of (supervised) image data, typically unavailable in industrial environments. The lack of training data starts to impede the successful transfer of…

Computer Vision and Pattern Recognition · Computer Science 2020-10-23 Christoph Heindl , Lukas Brunner , Sebastian Zambal , Josef Scharinger

XEngine: Optimal Tensor Rematerialization for Neural Networks in Heterogeneous Environments

Memory efficiency is crucial in training deep learning networks on resource-restricted devices. During backpropagation, forward tensors are used to calculate gradients. Despite the option of keeping those dependencies in memory until they…

Machine Learning · Computer Science 2022-12-22 Manuela Schuler , Richard Membarth , Philipp Slusallek

Neural Networks for Full Phase-space Reweighting and Parameter Tuning

Precise scientific analysis in collider-based particle physics is possible because of complex simulations that connect fundamental theories to observable quantities. The significant computational cost of these programs limits the scope,…

High Energy Physics - Phenomenology · Physics 2020-05-20 Anders Andreassen , Benjamin Nachman

REANN: A PyTorch-based End-to-End Multi-functional Deep Neural Network Package for Molecular, Reactive and Periodic Systems

In this work, we present a general purpose deep neural network package for representing energies, forces, dipole moments, and polarizabilities of atomistic systems. This so-called recursively embedded atom neural network model takes both…

Chemical Physics · Physics 2022-04-06 Yaolong Zhang , Junfan Xia , Bin Jiang

Learning Deep Tree-based Retriever for Efficient Recommendation: Theory and Method

Although advancements in deep learning have significantly enhanced the recommendation accuracy of deep recommendation models, these methods still suffer from low recommendation efficiency. Recently proposed tree-based deep recommendation…

Information Retrieval · Computer Science 2026-01-29 Ze Liu , Jin Zhang , Chao Feng , Defu Lian , Jie Wang , Enhong Chen

A Tunable Robust Pruning Framework Through Dynamic Network Rewiring of DNNs

This paper presents a dynamic network rewiring (DNR) method to generate pruned deep neural network (DNN) models that are robust against adversarial attacks yet maintain high accuracy on clean images. In particular, the disclosed DNR method…

Computer Vision and Pattern Recognition · Computer Science 2020-11-25 Souvik Kundu , Mahdi Nazemi , Peter A. Beerel , Massoud Pedram

DTR: A Unified Deep Tensor Representation Framework for Multimedia Data Recovery

Recently, the transform-based tensor representation has attracted increasing attention in multimedia data (e.g., images and videos) recovery problems, which consists of two indispensable components, i.e., transform and characterization.…

Computer Vision and Pattern Recognition · Computer Science 2024-07-09 Ting-Wei Zhou , Xi-Le Zhao , Jian-Li Wang , Yi-Si Luo , Min Wang , Xiao-Xuan Bai , Hong Yan

Learning without feedback: Fixed random learning signals allow for feedforward training of deep neural networks

While the backpropagation of error algorithm enables deep neural network training, it implies (i) bidirectional synaptic weight transport and (ii) update locking until the forward and backward passes are completed. Not only do these…

Machine Learning · Statistics 2021-01-19 Charlotte Frenkel , Martin Lefebvre , David Bol

A Transferable and Automatic Tuning of Deep Reinforcement Learning for Cost Effective Phishing Detection

Many challenging real-world problems require the deployment of ensembles of multiple complementary learning models to reach acceptable performance levels. While effective, applying the entire ensemble to every sample is costly and often…

Cryptography and Security · Computer Science 2022-09-20 Orel Lavie , Asaf Shabtai , Gilad Katz

Deep Predictive Policy Training using Reinforcement Learning

Skilled robot task learning is best implemented by predictive action policies due to the inherent latency of sensorimotor processes. However, training such predictive policies is challenging as it involves finding a trajectory of motor…

Robotics · Computer Science 2017-03-03 Ali Ghadirzadeh , Atsuto Maki , Danica Kragic , Mårten Björkman

Optimal Recurrent Network Topologies for Dynamical Systems Reconstruction

In dynamical systems reconstruction (DSR) we seek to infer from time series measurements a generative model of the underlying dynamical process. This is a prime objective in any scientific discipline, where we are particularly interested in…

Machine Learning · Computer Science 2024-06-10 Christoph Jürgen Hemmer , Manuel Brenner , Florian Hess , Daniel Durstewitz

DPTDR: Deep Prompt Tuning for Dense Passage Retrieval

Deep prompt tuning (DPT) has gained great success in most natural language processing (NLP) tasks. However, it is not well-investigated in dense retrieval where fine-tuning (FT) still dominates. When deploying multiple retrieval tasks using…

Computation and Language · Computer Science 2022-08-25 Zhengyang Tang , Benyou Wang , Ting Yao

Deep Residual Reinforcement Learning

We revisit residual algorithms in both model-free and model-based reinforcement learning settings. We propose the bidirectional target network technique to stabilize residual algorithms, yielding a residual version of DDPG that…

Machine Learning · Computer Science 2020-01-27 Shangtong Zhang , Wendelin Boehmer , Shimon Whiteson

Fixed-point optimization of deep neural networks with adaptive step size retraining

Fixed-point optimization of deep neural networks plays an important role in hardware based design and low-power implementations. Many deep neural networks show fairly good performance even with 2- or 3-bit precision when quantized weights…

Machine Learning · Computer Science 2017-02-28 Sungho Shin , Yoonho Boo , Wonyong Sung

Boost Neural Networks by Checkpoints

Training multiple deep neural networks (DNNs) and averaging their outputs is a simple way to improve the predictive performance. Nevertheless, the multiplied training cost prevents this ensemble method from being practical and efficient. Several…

Machine Learning · Computer Science 2021-10-27 Feng Wang , Guoyizhe Wei , Qiao Liu , Jinxiang Ou , Xian Wei , Hairong Lv

Revisiting hard thresholding for DNN pruning

The most common method for DNN pruning is hard thresholding of network weights, followed by retraining to recover any lost accuracy. Recently developed smart pruning algorithms use the DNN response over the training set for a variety of…

Machine Learning · Computer Science 2019-05-23 Konstantinos Pitas , Mike Davies , Pierre Vandergheynst