Related papers: Improving the Neural GPU Architecture for Algorith…
The Neural GPU is a recent model that can learn algorithms such as multi-digit binary addition and binary multiplication in a way that generalizes to inputs of arbitrary length. We show that there are two simple ways of improving the…
Learning an algorithm from examples is a fundamental problem that has been widely studied. Recently it has been addressed using neural networks, in particular by Neural Turing Machines (NTMs). These are fully differentiable computers that…
General Purpose Graphics Processing Unit (GPGPU) computing plays a transformative role in deep learning and machine learning by leveraging the computational advantages of parallel processing. Through the power of Compute Unified Device…
Deep learning is a technique for machine learning using multi-layer neural networks. It has been used for image synthesis and image recognition, but in recent years, it has also been used for various social detection and social labeling. In…
This paper describes a method for accelerating large scale Artificial Neural Networks (ANN) training using multi-GPUs by reducing the forward and backward passes to matrix multiplication. We propose an out-of-core multi-GPU matrix…
Graph embedding techniques have attracted growing interest since they convert the graph data into continuous and low-dimensional space. Effective graph analytic provides users a deeper understanding of what is behind the data and thus can…
The paper presents the aspect of use of modern graphics accelerators supporting CUDA technology for high-performance computing in the field of linear algebra. Fully programmable graphic cards have been available for several years for both…
Algorithms have been fundamental to recent global technological advances and, in particular, they have been the cornerstone of technical advances in one field rapidly being applied to another. We argue that algorithms possess fundamentally…
We propose a generic algorithmic building block to accelerate training of machine learning models on heterogeneous compute systems. Our scheme allows to efficiently employ compute accelerators such as GPUs and FPGAs for the training of…
With the rapid development of in-depth learning, neural network and deep learning algorithms have been widely used in various fields, e.g., image, video and voice processing. However, the neural network model is getting larger and larger,…
The Deep Graph Library (DGL) was designed as a tool to enable structure learning from graphs, by supporting a core abstraction for graphs, including the popular Graph Neural Networks (GNN). DGL contains implementations of all core graph…
Machine Learning with deep neural networks has transformed computational approaches to scientific and engineering problems. Central to many of these advancements are precisely tuned neural architectures that are tailored to the domains in…
Deep neural network has recently shown very promising applications in different research directions and attracted the industry attention as well. Although the idea was introduced in the past but just recently the main limitation of using…
The rapid growth of data size and accessibility in recent years has instigated a shift of philosophy in algorithm design for artificial intelligence. Instead of engineering algorithms by hand, the ability to learn composable systems…
Training deep neural networks is a highly nontrivial task, involving carefully selecting appropriate training algorithms, scheduling step sizes and tuning other hyperparameters. Trying different combinations can be quite labor-intensive and…
The paper considers the problem of implementation on graphics processors of numerical integration routines for higher order finite element approximations. The design of suitable GPU kernels is investigated in the context of general purpose…
A Multigrid Full Approximation Storage algorithm for solving Deep Residual Networks is developed to enable neural network parallelized layer-wise training and concurrent computational kernel execution on GPUs. This work demonstrates a 10.2x…
Machine learning has made tremendous progress in recent years and received large amounts of public attention. Though we are still far from designing a full artificially intelligent agent, machine learning has brought us many applications in…
As deep neural networks (DNNs) become deeper, the training time increases. In this perspective, multi-GPU parallel computing has become a key tool in accelerating the training of DNNs. In this paper, we introduce a novel methodology to…
Graphics processing units (GPUs) are the de facto standard for processing deep learning (DL) tasks. Meanwhile, GPU failures, which are inevitable, cause severe consequences in DL tasks: they disrupt distributed trainings, crash inference…