Related papers: Analysis of GPU Parallel Computing based on Matlab
A finite-difference Micromagnetic simulation code written in MATLAB is presented with Graphics Processing Unit (GPU) acceleration. The high performance of Graphics Processing Unit (GPU) is demonstrated compared to a typical Central…
We provide a preliminary study on utilizing GPU (Graphics Processing Unit) to accelerate computation for three simulation optimization tasks with either first-order or second-order algorithms. Compared to the implementation using only CPU…
Parallel computing can offer an enormous advantage regarding the performance for very large applications in almost any field: scientific computing, computer vision, databases, data mining, and economics. GPUs are high performance many-core…
Matrix multiplication is a foundational operation in scientific computing and machine learning, yet its computational complexity makes it a significant bottleneck for large-scale applications. The shift to parallel architectures, primarily…
Due to their highly parallel multi-cores architecture, GPUs are being increasingly used in a wide range of computationally intensive applications. Compared to CPUs, GPUs can achieve higher performances at accelerating the programs'…
In this work, we survey the role of GPUs in real-time systems. Originally designed for parallel graphics workloads, GPUs are now widely used in time-critical applications such as machine learning, autonomous vehicles, and robotics due to…
General Purpose Graphics Processing Unit (GPGPU) computing plays a transformative role in deep learning and machine learning by leveraging the computational advantages of parallel processing. Through the power of Compute Unified Device…
Parallel computing using accelerators has gained widespread research attention in the past few years. In particular, using GPUs for general purpose computing has brought forth several success stories with respect to time taken, cost, power,…
Scientific Computing relies on executing computer algorithms coded in some programming languages. Given a particular available hardware, algorithms speed is a crucial factor. There are many scientific computing environments used to code…
Parallel algorithms on CPU and GPU are implemented for the Unified Gas-Kinetic Scheme and their performances are investigated and compared by a two dimensional channel flow case. The parallel CPU algorithm has a one dimensional block…
Recent years have witnessed a rapid advancement in GPU technology, establishing it as a formidable high-performance parallel computing technology with superior floating-point computational capabilities compared to traditional CPUs. This…
Reduction operations are extensively employed in many computational problems. A reduction consists of, given a finite set of numeric elements, combining into a single value all elements in that set, using for this a combiner function. A…
Image Processing is a specialized area of Digital Signal Processing which contains various mathematical and algebraic operations such as matrix inversion, transpose of matrix, derivative, convolution, Fourier Transform etc. Operations like…
Many techniques in program synthesis, superoptimization, and array programming require parallel rollouts of general-purpose programs. GPUs, while capable targets for domain-specific parallelism, are traditionally underutilized by such…
The problem of solving a system of polynomial equations is one of the most fundamental problems in applied mathematics. Among them, the problem of solving a system of binomial equations form a important subclass for which specialized…
In this work, we present an extension of Gaussian process (GP) models with sophisticated parallelization and GPU acceleration. The parallelization scheme arises naturally from the modular computational structure w.r.t. datapoints in the…
The future of computation is the Graphical Processing Unit, i.e. the GPU. The promise that the graphics cards have shown in the field of image processing and accelerated rendering of 3D scenes, and the computational capability that these…
A previous study of MD algorithms designed for GPU use is extended to cover more recent developments in GPU architecture. Algorithm modifications are described, together with extensions to more complex systems. New measurements include the…
There has been significant amount of excitement and recent work on GPU-based database systems. Previous work has claimed that these systems can perform orders of magnitude better than CPU-based database systems on analytical workloads such…
Process mapping asks to assign vertices of a task graph to processing elements of a supercomputer such that the computational workload is balanced while the communication cost is minimized. Motivated by the recent success of GPU-based graph…