Related papers: A Hybrid Multi-GPU Implementation of Simplex Algor…
The Simplex tableau has been broadly used and investigated in the industry and academia. With the advent of the big data era, ever larger problems are posed to be solved in ever larger machines whose architecture type did not exist in the…
Matrix multiplication is a foundational operation in scientific computing and machine learning, yet its computational complexity makes it a significant bottleneck for large-scale applications. The shift to parallel architectures, primarily…
We propose a new hybrid topology optimization algorithm based on multigrid approach that combines the parallelization strategy of CPU using OpenMP and heavily multithreading capabilities of modern Graphics Processing Units (GPU). In…
Parallel computing using accelerators has gained widespread research attention in the past few years. In particular, using GPUs for general purpose computing has brought forth several success stories with respect to time taken, cost, power,…
OpenCL, along with CUDA, is one of the main tools used to program GPGPUs. However, it allows running the same code on multi-core CPUs too, making it a rival for the long-established OpenMP. In this paper we compare OpenCL and OpenMP when…
Graphics Processing Units (GPUs) with high computational capabilities used as modern parallel platforms to deal with complex computational problems. We use this platform to solve large-scale linear programing problems by revised simplex…
Linear Programs (LPs) appear in a large number of applications and offloading them to a GPU is viable to gain performance. Existing work on offloading and solving an LP on a GPU suggests that there is performance gain generally on large…
Heterogeneous systems are becoming more common on High Performance Computing (HPC) systems. Even using tools like CUDA and OpenCL it is a non-trivial task to obtain optimal performance on the GPU. Approaches to simplifying this task include…
Parallel computing is a standard approach to achieving high-performance computing (HPC). Three commonly used methods to implement parallel computing include: 1) applying multithreading technology on single-core or multi-core CPUs; 2)…
The vision of super computer at every desk can be realized by powerful and highly parallel CPUs or GPUs or APUs. Graphics processors once specialized for the graphics applications only, are now used for the highly computational intensive…
Linear Programming (LP) is a foundational optimization technique with widespread applications in finance, energy trading, and supply chain logistics. However, traditional Central Processing Unit (CPU)-based LP solvers often struggle to meet…
Experience shows that on today's high performance systems the utilization of different acceleration cards in conjunction with a high utilization of all other parts of the system is difficult. Future architectures, like exascale clusters,…
Parallel computing can offer an enormous advantage regarding the performance for very large applications in almost any field: scientific computing, computer vision, databases, data mining, and economics. GPUs are high performance many-core…
Current computational systems are heterogeneous by nature, featuring a combination of CPUs and GPUs. As the latter are becoming an established platform for high-performance computing, the focus is shifting towards the seamless programming…
This paper presents a comprehensive comparison of three dominant parallel programming models in High Performance Computing (HPC): Message Passing Interface (MPI), Open Multi-Processing (OpenMP), and Compute Unified Device Architecture…
Hybrid computational architectures based on the joint power of Central Processing Units and Graphic Processing Units (GPUs) are becoming popular and powerful hardware tools for a wide range of simulations in biology, chemistry, engineering,…
The main objective of this work consists in analyzing sub-structuring method for the parallel solution of sparse linear systems with matrices arising from the discretization of partial differential equations such as finite element, finite…
Future experiments in high-energy physics will pose stringent requirements to computing, in particular to real-time data processing. As an example, the CBM experiment at FAIR Germany intends to perform online data selection exclusively in…
Designing problems using matrices is very important in Computer Science. Fields like graph computer, graphs theory, and machine learning use matrices very often to solve their own problems. The most often matrix operation is the…
Sustaining a large fraction of single GPU performance in parallel computations is considered to be the major problem of GPU-based clusters. In this article, this topic is addressed in the context of a lattice Boltzmann flow solver that is…