Related papers: GPU Accelerated Particle Visualization with Splotc…
The scientific community is presently witnessing an unprecedented growth in the quality and quantity of data sets coming from simulations and real-world experiments. To access effectively and extract the scientific content of such…
Graphics processing units (GPU) had evolved from a specialized hardware capable to render high quality graphics in games to a commodity hardware for effective processing blocks of data in a parallel schema. This evolution is particularly…
With the increasing size and complexity of data produced by large scale numerical simulations, it is of primary importance for scientists to be able to exploit all available hardware in heterogenous High Performance Computing environments…
We present a light and fast, public available, ray-tracer {\tt Splotch} software tool which supports the effective visualization of cosmological simulations data. We describe the algorithm it relies on, which is designed in order to deal…
Structural parameters are normally extracted from observed galaxies by fitting analytic light profiles to the observations. Obtaining accurate fits to high-resolution images is a computationally expensive task, requiring many model…
Modeling data sharing in GPU programs is a challenging task because of the massive parallelism and complex data sharing patterns provided by GPU architectures. Better GPU caching efficiency can be achieved through careful task scheduling…
Dynamic programming (DP) is a cornerstone of combinatorial optimization, yet its inherently sequential structure has long limited its scalability in scenario-based stochastic programming (SP). This paper introduces a GPU-accelerated…
The optimization of submodular functions constitutes a viable way to perform clustering. Strong approximation guarantees and feasible optimization w.r.t. streaming data make this clustering approach favorable. Technically, submodular…
In this work, we present an extension of Gaussian process (GP) models with sophisticated parallelization and GPU acceleration. The parallelization scheme arises naturally from the modular computational structure w.r.t. datapoints in the…
The High Performance Computing (HPC) field is witnessing a widespread adoption of Graphics Processing Units (GPUs) as co-processors for conventional homogeneous clusters. The adoption of prevalent Single- Program Multiple-Data (SPMD)…
Subgraph matching is a core operation in graph analytics, supporting a broad spectrum of applications from social network analysis to bioinformatics. Recent GPU-based approaches accelerate subgraph matching by leveraging parallelism but…
Stochastic algorithms are efficient approaches to solving machine learning and optimization problems. In this paper, we propose a general framework called Splash for parallelizing stochastic algorithms on multi-node distributed systems.…
Parallel computing is a standard approach to achieving high-performance computing (HPC). Three commonly used methods to implement parallel computing include: 1) applying multithreading technology on single-core or multi-core CPUs; 2)…
The future of computation is the Graphical Processing Unit, i.e. the GPU. The promise that the graphics cards have shown in the field of image processing and accelerated rendering of 3D scenes, and the computational capability that these…
Branch-and-Bound (B&B) algorithms are time intensive tree-based exploration methods for solving to optimality combinatorial optimization problems. In this paper, we investigate the use of GPU computing as a major complementary way to speed…
Ray tracing is a technique for generating an image by tracing the path of light through pixels in an image plane and simulating the effects of high-quality global illumination at a heavy computational cost. Because of the high computation…
The simulation of the two-dimensional Ising model is used as a benchmark to show the computational capabilities of Graphic Processing Units (GPUs). The rich programming environment now available on GPUs and flexible hardware capabilities…
We develop a novel linear-complexity bottom-up sketching-based algorithm for constructing a $H^2$ matrix, and present its high performance GPU implementation. The construction algorithm requires both a black-box sketching operator and an…
X-ray ptychography imaging at synchrotron facilities like the Advanced Photon Source (APS) involves controlling instrument hardwares to collect a set of diffraction patterns from overlapping coherent illumination spots on extended samples,…
Smoothed Particle Hydrodynamics (SPH) is essential for modeling complex large-deformation problems across various applications, requiring significant computational power. A major portion of SPH computation time is dedicated to the Nearest…