Related papers: Evolving the COLA software library

200 papers

In recent history, GPUs became a key driver of compute performance in HPC. With the installation of the Frontier supercomputer, they became the enablers of the Exascale era; further largest-scale installations are in progress (Aurora, El…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-10-05 Andreas Herten

Modern graphics hardware is designed for highly parallel numerical tasks and provides significant cost and performance benefits. Graphics hardware vendors are now making available development tools to support general purpose high…

High Energy Physics - Lattice · Physics 2009-01-22 Kipton Barros , Ronald Babich , Richard Brower , Michael A. Clark , Claudio Rebbi

We present a comparison of several modern C++ libraries providing high-level interfaces for programming multi- and many-core architectures on top of CUDA or OpenCL. The comparison focuses on the solution of ordinary differential equations…

Mathematical Software · Computer Science 2017-10-13 Denis Demidov , Karsten Ahnert , Karl Rupp , Peter Gottschling

NVIDIA has been the main provider of GPU hardware in HPC systems for over a decade. Most applications that benefit from GPUs have thus been developed and optimized for the NVIDIA software stack. Recent exascale HPC systems are, however,…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-09-12 Igor Sfiligoi , Emily A. Belli , Jeff Candy , Reuben D. Budiardja

The SX-Aurora TSUBASA PCIe accelerator card is the newest model of NEC's SX architecture family. Its multi-core vector processor features a vector length of 16 kbits and interfaces with up to 48 GB of HBM2 memory in the current models,…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-02-04 Benjamin Huth , Nils Meyer , Tilo Wettig

Supercomputers become faster as hardware and software technologies continue to evolve. Current supercomputers are capable of 10^15 floating point operations per second (FLOPS) and are called Petascale systems. The High Performance Computer (HPC)…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-09-27 Jalal Abdulbaqi

Decentralized machine learning is a promising emerging paradigm in view of global challenges of data ownership and privacy. We consider learning of linear classification and regression models, in the setting where the training data is…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-06-20 Lie He , An Bian , Martin Jaggi

Bioinformatics and Computational Biology are two fields that have been exploiting GPUs for more than two decades, with CUDA being the most used programming language for them. However, as CUDA is an NVIDIA proprietary language, it implies a…

Programming Languages · Computer Science 2024-02-27 Manuel Costanzo , Enzo Rucci , Carlos García Sánchez , Marcelo Naiouf , Manuel Prieto-Matías

The inability to predict lasting languages and architectures led us to develop OCCA, a C++ library focused on host-device interaction. Using run-time compilation and macro expansions, the result is a novel single kernel language that…

Distributed, Parallel, and Cluster Computing · Computer Science 2014-03-06 David S Medina , Amik St-Cyr , T. Warburton

The presence of GPUs from different vendors requires Lattice QCD codes to support multiple architectures. To this end, Open Computing Language (OpenCL) is one of the viable frameworks for writing portable code. It is of interest to find…

High Energy Physics - Lattice · Physics 2025-02-06 Piyush Kumar , Szabolcs Borsanyi , Jana N. Guenther , Chik Him Wong

This work explores the feasibility of specialized hardware implementing the Cortical Learning Algorithm (CLA) in order to fully exploit its inherent advantages. This algorithm, which is inspired by the current understanding of the mammalian…

Emerging Technologies · Computer Science 2024-05-06 Valentin Puente , José Ángel Gregorio

FPGAs have found increasing adoption in data center applications since a new generation of high-level tools have become available which noticeably reduce development time for FPGA accelerators and still provide high quality of results.…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-06-15 Marius Meyer , Tobias Kenter , Christian Plessl

As CUDA programs become the de facto program among data parallel applications such as high-performance computing or machine learning applications, running CUDA on other platforms has been a compelling option. Although several efforts have…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-12-21 Ruobing Han , Jaewon Lee , Jaewoong Sim , Hyesoon Kim

In recent years the computational capacity of single Field Programmable Gate Arrays (FPGA) devices as well as their versatility has increased significantly. Adding to that the High Level Synthesis frameworks allowing to program such…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-07-22 G. Korcyl , P. Korcyl

Recent increases in supercomputing power, driven by the multi-core revolution and accelerators such as the IBM Cell processor, graphics processing units (GPUs) and Intel's Many Integrated Core (MIC) technology have enabled kinetic…

Using commodity component personal computers based on Alpha processor and commodity network devices and a switch, we built an 8-node parallel computer. GNU/Linux is chosen as an operating system and message passing libraries such as PVM,…

High Energy Physics - Lattice · Physics 2015-06-25 Seyong Kim

We define a benchmark suite for lattice QCD and report on benchmark results from several computer platforms. The platforms considered are apeNEXT, CRAY T3E, Hitachi SR8000, IBM p690, PC-Clusters, and QCDOC.

High Energy Physics - Lattice · Physics 2009-11-10 M. Hasenbusch , K. Jansen , D. Pleiter , H. Stüben , P. Wegner , T. Wettig , H. Wittig

The heterogeneous computing paradigm has led to the need for portable and efficient programming solutions that can leverage the capabilities of various hardware devices, such as NVIDIA, Intel, and AMD GPUs. This study evaluates the…

Programming Languages · Computer Science 2023-11-13 Manuel Costanzo , Enzo Rucci , Carlos García Sánchez , Marcelo Naiouf , Manuel Prieto-Matías

Heterogeneity is the prevalent trend in the rapidly evolving high-performance computing (HPC) landscape in both hardware and application software. The diversity in hardware platforms, currently comprising various accelerators and a future…

Numerical Analysis · Mathematics 2025-07-15 Youngjun Lee , Klaus Weide , Wesley Kwiecinski , Jared O'Neal , Johann Rudi , Anshu Dubey

The world's largest particle accelerator, located at CERN, produces petabytes of data that need to be analysed efficiently, to study the fundamental structures of our universe. ROOT is an open-source C++ data analysis framework, developed…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-01-07 Jolly Chen , Monica Dessole , Ana Lucia Varbanescu