Related papers: GPU Kernels for High-Speed 4-Bit Astrophysical Dat…
We present the design and implementation of a custom GPU-based compute cluster that provides the correlation X-engine of the CHIME Pathfinder radio telescope. It is among the largest such systems in operation, correlating 32,896 baselines…
The CHIME Pathfinder is a new interferometric radio telescope that uses a hybrid FPGA/GPU FX correlator. The GPU-based X-engine of this correlator processes over 819 Gb/s of 4+4-bit complex astronomical data from N=256 inputs across a 400…
We present a highly parallel implementation of the cross-correlation of time-series data using graphics processing units (GPUs), which is scalable to hundreds of independent inputs and suitable for the processing of signals from "Large-N"…
The increased bandwidth coupled with the large numbers of antennas of several new radio telescope arrays has resulted in an exponential increase in the amount of data that needs to be recorded and processed. In many cases, it is necessary…
There is growing interest in accelerating irregular data-parallel algorithms on GPUs. These algorithms are typically blocking, so they require fair scheduling. But GPU programming models (e.g., OpenCL) do not mandate fair scheduling, and…
We present direct astrophysical N-body simulations with up to a few million bodies using our parallel MPI/CUDA code on large GPU clusters in China, Ukraine and Germany, with different kinds of GPU hardware. These clusters are directly…
The paper considers the problem of implementation on graphics processors of numerical integration routines for higher order finite element approximations. The design of suitable GPU kernels is investigated in the context of general purpose…
We present an overview of the Graphics Processing Unit (GPU) based spatial processing system created for the Canadian Hydrogen Intensity Mapping Experiment (CHIME). The design employs AMD S9300x2 GPUs and readily-available commercial…
We present a Python/CuPy FX software correlator for small radio interferometer arrays and evaluate it on QUEST (Qilu University Explorer Survey Telescope). The system combines multi-threaded data ingest, pinned-memory host-device transfers,…
Next generation radio telescopes will require orders of magnitude more computing power to provide a view of the universe with greater sensitivity. In the initial stages of the signal processing flow of a radio telescope, signal correlation…
We develop and study FPGA implementations of algorithms for charged particle tracking based on graph neural networks. The two complementary FPGA designs are based on OpenCL, a framework for writing programs that execute across heterogeneous…
GPU-based beamforming is a relatively unexplored area in radio astronomy, possibly due to the assumption that any such system will be severely limited by the PCIe bandwidth required to transfer data to the GPU. We have developed a…
In this work, we have explored the advantages and drawbacks of using GPUs instead of CPUs in the calculation of a standard 2-point correlation function algorithm, which is useful for the analysis of Large Scale Structure of galaxies. Taking…
Modern graphics processing units (GPUs) are inexpensive commodity hardware that offer Tflop/s theoretical computing capacity. GPUs are well suited to many compute-intensive tasks including digital signal processing. We describe the…
We discuss an implementation of our 3D radiative transfer (3DRT) framework with the OpenCL paradigm for general GPU computing. We implement the kernel for solving the 3DRT problem in Cartesian coordinates with periodic boundary conditions…
At high energy physics experiments, processing billions of records of structured numerical data from collider events to a few statistical summaries is a common task. The data processing is typically more complex than standard query…
In the next decade, the demands for computing in large scientific experiments are expected to grow tremendously. During the same time period, CPU performance increases will be limited. At the CERN Large Hadron Collider (LHC), these two…
Searching for sources of electromagnetic emission in spectral-line radio astronomy interferometric data is a computationally intensive process. Parallel programming techniques and High Performance Computing hardware may be used to improve…
Computing on graphics processors is maybe one of the most important developments in computational science to happen in decades. Not since the arrival of the Beowulf cluster, which combined open source software with commodity hardware to…
A recent development in radio astronomy is to replace traditional dishes with many small antennas. The signals are combined to form one large, virtual telescope. The enormous data streams are cross-correlated to filter out noise. This is…