Related papers: Enabling parallel computing in CRASH
In this paper we present CRASH_alpha, the first radiative transfer code for cosmological applications that follows the parallel propagation of Ly_alpha and ionizing photons. CRASH_alpha is a version of the continuum radiative transfer code…
In this paper we report on the improvements implemented in the cosmological radiative transfer code CRASH. In particular we present a new multi-frequency algorithm for spectra sampling which makes use of colored photon packets: we discuss…
We present a largely improved version of CRASH, a 3-D radiative transfer code that treats the effects of ionizing radiation propagating through a given inhomogeneous H/He cosmological density field, on the physical conditions of the gas.…
Here we introduce CRASH3, the latest release of the 3D radiative transfer code CRASH. In its current implementation CRASH3 integrates into the reference algorithm the code Cloudy to evaluate the ionisation states of metals,…
We introduce CRASH-AMR, a new version of the cosmological Radiative Transfer (RT) code CRASH, enabled to use refined grids. This new feature allows us to attain higher resolution in our RT simulations and thus to describe more accurately…
We present a new numerical scheme to solve the transfer of diffuse radiation on three-dimensional mesh grids which is efficient on processors with highly parallel architecture such as recently popular GPUs and CPUs with multi- and many-core…
We describe the design, implementation and performance of the new hybrid parallelization scheme in our Monte Carlo radiative transfer code SKIRT, which has been used extensively for modeling the continuum radiation of dusty astrophysical…
We introduce an error resilient distributed computing method based on an extension of the channel polarization phenomenon to distributed algorithms. The method leverages an algorithmic split operation that transforms two identical compute…
We present the methodology of a photon-conserving, spatially-adaptive, ray-tracing radiative transfer algorithm, designed to run on multiple parallel Graphics Processing Units (GPUs). Each GPU has thousands of computing cores, making them…
We describe a parallel version of our tree-code for the simulation of self-gravitating systems in Astrophysics. It is based on a dynamic and adaptive method for the domain decomposition, which exploits the hierarchical data arrangement used…
We have developed a cosmic ray (CR) shock code in one dimensional spherical geometry with which the particle distribution, the gas flow and their nonlinear interaction can be followed numerically in a frame comoving with an expanding shock.…
We report the results of intensive numerical calculations for four-atom H2+H2 energy transfer collisions. A parallel computing technique based on LAM/MPI functions is used. In this algorithm, the data is distributed to the processors…
Multi-core architectures feature an intricate hierarchy of cache memories, with multiple levels and sizes. To adequately decompose an application according to the traits of a particular memory hierarchy is a cumbersome task that may be…
We describe a new parallel N-body code for cosmological simulations. The code is based on a work- and data-sharing scheme, and is implemented within the Cray Research Corporation's CRAFT programming environment. Different data distribution…
Traditional parallel schedulers running on cluster supercomputers support only static scheduling, where the number of processors allocated to an application remains fixed throughout the execution of the job. This results in…
Diffusion models have emerged as state-of-the-art in image generation, but their practical deployment is hindered by the significant computational cost of their iterative denoising process. While existing caching techniques can accelerate…
We describe the CRASH (Center for Radiative Shock Hydrodynamics) code, a block adaptive mesh code for multi-material radiation hydrodynamics. The implementation solves the radiation diffusion model with the gray or multigroup method and…
We present a parallel implementation of the particle-particle/particle-mesh (P3M) algorithm for distributed memory clusters. The GRACOS (GRAvitational COSmology) code uses a hybrid method for both computation and domain decomposition.…
We present a novel distributed computing framework that is robust to slow compute nodes, and is capable of both approximate and exact computation of linear operations. The proposed mechanism integrates the concepts of randomized sketching…
Graph Convolutional Networks (GCNs) are extensively utilized for deep learning on graphs. The large data sizes of graphs and their vertex features make scalable training algorithms and distributed memory systems necessary. Since the…