Related papers: Optimizing Large-Scale ODE Simulations

200 papers

The task of integrating a large number of independent ODE systems arises in various scientific and engineering areas. For nonstiff systems, common explicit integration algorithms can be used on GPUs, where individual GPU threads…

Mathematical Software · Computer Science 2016-11-09 Kyle E Niemeyer , Chih-Jen Sung

Quantum circuit simulation is crucial for the development of quantum algorithms, particularly given the high cost and noise limitations of physical quantum hardware. While full-state quantum circuit simulation is commonly employed for…

Quantum Physics · Physics 2026-04-15 Chuan-Chi Wang , Yan-Jie Wang , Chia-Heng Tu , Shih-Hao Hung

Efficient ordinary differential equation solvers for chemical kinetics must take into account the available thread and instruction-level parallelism of the underlying hardware, especially on many-core coprocessors, as well as the numerical…

Computational Physics · Physics 2018-03-28 Christopher P. Stone , Andrew T. Alferman , Kyle E. Niemeyer

Mixed-precision methods combine low and high precision arithmetics to exploit low precision computational speed and high precision accuracy. Large ODE systems that contain many heterogeneous interactions lead to a high computational cost…

Numerical Analysis · Mathematics 2026-05-25 Mouhamad Al-Sayed , Samuel Bernard , Arsène Marzorati , Jonathan Rouzaud-Cornabas

The emergence of Big Data in recent years has resulted in a growing need for efficient data processing solutions. While infrastructures with sufficient compute power are available, the I/O bottleneck remains. The Linux page cache is an…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-01-06 Hoang-Dung Do , Valerie Hayot-Sasson , Rafael Ferreira da Silva , Christopher Steele , Henri Casanova , Tristan Glatard

Quantum computing has garnered attention for its potential to solve complex computational problems with considerable speedup. Despite notable advancements in the field, achieving meaningful scalability and noise control in quantum hardware…

Quantum Physics · Physics 2025-05-12 Eduardo Willwock Lussi , Rafael de Santiago , Eduardo Inacio Duzzioni

The classical simulation of quantum algorithms is a crucial tool for circuit development, testing, and validation. Although acceleration using GPUs significantly reduces simulation time, most high-performance simulators rely on…

Two factors that affect simulation quality are the amount of computing power and the implementation. The Streaming SIMD (single instruction multiple data) extensions (SSE) present a technique for influencing both by exploiting the processor's…

Computational Engineering, Finance, and Science · Computer Science 2013-09-04 Shyam Srinivasan

A general-purpose, modular program package for the integration of a large number of independent ordinary differential equation systems capable of using professional graphics cards is presented. The available numerical schemes are the explicit…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-10-10 Ferenc Hegedűs

We propose an online auto-tuning approach for computing kernels. Unlike existing online auto-tuners, which regenerate code with long compilation chains from the source to the binary code, our approach consists of deploying…

Performance · Computer Science 2017-07-17 Fernando Endo , Damien Couroussé , Henri-Pierre Charles

CUDA kernel optimization has become a critical bottleneck for AI performance, as deep learning training and inference efficiency directly depends on highly optimized GPU kernels. Despite the promise of Large Language Models (LLMs) for…

Machine Learning · Computer Science 2025-10-07 Ping Guo , Chenyu Zhu , Siyuan Chen , Fei Liu , Xi Lin , Zhichao Lu , Qingfu Zhang

With the growing number of data-intensive workloads, GPU, which is the state-of-the-art single-instruction-multiple-thread (SIMT) processor, is hindered by the memory bandwidth wall. To alleviate this bottleneck, previously proposed…

Hardware Architecture · Computer Science 2021-03-12 Xinfeng Xie , Peng Gu , Yufei Ding , Dimin Niu , Hongzhong Zheng , Yuan Xie

Quantum computers are becoming practical for computing numerous applications. However, simulating quantum computing on classical computers is still demanding yet useful because current quantum computers are limited because of computer…

Quantum Physics · Physics 2023-08-08 Jun Doi , Hiroshi Horii , Christopher Wood

There exist many Runge-Kutta methods (explicit or implicit), more or less adapted to specific problems. Some of them have interesting properties, such as stability for stiff problems or symplectic capability for problems with energy…

Numerical Analysis · Mathematics 2018-04-16 Julien Alexandre dit Sandretto

The supercomputing platforms available for high performance computing based research evolve at a great rate. However, this rapid development of novel technologies requires constant adaptations and optimizations of the existing codes for…

High Energy Physics - Lattice · Physics 2017-02-23 Marina Krstic Marinkovic , Luka Stanisic

We focus on implementing and optimizing a sixth-order finite-difference solver for simulating compressible fluids on a GPU using third-order Runge-Kutta integration. Since graphics processing units perform well in data-parallel tasks, this…

Computational Physics · Physics 2017-07-28 Johannes Pekkilä , Miikka S. Väisälä , Maarit J. Käpylä , Petri J. Käpylä , Omer Anjum

Simulation code for conventional supercomputers serves as a reference for neuromorphic computing systems. The present bottleneck of distributed large-scale spiking neuronal network simulations is the communication between compute nodes.…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-02-27 Melissa Lober , Markus Diesmann , Susanne Kunkel

One of the main purposes of discrete event simulators such as OMNeT++ is to test new algorithms or protocols in realistic environments. These often need to be benchmarked against optimal/theoretical results obtained by running commercial…

Networking and Internet Architecture · Computer Science 2015-09-14 Antonio Virdis

Scaling test-time compute has emerged as an effective strategy for improving the performance of large language models. However, existing methods typically allocate compute uniformly across all queries, overlooking variation in query…

Artificial Intelligence · Computer Science 2026-04-24 Bowen Zuo , Yinglun Zhu

As recurrent neural networks become larger and deeper, training times for single networks are rising into weeks or even months. As such there is a significant incentive to improve the performance and scalability of these networks. While…

Machine Learning · Computer Science 2016-04-08 Jeremy Appleyard , Tomas Kocisky , Phil Blunsom