English
Related papers

Related papers: OMI4papps: Optimisation, Modelling and Implementat…

200 papers

Monolithic Active Pixel Sensors (MAPS) combine the sensing part and the front-end electronics in the same silicon layer, making use of CMOS technology. Profiting from the progresses of this commercial process, MAPS have been undergoing…

Instrumentation and Detectors · Physics 2024-08-06 Domenico Colella

Exascale computing systems will exhibit high degrees of hierarchical parallelism, with thousands of computing nodes and hundreds of cores per node. Efficiently exploiting hierarchical parallelism is challenging due to load imbalance that…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-07-29 Jonas H. Müller Korndörfer , Ahmed Eleliemy , Ali Mohammed , Florina M. Ciorba

Reactive molecular dynamics simulations are computationally demanding. Reaching spatial and temporal scales where interesting scientific phenomena can be observed requires efficient and scalable implementations on modern hardware. In this…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-06-26 Hasan Metin Aktulga , Christopher Knight , Paul Coffman , Kurt A. O'Hearn , Tzu-Ray Shan , Wei Jiang

Nowadays, latency-critical, high-performance applications are parallelized even on power-constrained client systems to improve performance. However, an important scenario of fine-grained tasking on simultaneous multithreading CPU cores in…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-10-03 Denis Los , Igor Petushkov

The Intel Xeon Phi manycore processor is designed to provide high performance matrix computations of the type often performed in data analysis. Common data analysis environments include Matlab, GNU Octave, Julia, Python, and R. Achieving…

Efficient implementations of the classical molecular dynamics (MD) method for Lennard-Jones particle systems are considered. Not only general algorithms but also techniques that are efficient for some specific CPU architectures are also…

Statistical Mechanics · Physics 2015-03-17 H. Watanabe , M. Suzuki , N. Ito

Heterogeneous computing is emerging as a mandatory requirement for power-efficient system design. With this aim, modern heterogeneous platforms like Zynq All-Programmable SoC, that integrates ARM-based SMP and programmable logic, have been…

Distributed, Parallel, and Cluster Computing · Computer Science 2015-08-28 Daniel Jiménez-González , Carlos Álvarez , Antonio Filgueras , Xavier Martorell , Jan Langer , Juanjo Noguera , Kees Vissers

Finding the sparsest solution to the underdetermined system $\mathbf{y}=\mathbf{Ax}$, given a tolerance, is known to be NP-hard. Many approximate solutions to this problem exist, and Orthogonal Matching Pursuit (OMP) is one of the most…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-04-01 Ariel Lubonja , Sebastian Kazmarek Praesius , Trac Duy Tran

This paper presents implementation details and empirical results for a hybrid message passing and shared memory paralleliziation of the adaptive integral method (AIM). AIM is implemented on a (near) petaflop supercomputing cluster of…

Computational Engineering, Finance, and Science · Computer Science 2010-10-08 Fangzhou Wei , Ali E. Yılmaz

Mixed-precision algorithms have been proposed as a way for scientific computing to benefit from some of the gains seen for artificial intelligence (AI) on recent high performance computing (HPC) platforms. A few applications dominated by…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-07-16 Aditya Kashi , Nicholson Koukpaizan , Hao Lu , Michael Matheson , Sarp Oral , Feiyi Wang

Kernel matrix-vector product is ubiquitous in many science and engineering applications. However, a naive method requires $O(N^2)$ operations, which becomes prohibitive for large-scale problems. We introduce a parallel method that provably…

Mathematical Software · Computer Science 2021-04-30 Ruoxi Wang , Chao Chen , Jonghyun Lee , Eric Darve

GPUs and other accelerators are increasingly used for scientific computing. In the future, we want to add GPU support to parallel adaptive mesh refinement (AMR) codes written in Fortran. To understand which changes are necessary to obtain…

Recent advances in machine learning force fields (MLFF) have significantly extended the reach of atomistic simulations. Continuous progress in this field requires reliable reference datasets, accurate MLFF architectures, and efficient…

Chemical Physics · Physics 2025-10-24 Tobias Henkes , Shubham Sharma , Alexandre Tkatchenko , Mariana Rossi , Igor Poltavskyi

As fusion energy devices advance, plasma simulations are crucial for reactor design. Our work extends BIT1 hybrid parallelization by integrating MPI with OpenMP and OpenACC, focusing on asynchronous multi-GPU programming. Results show…

Heterogeneous computing is becoming mainstream in all scopes. This new era in computer architecture brings a new paradigm called Accelerator Level Parallelism (ALP). In ALP, accelerators are used concurrently to provide unprecedented levels…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-09-22 Pablo Antonio Martínez , Gregorio Bernabé , Jose Manuel García

Machine-learned interatomic potentials can offer near first-principles accuracy but are computationally expensive, limiting their application to large-scale molecular dynamics simulations. Inspired by quantum mechanics/molecular mechanics…

Materials Science · Physics 2025-11-21 Fraser Birks , Matthew Nutter , Thomas D Swinburne , James R Kermode

The trend towards highly parallel multi-processing is ubiquitous in all modern computer architectures, ranging from handheld devices to large-scale HPC systems; yet many applications are struggling to fully utilise the multiple levels of…

Distributed, Parallel, and Cluster Computing · Computer Science 2013-07-19 Michael Lange , Gerard Gorman , Michele Weiland , Lawrence Mitchell , Xiaohu Guo , James Southern

Scientific computing in the exascale era demands increased computational power to solve complex problems across various domains. With the rise of heterogeneous computing architectures the need for vendor-agnostic, performance portability…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-11-05 Johansell Villalobos , Josef Ruzicka , Silvio Rizzi

The Simplex tableau has been broadly used and investigated in the industry and academia. With the advent of the big data era, ever larger problems are posed to be solved in ever larger machines whose architecture type did not exist in the…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-05-29 Demetrios Coutinho , Felipe O. Lins e Silva , Daniel Aloise , Samuel , Xavier-de-Souza

Designing and implementing efficient, provably correct parallel machine learning (ML) algorithms is challenging. Existing high-level parallel abstractions like MapReduce are insufficiently expressive while low-level tools like MPI and…

Machine Learning · Computer Science 2010-06-28 Yucheng Low , Joseph Gonzalez , Aapo Kyrola , Danny Bickson , Carlos Guestrin , Joseph M. Hellerstein
‹ Prev 1 2 3 10 Next ›