pPython for Parallel Python Programming
Abstract
pPython seeks to provide a parallel capability that delivers good speed-up without sacrificing the ease of programming in Python. It does so by implementing partitioned global array semantics (PGAS) on top of a simple file-based messaging library (PythonMPI) in pure Python. The core data structure in pPython is a distributed numerical array whose distribution onto multiple processors is specified with a map construct. Communication operations between distributed arrays are abstracted away from the user, and pPython transparently supports redistribution between any block-cyclic-overlapped distributions in up to four dimensions. pPython follows an SPMD (single program, multiple data) model of computation. pPython runs on any combination of heterogeneous systems that support Python, including Windows, Linux, and macOS. In addition to running transparently on a single node (e.g., a laptop), pPython provides a scheduler interface so that it can be executed in a massively parallel computing environment. The initial implementation uses the Slurm scheduler. Performance of pPython on the HPC Challenge benchmark suite demonstrates both ease of programming and scalability.
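The idea of a map that assigns array elements to processors, and of redistributing between two such maps, can be illustrated with a minimal sketch in plain Python. This is not pPython's API: the function names (`block_cyclic_owner`, `local_indices`, `redistribution_plan`) are hypothetical, and the sketch assumes a one-dimensional array with no overlap. It only shows which rank would own each global index under a block-cyclic distribution and which indices would have to move when the block size changes.

```python
# Illustrative sketch only: hypothetical helper names, not the pPython API.
# Assumes a 1-D array, no overlap, and ranks numbered 0..num_procs-1.

def block_cyclic_owner(index, num_procs, block_size):
    """Return the rank that owns a global index under a 1-D block-cyclic map."""
    return (index // block_size) % num_procs


def local_indices(rank, n, num_procs, block_size):
    """List the global indices of an n-element array held by one rank."""
    return [i for i in range(n)
            if block_cyclic_owner(i, num_procs, block_size) == rank]


def redistribution_plan(n, num_procs, src_block, dst_block):
    """Map (source rank, destination rank) to the global indices that must move
    when the block size changes from src_block to dst_block."""
    plan = {}
    for i in range(n):
        src = block_cyclic_owner(i, num_procs, src_block)
        dst = block_cyclic_owner(i, num_procs, dst_block)
        if src != dst:  # elements that stay on the same rank need no message
            plan.setdefault((src, dst), []).append(i)
    return plan


if __name__ == "__main__":
    # Every rank runs the same program (SPMD); here we simply loop over ranks.
    for rank in range(2):
        print(rank, local_indices(rank, n=8, num_procs=2, block_size=2))
    # Going from a block map (block size 4) to a cyclic map (block size 1).
    print(redistribution_plan(n=8, num_procs=2, src_block=4, dst_block=1))
```

In a PGAS library, a plan of this kind would be computed internally and turned into messages, so the user only writes an assignment between two distributed arrays.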
Cite
@article{arxiv.2208.14908,
title = {pPython for Parallel Python Programming},
author = {Chansup Byun and William Arcand and David Bestor and Bill Bergeron and Vijay Gadepally and Michael Houle and Matthew Hubbell and Hayden Jananthan and Michael Jones and Kurt Keville and Anna Klein and Peter Michaleas and Lauren Milechin and Guillermo Morales and Julie Mullen and Andrew Prout and Albert Reuther and Antonio Rosa and Siddharth Samsi and Charles Yee and Jeremy Kepner},
journal = {arXiv preprint arXiv:2208.14908},
year = {2022}
}
Comments
arXiv admin note: substantial text overlap with arXiv:astro-ph/0606464