Related papers: CppSs -- a C++ Library for Efficient Task Parallel…

Cppless: Single-Source and High-Performance Serverless Programming in C++

The rise of serverless computing introduced a new class of scalable, elastic and widely available parallel workers in the cloud. Many systems and applications benefit from offloading computations and parallel tasks to dynamically allocated…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-09-23 Marcin Copik, Lukas Möller, Alexandru Calotoiu, Torsten Hoefler

Automatic task-based parallelization of C++ applications by source-to-source transformations

Currently, multi/many-core CPUs are considered standard in most types of computers, including mobile phones, PCs, and supercomputers. However, the parallelization of applications as well as refactoring/design of applications for efficient…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-05-25 Garip Kusoglu, Berenger Bramas, Stephane Genaud

Specx: a C++ task-based runtime system for heterogeneous distributed architectures

Parallelization is needed everywhere, from laptops and mobile phones to supercomputers. Among parallel programming models, task-based programming has demonstrated a powerful potential and is widely used in high-performance scientific…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-11-18 Paul Cardosi, Bérenger Bramas

Shared memory parallelism in Modern C++ and HPX

Parallel programming remains a daunting challenge, from the struggle to express a parallel algorithm without cluttering the underlying synchronous logic, to describing which devices to employ in a calculation, to correctness. Over the…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-08-10 Patrick Diehl, Steven R. Brandt, Hartmut Kaiser

TaPS: A Performance Evaluation Suite for Task-based Execution Frameworks

Task-based execution frameworks, such as parallel programming libraries, computational workflow systems, and function-as-a-service platforms, enable the composition of distinct tasks into a single, unified application designed to achieve a…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-08-15 J. Gregory Pauloski, Valerie Hayot-Sasson, Maxime Gonthier, Nathaniel Hudson, Haochen Pan, Sicheng Zhou, Ian Foster, Kyle Chard

VLCs: Managing Parallelism with Virtualized Libraries

As the complexity and scale of modern parallel machines continue to grow, programmers increasingly rely on composition of software libraries to encapsulate and exploit parallelism. However, many libraries are not designed with composition…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-05-26 Yineng Yan, William Ruys, Hochan Lee, Ian Henriksen, Arthur Peters, Sean Stephens, Bozhi You, Henrique Fingler, Martin Burtscher, Milos Gligoric, Keshav Pingali, Mattan Erez, George Biros, Christopher J. Rossbach

DASH: A C++ PGAS Library for Distributed Data Structures and Parallel Algorithms

We present DASH, a C++ template library that offers distributed data structures and parallel algorithms and implements a compiler-free PGAS (partitioned global address space) approach. DASH offers many productivity and performance features…

Distributed, Parallel, and Cluster Computing · Computer Science 2016-10-06 Karl Fürlinger, Tobias Fuchs, Roger Kowalewski

ensmallen: a flexible C++ library for efficient function optimization

We present ensmallen, a fast and flexible C++ library for mathematical optimization of arbitrary user-supplied functions, which can be applied to many machine learning problems. Several types of optimizations are supported, including…

Mathematical Software · Computer Science 2018-12-11 Shikhar Bhardwaj, Ryan R. Curtin, Marcus Edel, Yannis Mentekidis, Conrad Sanderson

Pipeflow: An Efficient Task-Parallel Pipeline Programming Framework using Modern C++

Pipeline is a fundamental parallel programming pattern. Mainstream pipeline programming frameworks count on data abstractions to perform pipeline scheduling. This design is convenient for data-centric pipeline applications but inefficient…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-02-03 Cheng-Hsiang Chiu, Tsung-Wei Huang, Zizheng Guo, Yibo Lin

Extended Abstract: Productive Parallel Programming with Parsl

Parsl is a parallel programming library for Python that aims to make it easy to specify parallelism in programs and to realize that parallelism on arbitrary parallel and distributed computing systems. Parsl relies on developers annotating…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-05-05 Kyle Chard, Yadu Babuji, Anna Woodard, Ben Clifford, Zhuozhao Li, Mihael Hategan, Ian Foster, Mike Wilde, Daniel S. Katz

Chunks and Tasks: a programming model for parallelization of dynamic algorithms

We propose Chunks and Tasks, a parallel programming model built on abstractions for both data and work. The application programmer specifies how data and work can be split into smaller pieces, chunks and tasks, respectively. The Chunks and…

Distributed, Parallel, and Cluster Computing · Computer Science 2014-07-29 Emanuel H. Rubensson, Elias Rudberg

Concurrent CPU-GPU Task Programming using Modern C++

In this paper, we introduce Heteroflow, a new C++ library to help developers quickly write parallel CPU-GPU programs using task dependency graphs. Heteroflow leverages the power of modern C++ and task-based approaches to enable efficient…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-03-17 Tsung-Wei Huang, Yibo Lin

ClangJIT: Enhancing C++ with Just-in-Time Compilation

The C++ programming language is not only a keystone of the high-performance-computing ecosystem but has proven to be a successful base for portable parallel-programming frameworks. As is well known, C++ programmers use templates to…

Programming Languages · Computer Science 2019-04-30 Hal Finkel, David Poliakoff, David F. Richards

QuickSched: Task-based parallelism with dependencies and conflicts

This paper describes QuickSched, a compact and efficient Open-Source C-language library for task-based shared-memory parallel programming. QuickSched extends the standard dependency-only scheme of task-based programming with the concept of…

Distributed, Parallel, and Cluster Computing · Computer Science 2016-01-21 Pedro Gonnet, Aidan B. G. Chalk, Matthieu Schaller

Flexible numerical optimization with ensmallen

This report provides an introduction to the ensmallen numerical optimization library, as well as a deep dive into the technical details of how it works. The library provides a fast and flexible C++ framework for mathematical optimization of…

Mathematical Software · Computer Science 2023-11-16 Ryan R. Curtin, Marcus Edel, Rahul Ganesh Prabhu, Suryoday Basak, Zhihao Lou, Conrad Sanderson

Continuation-Passing C: compiling threads to events through continuations

In this paper, we introduce Continuation Passing C (CPC), a programming language for concurrent systems in which native and cooperative threads are unified and presented to the programmer as a single abstraction. The CPC compiler uses a…

Programming Languages · Computer Science 2012-11-15 Gabriel Kerneis, Juliusz Chroboczek

The Stan Math Library: Reverse-Mode Automatic Differentiation in C++

As computational challenges in optimization and statistical inference grow ever harder, algorithms that utilize derivatives are becoming increasingly important. The implementation of the derivatives that make these algorithms so…

Mathematical Software · Computer Science 2015-09-25 Bob Carpenter, Matthew D. Hoffman, Marcus Brubaker, Daniel Lee, Peter Li, Michael Betancourt

Extending High-Level Synthesis for Task-Parallel Programs

C/C++/OpenCL-based high-level synthesis (HLS) has become increasingly popular for field-programmable gate array (FPGA) accelerators in many application domains in recent years, thanks to its competitive quality of results (QoR) and short…

Hardware Architecture · Computer Science 2021-05-07 Yuze Chi, Licheng Guo, Jason Lau, Young-kyu Choi, Jie Wang, Jason Cong

A User-Friendly Hybrid Sparse Matrix Class in C++

When implementing functionality which requires sparse matrices, there are numerous storage formats to choose from, each with advantages and disadvantages. To achieve good performance, several formats may need to be used in one program,…

Mathematical Software · Computer Science 2019-10-22 Conrad Sanderson, Ryan Curtin

SplineLib: A Modern Multi-Purpose C++ Spline Library

This paper provides the description of a novel, multi-purpose spline library. In accordance with the increasingly diverse modes of usage of splines, it is multi-purpose in the sense that it supports geometry representation, finite element…

Mathematical Software · Computer Science 2020-02-28 Markus Frings, Norbert Hosters, Corinna Müller, Max Spahn, Christoph Susen, Konstantin Key, Stefanie Elgeti