Related papers: Efficiently Reproducing Distributed Workflows in N…

200 papers

Computational notebooks are notoriously prone to reproducibility failures. By permitting out-of-order cell execution, notebooks accumulate hidden state and implicit dependencies that cause interactive executions to silently diverge from…

Programming Languages · Computer Science · 2026-05-05 · Stephen N. Freund, Emery D. Berger, Cormac Flanagan, Eunice Jun

Reproducibility of computational studies is a hallmark of scientific methodology. It enables researchers to build with confidence on the methods and findings of others, reuse and extend computational pipelines, and thereby drive scientific…

Jupyter notebooks facilitate the bundling of executable code with its documentation and output in one interactive environment, and they represent a popular mechanism to document and share computational workflows. The reproducibility of…

Digital Libraries · Computer Science · 2023-08-16 · Sheeba Samuel, Daniel Mietchen

Computational notebooks have emerged as the platform of choice for data science and analytical workflows, enabling rapid iteration and exploration. By keeping intermediate program state in memory and segmenting units of execution into…

Software Engineering · Computer Science · 2021-06-22 · Stephen Macke, Hongpu Gong, Doris Jung-Lin Lee, Andrew Head, Doris Xin, Aditya Parameswaran

Computational reproducibility is fundamental to trustworthy science, yet remains difficult to achieve in practice across various research workflows, including Jupyter notebooks published alongside scholarly articles. Environment drift,…

Software Engineering · Computer Science · 2026-04-02 · Sheeba Samuel, Daniel Mietchen, Hemanta Lo, Martin Gaedke

Jupyter notebooks represent a unique format for programming - a combination of code and Markdown with rich formatting, separated into individual cells. We propose to perceive a Jupyter Notebook cell as a simplified and raw version of a…

Software Engineering · Computer Science · 2022-01-03 · Sergey Titov, Yaroslav Golubev, Timofey Bryksin

Saving, or checkpointing, intermediate results during interactive data exploration can potentially boost user productivity. However, existing studies on this topic are limited, as they primarily rely on small-scale experiments with human…

Human-Computer Interaction · Computer Science · 2025-04-03 · Hanxi Fang, Supawit Chockchowwat, Hari Sundaram, Yongjoo Park

Scientific workflows are pipelines of interdependent tasks. They are increasingly executed on shared Kubernetes clusters via workflow engines such as Nextflow. Their energy consumption matters for both cost and sustainability. It is…

Distributed, Parallel, and Cluster Computing · Computer Science · 2026-05-22 · Philipp Thamm, Somayeh Mohammadi, Kathleen West, Knut Reinert, Lauritz Thamsen, Ulf Leser

The rising popularity of computational workflows is driven by the need for repetitive and scalable data processing, sharing of processing know-how, and transparent methods. As both combined records of analysis and descriptions of processing…

Using multiple nodes and parallel computing algorithms has become a principal tool to improve training and execution times of deep neural networks as well as effective collective intelligence in sensor networks. In this paper, we consider…

Machine Learning · Computer Science · 2020-08-20 · Afshin Abdi, Saeed Rashidi, Faramarz Fekri, Tushar Krishna

Motivation: The rapid growth of biological data has intensified the need for transparent, reproducible, and well-documented computational workflows. The ability to clearly connect the steps of a workflow in the code with their description…

Computation and Language · Computer Science · 2026-03-10 · Clémence Sebe, Olivier Ferret, Aurélie Névéol, Mahdi Esmailoghli, Ulf Leser, Sarah Cohen-Boulakia

Scientific workflows have been predominantly used for complex and large scale data analysis and scientific computation/automation and the need for robust workflow scheduling techniques has grown considerably. But, most of the existing…

Distributed, Parallel, and Cluster Computing · Computer Science · 2019-11-04 · S. Jaya Nirmala, Amrith Rajagopal Setlur, Har Simrat Singh, Sudhanshu Khoriya

Computational reproducibility refers to obtaining consistent results when rerunning an experiment. Jupyter Notebook, a web-based computational notebook application, facilitates running, publishing, and sharing computational experiments…

Software Engineering · Computer Science · 2025-09-30 · A S M Shahadat Hossain, Colin Brown, David Koop, Tanu Malik

We consider running-time optimization for band-joins in a distributed system, e.g., the cloud. To balance load across worker machines, input has to be partitioned, which causes duplication. We explore how to resolve this tension between…

Databases · Computer Science · 2020-04-15 · Rundong Li, Wolfgang Gatterbauer, Mirek Riedewald

Scientific workflows facilitate computational, data manipulation, and sometimes visualization steps for scientific data analysis. They are vital for reproducing and validating experiments, usually involving computational steps in scientific…

Distributed, Parallel, and Cluster Computing · Computer Science · 2023-11-22 · Jinli Duan, Shasha Dennis

Jupyter Notebooks are an enormously popular tool for creating and narrating computational research projects. They also have enormous potential for creating reproducible scientific research artifacts. Capturing the complete state of a…

Distributed, Parallel, and Cluster Computing · Computer Science · 2022-10-18 · Dimuthu Wannipurage, Suresh Marru, Marlon Pierce

In a new effort to make our research transparent and reproducible by others, we developed a workflow to run and share computational studies on the public cloud Microsoft Azure. It uses Docker containers to create an image of the application…

Computational Engineering, Finance, and Science · Computer Science · 2020-07-24 · Olivier Mesnard, Lorena A. Barba

High Performance Computing is notorious for its long and expensive software development cycle. To address this challenge, we present Bind: a "partitioned global workflow" parallel programming model for C++ applications that enables quick…

Distributed, Parallel, and Cluster Computing · Computer Science · 2016-06-16 · Alex Kosenkov, Matthias Troyer

As Deep Neural Networks (DNNs) have become an increasingly ubiquitous workload, the range of libraries and tooling available to aid in their development and deployment has grown significantly. Scalable, production quality tools are freely…

Machine Learning · Computer Science · 2022-06-22 · Perry Gibson, José Cano

Reproducing executions of multithreaded programs is very challenging due to many intrinsic and external non-deterministic factors. Existing RnR systems achieve significant progress in terms of performance overhead, but none targets the…

Operating Systems · Computer Science · 2018-04-05 · Hongyu Liu, Sam Silvestro, Wei Wang, Chen Tian, Tongping Liu