Related papers: Accelerating Python Applications with Dask and Pro…

200 papers

The data engineering and data science community has embraced the idea of using Python & R dataframes for regular applications. Driven by the big data revolution and artificial intelligence, these applications are now essential in order to…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-01-20 Niranda Perera, Kaiying Shan, Supun Kamburugamuwe, Thejaka Amila Kanewela, Chathura Widanage, Arup Sarker, Mills Staylor, Tianle Zhong, Vibhatha Abeykoon, Geoffrey Fox

Advances in networks, accelerators, and cloud services encourage programmers to reconsider where to compute -- such as when fast networks make it cost-effective to compute on remote accelerators despite added latency. Workflow and…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-08-31 J. Gregory Pauloski, Valerie Hayot-Sasson, Logan Ward, Nathaniel Hudson, Charlie Sabino, Matt Baughman, Kyle Chard, Ian Foster

Workflow and serverless frameworks have empowered new approaches to distributed application design by abstracting compute resources. However, their typically limited or one-size-fits-all support for advanced data flow patterns leaves…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-12-03 J. Gregory Pauloski, Valerie Hayot-Sasson, Logan Ward, Alexander Brace, André Bauer, Kyle Chard, Ian Foster

Despite advancements in the areas of parallel and distributed computing, the complexity of programming on High Performance Computing (HPC) resources has deterred many domain experts, especially in the areas of machine learning and…

In today's world data is being generated at a high rate due to which it has become inevitable to analyze and quickly get results from this data. Most of the relational databases primarily support SQL querying with a limited support for…

Databases · Computer Science 2021-04-08 Alex Watson, Suvam Kumar Das, Suprio Ray

Distributed Asynchronous Object Store (DAOS) is a novel software-defined object store leveraging Non-Volatile Memory (NVM) devices, designed for high performance. It provides a number of interfaces for applications to undertake I/O, ranging…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-09-30 Nicolau Manubens, Johann Lombardi, Simon D. Smart, Emanuele Danovaro, Tiago Quintino, Dean Hildebrand, Adrian Jackson

Emerging Deep Learning (DL) applications introduce heavy I/O workloads on computer clusters. The inherent long lasting, repeated, and random file access pattern can easily saturate the metadata and data service and negatively impact other…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-10-01 Zhao Zhang, Lei Huang, Uri Manor, Linjing Fang, Gabriele Merlo, Craig Michoski, John Cazes, Niall Gaffney

The data science community today has embraced the concept of Dataframes as the de facto standard for data representation and manipulation. Ease of use, massive operator coverage, and popularization of R and Python languages have heavily…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-07-06 Niranda Perera, Supun Kamburugamuve, Chathura Widanage, Vibhatha Abeykoon, Ahmet Uyar, Kaiying Shan, Hasara Maithree, Damitha Lenadora, Thejaka Amila Kanewala, Geoffrey Fox

Although recent scaling up approaches to training deep neural networks have proven to be effective, the computational intensity of large and complex models, as well as the availability of large-scale datasets, require deep learning…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-04-21 Bita Hasheminezhad, Shahrzad Shirzad, Nanmiao Wu, Patrick Diehl, Hannes Schulz, Hartmut Kaiser

The amazing advances being made in the fields of machine and deep learning are a highlight of the Big Data era for both enterprise and research communities. Modern applications require resources beyond a single node's ability to provide.…

Python is rapidly becoming the lingua franca of machine learning and scientific computing. With the broad use of frameworks such as Numpy, SciPy, and TensorFlow, scientific computing and machine learning are seeing a productivity boost on…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-09-01 Zane Fink, Simeng Liu, Jaemin Choi, Matthias Diener, Laxmikant V. Kale

The exponential growth in smart sensors and rapid progress in 5G networks is creating a world awash with data streams. However, a key barrier to building performant multi-sensor, distributed stream processing applications is high…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-11-10 Giuseppe Coviello, Kunal Rao, Murugan Sankaradas, Srimat Chakradhar

High-level programming languages such as Python are increasingly used to provide intuitive interfaces to libraries written in lower-level languages and for assembling applications from various components. This migration towards…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-05-21 Yadu Babuji, Anna Woodard, Zhuozhao Li, Daniel S. Katz, Ben Clifford, Rohan Kumar, Lukasz Lacinski, Ryan Chard, Justin M. Wozniak, Ian Foster, Michael Wilde, Kyle Chard

We introduce Diffuse, a system that dynamically performs task and kernel fusion in distributed, task-based runtime systems. The key component of Diffuse is an intermediate representation of distributed computation that enables the necessary…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-12-17 Rohan Yadav, Shiv Sundram, Wonchan Lee, Michael Garland, Michael Bauer, Alex Aiken, Fredrik Kjolstad

More and more distributed software systems are being developed and deployed today. Like other software, distributed software systems also need very strong quality assurance support. Distributed software is often very large/complex, has…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-03-08 Xiaoqin Fu

Dask is a popular parallel and distributed computing framework, which rivals Apache Spark to enable task-based scalable processing of big data. The Dask Distributed library forms the basis of this computing engine and provides support for…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-01-25 Aamir Shafi, Jahanzeb Maqbool Hashmi, Hari Subramoni, Dhabaleswar K. Panda

Dask is a distributed task framework which is commonly used by data scientists to parallelize Python code on computing clusters with little programming effort. It uses a sophisticated work-stealing scheduler which has been hand-tuned to…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-01-21 Stanislav Böhm, Jakub Beránek

One of the factors that limits the scale, performance, and sophistication of distributed applications is the difficulty of concurrently executing them on multiple distributed computing resources. In part, this is due to a poor understanding…

Distributed, Parallel, and Cluster Computing · Computer Science 2016-02-22 Matteo Turilli, Feng Liu, Zhao Zhang, Andre Merzky, Michael Wilde, Jon Weissman, Daniel S. Katz, Shantenu Jha

Almost all applications stop scaling at some point; those that don't are seldom performant when considering time to solution on anything but aspirational/unicorn resources. Recognizing these tradeoffs as well as greater user functionality…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-06-28 Stephen Hudson, Jeffrey Larson, John-Luke Navarro, Stefan M. Wild

Access transparency means that both local and remote resources are accessed using identical operations. With transparency, unmodified single-machine applications could run over disaggregated compute, storage, and memory resources. Hiding…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-11-23 Aitor Arjona, Gerard Finol, Pedro Garcia-Lopez