Related papers: High Performance Dataframes from Parallel Processi…

200 papers

The Data Science domain has expanded monumentally in both research and industry communities during the past decade, predominantly owing to the Big Data revolution. Artificial Intelligence (AI) and Machine Learning (ML) are bringing more…

The data engineering and data science community has embraced the idea of using Python & R dataframes for regular applications. Driven by the big data revolution and artificial intelligence, these applications are now essential in order to…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-01-20 Niranda Perera , Kaiying Shan , Supun Kamburugamuve , Thejaka Amila Kanewela , Chathura Widanage , Arup Sarker , Mills Staylor , Tianle Zhong , Vibhatha Abeykoon , Geoffrey Fox

The amazing advances being made in the fields of machine and deep learning are a highlight of the Big Data era for both enterprise and research communities. Modern applications require resources beyond a single node's ability to provide.…

Data pre-processing is a fundamental component in any data-driven application. With the increasing complexity of data processing operations and volume of data, Cylon, a distributed dataframe system, is developed to facilitate data…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-01-02 Kaiying Shan , Niranda Perera , Damitha Lenadora , Tianle Zhong , Arup Sarker , Supun Kamburugamuve , Thejaka Amila Kanewela , Chathura Widanage , Geoffrey Fox

In the current era of Big Data, data engineering has transformed into an essential field of study across many branches of science. Advancements in Artificial Intelligence (AI) have broadened the scope of data engineering and opened up new…

Dataframes are a popular abstraction to represent, prepare, and analyze data. Despite the remarkable success of dataframe libraries in R and Python, dataframes face performance issues even on moderately large datasets. Moreover, there is…

Data is found everywhere, from health and human infrastructure to the surge of sensors and the proliferation of internet-connected devices. To meet this challenge, the data engineering field has expanded significantly in recent years in…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-03-06 Mills Staylor , Arup Kumar Sarker , Gregor von Laszewski , Geoffrey Fox , Yue Cheng , Judy Fox

Parallel and distributed processing is becoming the de facto industry standard, and a large part of current research targets how to make computing scalable and distributed, dynamically, without allocating the resources on…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-04-10 Rajendra Purohit , K R Chowdhary , S D Purohit

The scaling of large language models (LLMs) is currently bottlenecked by the rigidity of distributed programming. While high-performance libraries like CuBLAS and NCCL provide optimized primitives, they lack the flexibility required for…

Shared memory multiprocessors have come back to popularity thanks to the rapid spread of commodity multi-core architectures. As ever, shared memory programs are fairly easy to write and quite hard to optimise; providing multi-core programmers…

Distributed, Parallel, and Cluster Computing · Computer Science 2009-09-10 Marco Aldinucci , Massimo Torquati , Massimiliano Meneghin

The arrival of multicore systems has created a new scenario in computing: parallel and distributed algorithms are fast replacing the older sequential algorithms, and these techniques bring many challenges. The distributed algorithms provide…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-11-13 Rajendra Purohit , K R Chowdhary , S D Purohit

High-performance computing (HPC) applications are increasingly executed in heterogeneous environments, introducing new challenges for programming and software portability. SYCL has emerged as a leading model designed to simplify…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-04-20 Ami Marowka

Data centers have become the center of big data processing. Most programs running in a data center process big data. The storage requirements of such programs cannot be fulfilled by a single node in the data center, and hence a distributed…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-08-28 Sandeep Kumar

Although recent scaling up approaches to training deep neural networks have proven to be effective, the computational intensity of large and complex models, as well as the availability of large-scale datasets, require deep learning…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-04-21 Bita Hasheminezhad , Shahrzad Shirzad , Nanmiao Wu , Patrick Diehl , Hannes Schulz , Hartmut Kaiser

Applications are increasingly written as dynamic workflows underpinned by an execution framework that manages asynchronous computations across distributed hardware. However, execution frameworks typically offer one-size-fits-all solutions…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-10-18 J. Gregory Pauloski , Klaudiusz Rydzy , Valerie Hayot-Sasson , Ian Foster , Kyle Chard

Image- and data-parallel rendering across multiple nodes on high-performance computing systems is widely used in visualization to provide higher frame rates, support large data sets, and render data in situ. Specifically for in situ…

Graphics · Computer Science 2023-05-15 Will Usher , Ingo Wald , Jefferson Amstutz , Johannes Günther , Carson Brownlee , Valerio Pascucci

Many of the most performant deep learning models today in fields like language and image understanding are fine-tuned models that contain billions of parameters. In anticipation of workloads that involve serving many such large models to…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-06-27 Daniel Zou , Xinchen Jin , Xueyang Yu , Hao Zhang , James Demmel

Despite advancements in the areas of parallel and distributed computing, the complexity of programming on High Performance Computing (HPC) resources has deterred many domain experts, especially in the areas of machine learning and…

High level programming languages and GPU accelerators are powerful enablers for a wide range of applications. Achieving scalable vertical (within a compute node), horizontal (across compute nodes), and temporal (over different generations…

While deep learning excels in natural image and language processing, its application to high-dimensional data faces computational challenges due to the dimensionality curse. Current large-scale data tools focus on business-oriented…

Machine Learning · Computer Science 2025-07-01 Chen Zhang