Related papers: Data Management System Analysis for Distributed Co…

200 papers

The Production and Distributed Analysis (PanDA) system, originally developed for the ATLAS experiment at the CERN Large Hadron Collider (LHC), has evolved into a robust platform for orchestrating large-scale workflows across distributed…

Monitoring of the large-scale data processing of the ATLAS experiment includes monitoring of production and user analysis jobs. The Experiment Dashboard provides a common job monitoring solution, which is shared by ATLAS and CMS…

Instrumentation and Detectors · Physics 2019-05-31 L Sargsyan , J Andreeva , S Campana , E Karavakis , L Kokoszkiewicz , P Saiz , J Schovancova , D Tuckett

The intelligent Distributed Dispatch and Scheduling (iDDS) service is a versatile workflow orchestration system designed for large-scale, distributed scientific computing. iDDS extends traditional workload and data management by integrating…

Asynchronous methods are fundamental for parallelizing computations in distributed machine learning. They aim to accelerate training by fully utilizing all available resources. However, their greedy approach can lead to inefficiencies using…

Machine Learning · Computer Science 2025-05-23 Artavazd Maranjyan , El Mehdi Saad , Peter Richtárik , Francesco Orabona

Distributed dataflow systems like Apache Flink and Apache Spark simplify processing large amounts of data on clusters in a data-parallel manner. However, choosing suitable cluster resources for distributed dataflow jobs in both type and…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-03-14 Jonathan Will , Onur Arslan , Jonathan Bader , Dominik Scheinert , Lauritz Thamsen

Many organizations routinely analyze large datasets using systems for distributed data-parallel processing and clusters of commodity resources. Yet, users need to configure adequate resources for their data processing jobs. This requires…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-06-02 Lauritz Thamsen , Dominik Scheinert , Jonathan Will , Jonathan Bader , Odej Kao

Analyzing large datasets with distributed dataflow systems requires the use of clusters. Public cloud providers offer a large variety and quantity of resources that can be used for such clusters. However, picking the appropriate resources…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-04-28 Jonathan Will , Jonathan Bader , Lauritz Thamsen

Rucio is an open-source software framework that provides scientific collaborations with the functionality to organize, manage, and access their data at scale. The data can be distributed across heterogeneous data centers at widely…

Large-scale scientific collaborations like ATLAS, Belle II, CMS, DUNE, and others involve hundreds of research institutes and thousands of researchers spread across the globe. These experiments generate petabytes of data, with volumes soon…

Distributed dataflow systems enable data-parallel processing of large datasets on clusters. Public cloud providers offer a large variety and quantity of resources that can be used for such clusters. Yet, selecting appropriate cloud…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-12-03 Jonathan Will , Lauritz Thamsen , Dominik Scheinert , Jonathan Bader , Odej Kao

The dynamic nature of resource allocation and runtime conditions on Cloud can result in high variability in a job's runtime across multiple iterations, leading to a poor experience. Identifying the sources of such variation and being able…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-04-10 Yiwen Zhu , Rathijit Sen , Robert Horton , John Mark Agosta

Contemporary Distributed Computing Systems (DCS) such as Cloud Data Centres are large scale, complex, heterogeneous, and distributed across multiple networks and geographical boundaries. On the other hand, the Internet of Things…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-11-10 Shashikant Ilager , Rajeev Muralidhar , Rajkumar Buyya

Emerging smart grid applications analyze large amounts of data collected from millions of meters and systems to facilitate distributed monitoring and real-time control tasks. However, current parallel data processing systems are designed…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-02-03 Binquan Guo , Hongyan Li , Ye Yan , Zhou Zhang , Peng Wang

The integration of large language models (LLMs) with external tools has significantly expanded the capabilities of AI agents. However, as the diversity of both LLMs and tools increases, selecting the optimal model-tool combination becomes a…

Computation and Language · Computer Science 2026-01-08 Jinyang Wu , Guocheng Zhai , Ruihan Jin , Jiahao Yuan , Yuhao Shen , Shuai Zhang , Zhengqi Wen , Jianhua Tao

Distributed resource allocation (DRA) is fundamental to modern networked systems, spanning applications from economic dispatch in smart grids to CPU scheduling in data centers. Conventional DRA approaches require reliable communication, yet…

Systems and Control · Electrical Eng. & Systems 2025-10-22 Mohammadreza Doostmohammadian , Sergio Pequito

In the past decade, an increasing number of network scheduling techniques have been proposed to boost distributed application performance. Flow-level metrics, such as flow completion time (FCT), are based on the abstraction of flows yet they…

Networking and Internet Architecture · Computer Science 2019-01-18 Jiawei Fei , Yang Shi , Qun Huang , Mei Wen

According to the pay-per-use model adopted in clouds, the more the resources consumed by an application running in a cloud computing environment, the greater the amount of money the owner of the corresponding application will be charged.…

Distributed, Parallel, and Cluster Computing · Computer Science 2012-06-28 Nikos Tziritas , Samee Ullah Khan , Cheng-Zhong Xu , Jue Hong

Geo-distributed analytics (GDA) frameworks transfer large datasets over the wide-area network (WAN). Yet existing frameworks often ignore the WAN topology. This disconnect between WAN-bound applications and the WAN itself results in missed…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-04-19 Jie You , Mosharaf Chowdhury

ATLAS event data processing requires access to non-event data (detector conditions, calibrations, etc.) stored in relational databases. The database-resident data are crucial for the event data reconstruction processing steps and often…

Instrumentation and Detectors · Physics 2019-08-13 A. Vaniachine

Modern data workflows are inherently adaptive, repeatedly querying the same dataset to refine and validate sequential decisions, but such adaptivity can lead to overfitting and invalid statistical inference. Adaptive Data Analysis (ADA)…

Machine Learning · Computer Science 2026-02-10 Joon Suk Huh