Related papers: Data processing model for the CDF experiment

Data production of a large Linux PC Farm for the CDF experiment

The data production farm for the CDF experiment is designed and constructed to meet the needs of the Run II data collection at a maximum rate of 20 MByte/sec during the run. The system is composed of a large cluster of personal computers…

High Energy Physics - Experiment · Physics 2007-05-23 J. Antos, M. Babik, A. W. Chan, Y. C. Chen, S. Hou, T. L. Hsieh, R. Lysak, I. V. Mandrichenko, M. Siket, J. Syu, P. K. Teng, S. C. Timm, S. A. Wolbers, P. Yeh

Data production models for the CDF experiment

The data production for the CDF experiment is conducted on a large Linux PC farm designed to meet the needs of data collection at a maximum rate of 40 MByte/sec. We present two data production models that exploit advances in computing and…

Data Analysis, Statistics and Probability · Physics 2007-05-23 J. Antos, M. Babik, D. Benjamin, S. Cabrera, A. W. Chan, Y. C. Chen, M. Coca, B. Cooper, K. Genser, K. Hatakeyama, S. Hou, T. L. Hsieh, B. Jayatilaka, A. C. Kraan, R. Lysak, I. V. Mandrichenko, A. Robson, M. Siket, B. Stelzer, J. Syu, P. K. Teng, S. C. Timm, T. Tomura, E. Vataga, S. A. Wolbers, P. Yeh

Reconfiguration of Distributed Information Fusion System: A case study

Information Fusion Systems are now widely used in different fusion contexts, like scientific processing, sensor networks, video and image processing. One of the current trends in this area is to cope with distributed systems. In this…

Distributed, Parallel, and Cluster Computing · Computer Science 2009-06-26 Eric Benoit, Marc-Philippe Huget, Patrice Moreaux, Olivier Passalacqua

Review of real-time data processing for collider experiments

We review the status of, and prospects for, real-time data processing for collider experiments in experimental High Energy Physics. We discuss the historical evolution of data rates and volumes in the field and place them in the context of…

High Energy Physics - Experiment · Physics 2023-11-20 V. V. Gligorov, V. Reković

Design and optimisation of an efficient HDF5 I/O kernel for massive parallel fluid flow simulations

More and more massive parallel codes running on several hundreds of thousands of cores enter the computational science and engineering domain, allowing high-fidelity computations on up to trillions of unknowns for very detailed analyses of…

Performance · Computer Science 2018-07-18 Christoph Ertl, Jérôme Frisch, Ralf-Peter Mundani

The CMS Computing System: Successes and Challenges

Each LHC experiment will produce datasets with sizes of order one petabyte per year. All of this data must be stored, processed, transferred, simulated and analyzed, which requires a computing system of a larger scale than ever mounted for…

Instrumentation and Detectors · Physics 2009-10-05 Kenneth Bloom

An adaptive parallel processing strategy in complex event processing systems over data streams

Efficient matching of incoming events of data streams to persistent queries is fundamental to event stream processing systems. These applications require dealing with high volume and continuous data streams with fast processing time on…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-06-05 Fuyuan Xiao, Masayoshi Aritsugi

The ALICE Run 3 Online / Offline Processing

The ALICE experiment has undergone a major upgrade for LHC Run 3 and will collect data at an interaction rate 50 times larger than before. The new computing scheme for Run 3 replaces the traditionally separate online and offline frameworks…

Instrumentation and Detectors · Physics 2022-08-17 David Rohr

Data processing and online reconstruction

In the upcoming upgrades for Run 3 and 4, the LHC will significantly increase Pb-Pb and pp interaction rates. This goes along with upgrades of all experiments, ALICE, ATLAS, CMS, and LHCb, related to both the detectors and the computing.…

Instrumentation and Detectors · Physics 2018-11-29 David Rohr

Designing dedicated data compression for physics experiments within FPGA already used for data acquisition

Physics experiments produce an enormous amount of raw data, counted in petabytes per day. Hence, there is a large effort to reduce this amount, mainly by using some filters. The situation can be improved by additionally applying some data…

Information Theory · Computer Science 2015-11-04 Jarek Duda, Grzegorz Korcyl

Extract Dynamic Information To Improve Time Series Modeling: a Case Study with Scientific Workflow

In modeling time series data, we often need to augment the existing data records to increase the modeling accuracy. In this work, we describe a number of techniques to extract dynamic information about the current state of a large…

Machine Learning · Computer Science 2022-05-20 Jeeyung Kim, Mengtian Jin, Youkow Homma, Alex Sim, Wilko Kroeger, Kesheng Wu

Spinning Fast Iterative Data Flows

Parallel dataflow systems are a central part of most analytic pipelines for big data. The iterative nature of many analysis and machine learning algorithms, however, is still a challenge for current systems. While certain types of bulk…

Databases · Computer Science 2012-08-02 Stephan Ewen, Kostas Tzoumas, Moritz Kaufmann, Volker Markl

ALICE data processing for Run 3 and Run 4 at the LHC

During the upcoming Runs 3 and 4 of the LHC, ALICE will take data at a peak Pb-Pb collision rate of 50 kHz. This will be made possible thanks to the upgrade of the main tracking detectors of the experiment, and with a new data processing…

Instrumentation and Detectors · Physics 2020-12-09 Chiara Zampolli

Partitioning Compute Units in CNN Acceleration for Statistical Memory Traffic Shaping

The design complexity of CNNs has been steadily increasing to improve accuracy. To cope with the massive amount of computation needed for such complex CNNs, the latest solutions utilize blocking of an image over the available dimensions and…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-06-19 Daejin Jung, Sunjung Lee, Wonjong Rhee, Jung Ho Ahn

Data refinement for true concurrency

The majority of modern systems exhibit sophisticated concurrent behaviour, where several system components modify and observe the system state with fine-grained atomicity. Many systems (e.g., multi-core processors, real-time controllers)…

Logic in Computer Science · Computer Science 2013-05-28 Brijesh Dongol, John Derrick

Predictive process mining by network of classifiers and clusterers: the PEDF model

In this research, a model is proposed to learn from event log and predict future events of a system. The proposed PEDF model learns based on events' sequences, durations, and extra features. The PEDF model is built by a network made of…

Machine Learning · Computer Science 2020-11-24 Amir Mohammad Esmaieeli Sikaroudi, Md Habibor Rahman

CFS: A Distributed File System for Large Scale Container Platforms

We propose CFS, a distributed file system for large scale container platforms. CFS supports both sequential and random file accesses with optimized storage for both large files and small files, and adopts different replication protocols for…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-11-11 Haifeng Liu, Wei Ding, Yuan Chen, Weilong Guo, Shuoran Liu, Tianpeng Li, Mofei Zhang, Jianxing Zhao, Hongyin Zhu, Zhengyi Zhu

Parallelization in Scientific Workflow Management Systems

Over the last two decades, scientific workflow management systems (SWfMS) have emerged as a means to facilitate the design, execution, and monitoring of reusable scientific data processing pipelines. At the same time, the amounts of data…

Distributed, Parallel, and Cluster Computing · Computer Science 2013-03-29 Marc Bux, Ulf Leser

Adapting SAM for CDF

The CDF and D0 experiments probe the high-energy frontier and as they do so have accumulated hundreds of Terabytes of data on the way to petabytes of data over the next two years. The experiments have made a commitment to use the developing…

Distributed, Parallel, and Cluster Computing · Computer Science 2009-01-09 D. Bonham, G. Garzoglio, R. Herber, J. Kowalkowski, D. Litvintsev, L. Lueking, M. Paterno, D. Petravick, L. Piccoli, R. Pordes, N. Stanfield, I. Terekhov, J. Trumbo, J. Tseng, S. Veseli, M. Votava, V. White, T. Huffman, S. Stonjek, K. Waltkins, P. Crosby, D. Waters, R. St. Denis

Data Diffusion: Dynamic Resource Provision and Data-Aware Scheduling for Data Intensive Applications

Data intensive applications often involve the analysis of large datasets that require large amounts of compute and storage resources. While dedicated compute and/or storage farms offer good task/data throughput, they suffer low resource…

Distributed, Parallel, and Cluster Computing · Computer Science 2008-08-27 Ioan Raicu, Yong Zhao, Ian Foster, Alex Szalay