Related papers: Better Write Amplification for Streaming Data Proc…

200 papers

Some of the most relevant document schemas used online, such as XML and JSON, have a nested format. In the last decade, the task of extracting data from nested documents over streams has become especially relevant. We focus on the streaming…

Databases · Computer Science 2022-01-11 Martín Muñoz, Cristian Riveros

The amount of textual data has reached a new scale and continues to grow at an unprecedented rate. IBM's SystemT software is a powerful text analytics system, which offers a query-based interface to reveal the valuable information that lies…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-06-05 Raphael Polig, Kubilay Atasu, Laura Chiticariu, Christoph Hagleitner, H. Peter Hofstee, Frederick R. Reiss, Eva Sitaridi, Huaiyu Zhu

Data streaming relies on continuous queries to process unbounded streams of data in a real-time fashion. It is commonly demanding in computation capacity, given that the relevant applications involve very large volumes of data. Data…

Data Structures and Algorithms · Computer Science 2016-06-16 Vincenzo Gulisano, Yiannis Nikolakopoulos, Daniel Cederman, Marina Papatriantafilou, Philippas Tsigas

Distributed stream processing systems are widely deployed to process real-time data generated by various devices, such as sensors and software systems. A key challenge in the system is overloading, which leads to an unstable system status…

Networking and Internet Architecture · Computer Science 2025-06-16 Ziren Xiao

There has been great progress in improving streaming machine translation, a simultaneous paradigm where the system appends to a growing hypothesis as more source content becomes available. We study a related problem in which revisions to…

Computation and Language · Computer Science 2020-07-01 Naveen Arivazhagan, Colin Cherry, Wolfgang Macherey, George Foster

Stream applications are widely deployed on the cloud. While modern distributed streaming systems like Flink and Spark Streaming can schedule and execute them efficiently, streaming dataflows are often dynamically changing, which may cause…

Systems and Control · Electrical Eng. & Systems 2021-03-17 Pengqi Lu, Liang Yuan, Yunquan Zhang, Hang Cao, Kun Li

While ML model training and inference are both GPU-intensive, CPU-based data processing is often the bottleneck. Distributed data processing systems based on the batch or stream processing models assume homogeneous resource requirements.…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-10-23 Frank Sifei Luan, Ron Yifeng Wang, Yile Gu, Ziming Mao, Charlotte Lin, Amog Kamsetty, Hao Chen, Cheng Su, Balaji Veeramani, Scott Lee, SangBin Cho, Clark Zinzow, Eric Liang, Ion Stoica, Stephanie Wang

Stream processing has become a critical component in the architecture of modern applications. With the exponential growth of data generation from sources such as the Internet of Things, business intelligence, and telecommunications,…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-11-27 Dominik Scheinert, Fabian Casares, Morgan K. Geldenhuys, Kevin Styp-Rekowski, Odej Kao

Several high-throughput distributed data-processing applications require multi-hop processing of streams of data. These applications include continual processing on data streams originating from a network of sensors, composing a multimedia…

Distributed, Parallel, and Cluster Computing · Computer Science 2009-03-26 Shah Asaduzzaman, Muthucumaru Maheswaran

Many modern applications require real-time processing of large volumes of high-speed data. Such data processing needs can be modeled as a streaming computation. A streaming computation is specified as a dataflow graph that exposes multiple…

Databases · Computer Science 2018-04-02 Guna Prasaad, G. Ramalingam, Kaushik Rajan

MapReduce is a commonly used framework for executing data-intensive jobs on distributed server clusters. We introduce a variant implementation of MapReduce, namely "Coded MapReduce", to substantially reduce the inter-server communication…

Distributed, Parallel, and Cluster Computing · Computer Science 2015-12-08 Songze Li, Mohammad Ali Maddah-Ali, A. Salman Avestimehr

Real-time Big Data architectures evolved into specialized layers for handling data streams' ingestion, storage, and processing over the past decade. Layered streaming architectures integrate pull-based read and push-based write RPC…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-11-14 Ovidiu-Cristian Marcu, Pascal Bouvry

The exponential growth in smart sensors and rapid progress in 5G networks is creating a world awash with data streams. However, a key barrier to building performant multi-sensor, distributed stream processing applications is high…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-11-10 Giuseppe Coviello, Kunal Rao, Murugan Sankaradas, Srimat Chakradhar

An increasing number of scientific applications rely on stream processing for generating timely insights from data feeds of scientific instruments, simulations, and Internet-of-Thing (IoT) sensors. The development of streaming applications…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-11-13 Andre Luckow, George Chantzialexiou, Shantenu Jha

In the last two decades, the continuous increase of computational power has produced an overwhelming flow of data which has called for a paradigm shift in the computing architecture and large scale data processing mechanisms. MapReduce is a…

Databases · Computer Science 2013-02-14 Sherif Sakr, Anna Liu, Ayman G. Fayoumi

Tracking and approximating data matrices in streaming fashion is a fundamental challenge. The problem requires more care and attention when data comes from multiple distributed sites, each receiving a stream of data. This paper considers…

Databases · Computer Science 2014-05-01 Mina Ghashami, Jeff M. Phillips, Feifei Li

Big data problems frequently require processing datasets in a streaming fashion, either because all data are available at once but collectively are larger than available memory or because the data intrinsically arrive one data point at a…

Computation · Statistics 2018-08-08 Andrea Giovannucci, Victor Minden, Cengiz Pehlevan, Dmitri B. Chklovskii

Emerging applications of machine learning in numerous areas involve continuous gathering of and learning from streams of data. Real-time incorporation of streaming data into the learned models is essential for improved inference in these…

Machine Learning · Computer Science 2020-12-01 Matthew Nokleby, Haroon Raja, Waheed U. Bajwa

Stream processing is a computing paradigm that supports real-time data processing for a wide variety of applications. At Meta, it's used across the company for various tasks such as deriving product insights, providing and improving user…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-12-09 Animesh Dangwal, Yufeng Jiang, Charlie Arnold, Jun Fan, Mohamed Bassem, Aish Rajagopal

Ever-increasing amounts of data and requirements to process them in real time lead to more and more analytics platforms and software systems being designed according to the concept of stream processing. A common area of application is the…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-03-05 Sören Henning, Wilhelm Hasselbring