Related papers: PanJoin: A Partition-based Adaptive Stream Join

200 papers

Streaming data join is a critical process in the field of near-real-time data warehousing. For this purpose, an adaptive semi-stream join algorithm called CACHEJOIN (Cache Join) focusing on non-uniform stream data is provided in the…

Databases · Computer Science 2019-11-11 M. Asif Naeem , Erum Mehmood , M G Abbas , Noreen Jamil

There is increasing interest in using multicore processors to accelerate stream processing. For example, indexing sliding window content to enhance the performance of streaming queries is greatly improved by utilizing the computational…

Databases · Computer Science 2019-03-04 Amirhesam Shahvarani , Hans-Arno Jacobsen

The availability of a large number of processing nodes in a parallel and distributed computing environment enables sophisticated real-time processing over high-speed data streams, as required by many emerging applications. Sliding window…

Distributed, Parallel, and Cluster Computing · Computer Science 2013-07-26 Abhirup Chakraborty , Ajit Singh

Streaming computing enables the real-time processing of large volumes of data and offers significant advantages for various applications, including real-time recommendations, anomaly detection, and monitoring. The multi-way stream join…

Databases · Computer Science 2024-11-26 Jinlong Hu , Tingfeng Qiu

Improving data systems' performance for join operations has long been an issue of great importance. More recently, a lot of focus has been devoted to multi-way join performance and especially on reducing the negative impact of producing…

Databases · Computer Science 2023-09-01 Qingzhi Ma

Partitioning a graph into balanced blocks such that few edges run between blocks is a key problem for large-scale distributed processing. A current trend for partitioning huge graphs is the use of streaming algorithms, which use low computational…

Data Structures and Algorithms · Computer Science 2022-02-02 Marcelo Fonseca Faraj , Christian Schulz

As an essential operation in data cleaning, the similarity join has attracted considerable attention from the database community. In this paper, we study string similarity joins with edit-distance constraints, which find similar string…

Databases · Computer Science 2011-12-01 Guoliang Li , Dong Deng , Jiannan Wang , Jianhua Feng

In recent years, the scale of graph datasets has increased to such a degree that a single machine is not capable of efficiently processing large graphs. Therefore, efficient graph partitioning is necessary for those large graph…

Data Structures and Algorithms · Computer Science 2019-02-06 Md Anwarul Kaium Patwary , Saurabh Garg , Byeong Kang

Partitioning graphs into blocks of roughly equal size is widely used when processing large graphs. Currently there is a gap in the space of available partitioning algorithms. On the one hand, there are streaming algorithms that have been…

Data Structures and Algorithms · Computer Science 2021-12-23 Marcelo Fonseca Faraj , Christian Schulz

Many well-known, real-world problems involve dynamic data which describe the relationship among the entities. Hypergraphs are powerful combinatorial structures that are frequently used to model such data. For many of today's data-centric…

Data Structures and Algorithms · Computer Science 2021-03-10 Fatih Taşyaran , Berkay Demireller , Kamer Kaya , Bora Uçar

Data streaming relies on continuous queries to process unbounded streams of data in a real-time fashion. It is commonly demanding in computation capacity, given that the relevant applications involve very large volumes of data. Data…

Data Structures and Algorithms · Computer Science 2016-06-16 Vincenzo Gulisano , Yiannis Nikolakopoulos , Daniel Cederman , Marina Papatriantafilou , Philippas Tsigas

Developing high-performance and energy-efficient algorithms for maximum matchings is becoming increasingly important in social network analysis, computational sciences, scheduling, and others. In this work, we propose the first maximum…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-10-29 Maciej Besta , Marc Fischer , Tal Ben-Nun , Dimitri Stanojevic , Johannes De Fine Licht , Torsten Hoefler

Similarity join--a widely used operation in data science--finds all pairs of items that have distance smaller than a threshold. Prior work has explored distributed computation methods to scale similarity join to large data volumes but these…

Databases · Computer Science 2025-10-13 Yanqi Chen , Xiao Yan , Alexandra Meliou , Eric Lo

We study three-way joins on MapReduce. Joins are very useful in a multitude of applications from data integration and traversing social networks, to mining graphs and automata-based constructions. However, joins are expensive, even for…

Databases · Computer Science 2014-05-19 Ben Kimmett , Alex Thomo , S. Venkatesh

In recent years, the graph partitioning problem gained importance as a mandatory preprocessing step for distributed graph processing on very large graphs. Existing graph partitioning algorithms minimize partitioning latency by assigning…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-05-31 Christian Mayer , Ruben Mayer , Muhammad Adnan Tariq , Heiko Geppert , Larissa Laich , Lukas Rieger , Kurt Rothermel

In today's digital era, data are everywhere, from the Internet of Things to health care and financial applications. This leads to potentially unbounded, ever-growing Big data streams that need to be utilized effectively. Data normalization is an…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-01-24 Vibhuti Gupta , Rattikorn Hewett

Streaming computation plays an important role in large-scale data analysis. The sliding window model is a model of streaming computation which also captures the recency of the data. In this model, data arrives one item at a time, but only…

Data Structures and Algorithms · Computer Science 2021-11-01 Alessandro Epasto , Mohammad Mahdian , Vahab Mirrokni , Peilin Zhong

Streaming analysis is widely used in cloud as well as edge infrastructures. In these contexts, fine-grained application performance can be based on accurate modeling of streaming operators. This is especially beneficial for computationally…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-11-30 Hannaneh Najdataei , Vincenzo Gulisano , Alessandro V. Papadopoulos , Ivan Walulya , Marina Papatriantafilou , Philippas Tsigas

Graph partitioning plays a vital role in distributed large-scale web graph analytics, such as pagerank and label propagation. The quality and scalability of partitioning strategy have a strong impact on such communication- and…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-01-04 Deyu Kong , Xike Xie , Zhuoxu Zhang

Many modern applications require real-time processing of large volumes of high-speed data. Such data processing needs can be modeled as a streaming computation. A streaming computation is specified as a dataflow graph that exposes multiple…

Databases · Computer Science 2018-04-02 Guna Prasaad , G. Ramalingam , Kaushik Rajan