Related papers: Efficient Multi-way Theta-Join Processing Using Ma…

200 papers

We study three-way joins on MapReduce. Joins are very useful in a multitude of applications from data integration and traversing social networks, to mining graphs and automata-based constructions. However, joins are expensive, even for…

Databases · Computer Science 2014-05-19 Ben Kimmett, Alex Thomo, S. Venkatesh

In this work, we study the problem of co-optimizing communication, pre-computing, and computation cost in one-round multi-way join evaluation. We propose a multi-way join approach ADJ (Adaptive Distributed Join) for complex join which finds…

Databases · Computer Science 2021-03-01 Hao Zhang, Miao Qiao, Jeffrey Xu Yu, Hong Cheng

As database query processing techniques are being used to handle diverse workloads, a key emerging challenge is how to efficiently handle multi-way join queries containing multiple many-to-many joins. While uncommon in traditional…

Databases · Computer Science 2025-05-20 Hasara Kalumin, Amol Deshpande

Handling skew is one of the major challenges in query processing. In distributed computational environments such as MapReduce, uneven distribution of the data to the servers is not desired. One of the dominant measures that we want to…

Databases · Computer Science 2015-04-14 Foto N. Afrati, Jeffrey D. Ullman, Angelos Vasilakopoulos

The growing data has brought tremendous pressure for query processing and storage, so there are many studies that focus on using GPU to accelerate join operation, which is one of the most important operations in modern database systems.…

Databases · Computer Science 2019-04-26 Hongzhi Wang, Ning Li, Zheng Wang, Jianing Li

It is crucial to provide real-time performance in many applications, such as interactive and exploratory data analysis. In these settings, users often need to view subsets of query results quickly. It is challenging to deliver such results…

The advancement of mobile technologies and the proliferation of map-based applications have enabled a user to access a wide variety of services that range from information queries to navigation systems. Due to the popularity of map-based…

Databases · Computer Science 2020-04-24 Hossain Mahmud, Ashfaq Mahmood Amin, Mohammed Eunus Ali, Tanzima Hashem

In the last few years, much effort has been devoted to developing join algorithms in order to achieve worst-case optimality for join queries over relational databases. Towards this end, the database community has had considerable success in…

Databases · Computer Science 2020-03-02 Shaleen Deep, Xiao Hu, Paraschos Koutris

Selecting appropriate distributed join methods for logical join operations in a query plan is crucial for the performance of data-intensive scalable computing (DISC). Different network communication patterns in the data exchange phase…

Databases · Computer Science 2023-12-29 F. Liang, F. C. M. Lau, H. Cui, Y. Li, B. Lin, C. Li, X. Hu

Streaming computing enables the real-time processing of large volumes of data and offers significant advantages for various applications, including real-time recommendations, anomaly detection, and monitoring. The multi-way stream join…

Databases · Computer Science 2024-11-26 Jinlong Hu, Tingfeng Qiu

As RDF becomes more widely established and the amount of linked data is rapidly increasing, the efficient querying of large amounts of data becomes a significant challenge. In this paper, we propose a family of algorithms for querying large…

Databases · Computer Science 2022-09-13 Eleftherios Kalogeros, Manolis Gergatsoulis, Matthew Damigos, Christos Nomikos

As one of the most useful online processing techniques, the theta-join operation has been utilized by many applications to fully excavate the relationships between data streams in various scenarios. As such, constant research efforts have…

Data Structures and Algorithms · Computer Science 2022-08-08 Jiashu Wu, Yang Wang, Xiaopeng Fan, Kejiang Ye, Chengzhong Xu

MapReduce has proven to be one of the most useful paradigms in the revolution of distributed computing, where cloud services and cluster computing become the standard venue for computing. The federation of cloud and big data activities is…

Databases · Computer Science 2016-07-29 Foto Afrati, Shlomi Dolev, Shantanu Sharma, Jeffrey D. Ullman

Joining trajectory datasets is a significant operation in mobility data analytics and the cornerstone of various methods that aim to extract knowledge out of them. In the era of Big Data, the production of mobility data has become massive…

Databases · Computer Science 2020-02-07 Panagiotis Tampakis, Christos Doulkeridis, Nikos Pelekis, Yannis Theodoridis

Recently, MapReduce based spatial query systems have emerged as a cost effective and scalable solution to large scale spatial data processing and analytics. MapReduce based systems achieve massive scalability by partitioning the data and…

Databases · Computer Science 2015-09-04 Ablimit Aji, Vo Hoang, Fusheng Wang

While services such as Amazon AWS make computing power abundantly available, adding more computing nodes can incur high costs in, for instance, pay-as-you-go plans while not always significantly improving the net running time (aka…

Databases · Computer Science 2016-05-24 Jonny Daenen, Frank Neven, Tony Tan, Stijn Vansummeren

In the last two decades, the continuous increase of computational power has produced an overwhelming flow of data which has called for a paradigm shift in the computing architecture and large scale data processing mechanisms. MapReduce is a…

Databases · Computer Science 2013-02-14 Sherif Sakr, Anna Liu, Ayman G. Fayoumi

In modern large-scale distributed systems, analytics jobs submitted by various users often share similar work, for example scanning and processing the same subset of data. Instead of optimizing jobs independently, which may result in…

Databases · Computer Science 2018-05-23 Pietro Michiardi, Damiano Carra, Sara Migliorini

To process a large volume of data, modern data management systems use a collection of machines connected through a network. This paper looks into the feasibility of scaling up such a shared-nothing system while processing a compute- and…

Databases · Computer Science 2018-04-26 Abhirup Chakraborty

Modern data analytical workloads often need to run queries over a large number of tables. An optimal query plan for such queries is crucial for being able to run these queries within acceptable time bounds. However, with queries involving…

Databases · Computer Science 2022-03-02 Riccardo Mancini, Srinivas Karthik, Bikash Chandra, Vasilis Mageirakos, Anastasia Ailamaki