
Related papers: On Big Data Benchmarking

200 papers

The great prosperity of big data systems such as Hadoop in recent years has made benchmarking these systems crucial for both research and industry communities. The complexity, diversity, and rapid evolution of big data systems…

Performance · Computer Science 2015-06-05 Rui Han, Zhen Jia, Wanling Gao, Xinhui Tian, Lei Wang

Now we live in an era of big data, and big data applications are becoming more and more pervasive. How to benchmark data center computer systems running big data applications (in short big data systems) is a hot topic. In this paper, we…

Performance · Computer Science 2013-07-31 Zhen Jia, Runlin Zhou, Chunge Zhu, Lei Wang, Wanling Gao, Yingjie Shi, Jianfeng Zhan, Lixin Zhang

The development of scalable, representative, and widely adopted benchmarks for graph data systems has been a question for which answers have been sought for decades. We conduct an in-depth study of the existing literature on benchmarks for…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-09-23 Miyuru Dayarathna, Toyotaro Suzumura

The continuous increase of data generated provides enormous possibilities for both public and private companies. The management of this mass of data or big data will play a crucial role in the society of the future, as it finds applications…

Computers and Society · Computer Science 2015-01-15 Fatima El Jamiy, Abderrahmane Daif, Mohamed Azouazi, Abdelaziz Marzak

One of the most significant problems of Big Data is to extract knowledge through the huge amount of data. The usefulness of the extracted information depends strongly on data quality. In addition to the importance, data quality has recently…

Databases · Computer Science 2020-05-25 Mostafa Mirzaie, Behshid Behkamal, Samad Paydar

Data generation is a key issue in big data benchmarking that aims to generate application-specific data sets to meet the 4V requirements of big data. Specifically, big data generators need to generate scalable data (Volume) of different…

Databases · Computer Science 2014-02-28 Zijian Ming, Chunjie Luo, Wanling Gao, Rui Han, Qiang Yang, Lei Wang, Jianfeng Zhan

Recently, increasingly large amounts of data are generated from a variety of sources. Existing data processing technologies are not suitable to cope with the huge amounts of generated data. Yet, many research works focus on Big Data, a…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-06-07 Wissem Inoubli, Sabeur Aridhi, Haithem Mezni, Mondher Maddouri, Engelbert Mephu Nguifo

Big data benchmark suites must include a diversity of data and workloads to be useful in fairly evaluating big data systems and architectures. However, using truly comprehensive benchmarks poses great challenges for the architecture…

Performance · Computer Science 2016-11-15 Zhen Jia, Jianfeng Zhan, Lei Wang, Rui Han, Sally A. McKee, Qiang Yang, Chunjie Luo, Jingwei Li

Big Data is considered a proprietary asset of companies, organizations, and even nations. Turning big data into real treasure requires the support of big data systems. A variety of commercial and open source products have been unleashed for…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-12-29 Yuqing Zhu, Jianfeng Zhan, Chuliang Weng, Raghunath Nambiar, Jinchao Zhang, Xingzhen Chen, Lei Wang

The design and construction of high performance computing (HPC) systems rely on exhaustive performance analysis and benchmarking. Traditionally this activity has been geared exclusively towards simulation scientists, who, unsurprisingly,…

Performance · Computer Science 2018-11-07 Drew Schmidt, Junqi Yin, Michael Matheson, Bronson Messer, Mallikarjun Shankar

As architecture, systems, and data management communities pay greater attention to innovative big data systems and architectures, the pressure of benchmarking and evaluating these systems rises. Considering the broad use of big data…

The rise of big data systems has created a need for benchmarks to measure and compare the capabilities of these systems. Big data benchmarks present unique scalability challenges. The supercomputing community has wrestled with these…

Performance · Computer Science 2016-12-13 Patrick Dreher, Chansup Byun, Chris Hill, Vijay Gadepally, Bradley Kuszmaul, Jeremy Kepner

The amount of data in the world is expanding rapidly. Every day, huge amounts of data are created by scientific experiments, companies, and end users' activities. These large data sets have been labeled as "Big Data", and their storage,…

Databases · Computer Science 2020-04-29 Mahdi Bohlouli, Frank Schulz, Lefteris Angelis, David Pahor, Ivona Brandic, David Atlan, Rosemary Tate

The aim of this article is to present an overview of the major families of state-of-the-art data processing benchmarks, namely transaction processing benchmarks and decision support benchmarks. We also address the newer trends in cloud…

Databases · Computer Science 2017-01-31 Jérôme Darmont

Data lakes have emerged as a flexible and scalable solution for storing and analyzing large volumes of heterogeneous data, including structured, semi-structured, and unstructured formats. Despite their growing adoption in both industry and…

Databases · Computer Science 2026-01-28 Yi Lyu, Pei-Chieh Lo, Natan Lidukhover

Benchmarking, which involves collecting reference datasets and demonstrating method performances, is a requirement for the development of new computational tools, but also becomes a domain of its own to achieve neutral comparisons of…

Other Quantitative Biology · Quantitative Biology 2025-07-24 Izaskun Mallona, Charlotte Soneson, Ben Carrillo, Almut Luetge, Daniel Incicau, Reto Gerber, Anthony Sonrel, Mark D. Robinson

Big Data is reforming many industrial domains by providing decision support through analyzing large data volumes. Big Data testing aims to ensure that Big Data systems run smoothly and error-free while maintaining the performance and…

Artificial Intelligence · Computer Science 2022-07-15 Iram Arshad, Saeed Hamood Alsamhi, Wasif Afzal

Big data present new opportunities for modern society while posing challenges for data scientists. Recent advancements in sensor networks and the widespread adoption of IoT have led to the collection of physical-sensor data on an enormous…

Information Retrieval · Computer Science 2025-01-23 Zhipeng Ma, Bo Nørregaard Jørgensen, Zheng Grace Ma

With the advent of big data applications and the increasing amount of data being produced in these applications, the importance of efficient methods for big data analysis has become highly evident. However, the success of any such method…

Computers and Society · Computer Science 2019-11-05 Mostafa Mirzaie, Behshid Behkamal, Samad Paydar

The selection, development, or comparison of machine learning methods in data mining can be a difficult task based on the target problem and goals of a particular study. Numerous publicly available real-world and simulated benchmark…

Machine Learning · Computer Science 2017-03-03 Randal S. Olson, William La Cava, Patryk Orzechowski, Ryan J. Urbanowicz, Jason H. Moore