Related papers: Global Benchmark Database
The selection, development, or comparison of machine learning methods in data mining can be a difficult task based on the target problem and goals of a particular study. Numerous publicly available real-world and simulated benchmark…
Big data systems address the challenges of capturing, storing, managing, analyzing, and visualizing big data. Within this context, developing benchmarks to evaluate and compare big data systems has become an active topic for both research…
As large language models (LLMs) continue to advance, the need for up-to-date and well-organized benchmarks becomes increasingly critical. However, many existing datasets are scattered, difficult to manage, and make it challenging to perform…
As optimization challenges continue to evolve, so too must our tools and understanding. To effectively assess, validate, and compare optimization algorithms, it is crucial to use a benchmark test suite that encompasses a diverse range of…
The aim of this article is to present an overview of the major families of state-of-the-art data-base benchmarks, namely: relational benchmarks, object and object-relational benchmarks, XML benchmarks, and decision-support benchmarks, and…
This document describes the generalized moving peaks benchmark (GMPB) and how it can be used to generate problem instances for continuous large-scale dynamic optimization problems. It presents a set of 15 benchmark problems, the relevant…
In machine learning research, it is common to evaluate algorithms via their performance on standard benchmark datasets. While a growing body of work establishes guidelines for -- and levies criticisms at -- data and benchmarking practices…
Graph data management is instrumental for several use cases such as recommendation, root cause analysis, financial fraud detection, and enterprise knowledge representation. Efficiently supporting these use cases yields a number of unique…
Frequentist statistical methods, such as hypothesis testing, are standard practice in papers that provide benchmark comparisons. Unfortunately, these methods have often been misused, e.g., without testing for their statistical test…
Benchmarking, which involves collecting reference datasets and demonstrating method performances, is a requirement for the development of new computational tools, but also becomes a domain of its own to achieve neutral comparisons of…
Dynamic graph learning is crucial for modeling real-world systems with evolving relationships and temporal dynamics. However, the lack of a unified benchmark framework in current research has led to inaccurate evaluations of dynamic graph…
The need for performance measurement tools appeared soon after the emergence of the first Object-Oriented Database Management Systems (OODBMSs), and proved important for both designers and users (Atkinson \& Maier, 1990). Performance…
Benchmarking is crucial for testing and validating any system, even more so in real-time systems. Typical real-time applications adhere to well-understood abstractions: they exhibit a periodic behavior, operate on a well-defined working…
The rapid growth of spatiotemporal data volumes needs to be handled by database systems capable of efficiently managing and querying such data. Existing systems such as PostGIS, SpaceTime, and MobilityDB offer partial solutions but differ…
For scientific software, especially those used for large-scale simulations, achieving good performance and efficiently using the available hardware resources is essential. It is important to regularly perform benchmarks to ensure the…
Free energy calculations are rapidly becoming indispensable in structure-enabled drug discovery programs. As new methods, force fields, and implementations are developed, assessing their expected accuracy on real-world systems…
Benchmarking is an important tool for assessing the relative performance of alternative solving approaches. However, the utility of benchmarking is limited by the quantity and quality of the available problem instances. Modern constraint…
Experimental evaluation is an integral part in the design process of algorithms. Publicly available benchmark instances are widely used to evaluate methods in SAT solving. For the interpretation of results and the design of algorithm…
Empirical and LLM-based research in model-driven engineering increasingly relies on datasets of software models, for instance, to train or evaluate machine learning techniques for modeling support. These datasets have a significant impact…
Given the vast number of classifiers that have been (and continue to be) proposed, reliable methods for comparing them are becoming increasingly important. The desire for reliability is broken down into three main aspects: (1) Comparisons…