Related papers: Benchmarking Simulation-Based Inference
Training algorithms, broadly construed, are an essential part of every deep learning pipeline. Training algorithm improvements that speed up training across a wide variety of workloads (e.g., better update rules, tuning protocols, learning…
Graphical models are widely used to study complex multivariate biological systems. Network inference algorithms aim to reverse-engineer such models from noisy experimental data. It is common to assess such algorithms using techniques from…
Fair algorithm evaluation is conditioned on the existence of high-quality benchmark datasets that are non-redundant and are representative of typical optimization scenarios. In this paper, we evaluate three heuristics for selecting diverse…
Molecular machine learning has been maturing rapidly over the last few years. Improved methods and the presence of larger datasets have enabled machine learning algorithms to make increasingly accurate predictions about molecular…
The selection, development, or comparison of machine learning methods in data mining can be a difficult task based on the target problem and goals of a particular study. Numerous publicly available real-world and simulated benchmark…
Evaluating performance across optimization algorithms on many problems presents a complex challenge due to the diversity of numerical scales involved. Traditional data processing methods, such as hypothesis testing and Bayesian inference,…
In the rapidly evolving domain of Recommender Systems (RecSys), new algorithms frequently claim state-of-the-art performance based on evaluations over a limited set of arbitrarily selected datasets. However, this approach may fail to…
Strong empirical evidence that one machine-learning algorithm A outperforms another one B ideally calls for multiple trials optimizing the learning pipeline over sources of variation such as data sampling, data augmentation, parameter…
Recent advancements in ultra-low-power machine learning (TinyML) hardware promise to unlock an entirely new class of smart applications. However, continued progress is limited by the lack of a widely accepted benchmark for these systems.…
A central challenge in many areas of science and engineering is to identify model parameters that are consistent with prior knowledge and empirical data. Bayesian inference offers a principled framework for this task, but can be…
Benchmarking plays an important role in the development of novel search algorithms as well as for the assessment and comparison of contemporary algorithmic ideas. This paper presents common principles that need to be taken into account when…
Research on new optimization algorithms is often funded based on the motivation that such algorithms might improve the capabilities to deal with real-world and industrially relevant optimization challenges. Besides a huge variety of…
Sampling-based planning algorithms are the most common probabilistically complete algorithms and are widely used on many robot platforms. Within this class of algorithms, many variants have been proposed over the last 20 years, yet there is…
Comparing, or benchmarking, optimization algorithms is a complicated task that involves many subtle considerations to yield a fair and unbiased evaluation. In this paper, we systematically review the benchmarking process of optimization…
As frontier artificial intelligence (AI) models rapidly advance, benchmarks are integral to comparing different models and measuring their progress in different task-specific domains. However, there is a lack of guidance on when and how…
Numerous neural network circuits and architectures are presently under active research for application to artificial intelligence and machine learning. Their physical performance metrics (area, time, energy) are estimated. Various types of…
Parametric stochastic simulators are ubiquitous in science, often featuring high-dimensional input parameters and/or an intractable likelihood. Performing Bayesian parameter inference in this context can be challenging. We present a neural…
Benchmarking is generally accepted as an important element in demonstrating the correctness of computer simulations. In the modern sense, a benchmark is a computer simulation result that has evidence of correctness, is accompanied by…
Studies on simulation input uncertainty are often built on the availability of input data. In this paper, we investigate an inverse problem where, given only the availability of output data, we nonparametrically calibrate the input models and…
The development of state-of-the-art systems in different applied areas of machine learning (ML) is driven by benchmarks, which have shaped the paradigm of evaluating generalisation capabilities from multiple perspectives. Although the…