Related papers: Benchmarking Simulation-Based Inference

200 papers

Commonly, AI or machine learning (ML) models are evaluated on benchmark datasets. This practice supports innovative methodological research, but benchmark performance can be poorly correlated with performance in real-world applications -- a…

Machine Learning · Computer Science 2024-06-18 Olivier Binette, Jerome P. Reiter

The breakthrough in Deep Learning neural networks has transformed the use of AI and machine learning technologies for the analysis of very large experimental datasets. These datasets are typically generated by large-scale experimental…

Machine Learning · Computer Science 2021-10-26 Jeyan Thiyagalingam, Mallikarjun Shankar, Geoffrey Fox, Tony Hey

Benchmarking is essential for developing and evaluating black-box optimization algorithms, providing a structured means to analyze their search behavior. Its effectiveness relies on carefully selected problem sets used for evaluation. To…

Neural and Evolutionary Computing · Computer Science 2025-11-17 Iván Olarte Rodríguez, Maria Laura Santoni, Fabian Duddeck, Carola Doerr, Thomas Bäck, Elena Raponi

Some statistical models are specified via a data generating process for which the likelihood function cannot be computed in closed form. Standard likelihood-based inference is then not feasible but the model parameters can be inferred by…

Computation · Statistics 2015-02-20 Michael U. Gutmann, Jukka Corander, Ritabrata Dutta, Samuel Kaski

I would like to share recommendations on how to do performance benchmarks for the purpose of computer science research evaluation. Research in my field (programming language research) often involves performance considerations, but it is…

Programming Languages · Computer Science 2026-05-05 Gabriel Scherer

Simulation models of complex dynamics in the natural and social sciences commonly lack a tractable likelihood function, rendering traditional likelihood-based statistical inference impossible. Recent advances in machine learning have…

Machine Learning · Statistics 2022-02-24 Joel Dyer, Patrick Cannon, Sebastian M Schmon

Many promising approaches to symbolic regression have been presented in recent years, yet progress in the field continues to suffer from a lack of uniform, robust, and transparent benchmarking standards. In this paper, we address this…

Neural and Evolutionary Computing · Computer Science 2021-08-02 William La Cava, Patryk Orzechowski, Bogdan Burlacu, Fabrício Olivetti de França, Marco Virgolin, Ying Jin, Michael Kommenda, Jason H. Moore

Research on recommender systems algorithms, like other areas of applied machine learning, is largely dominated by efforts to improve the state-of-the-art, typically in terms of accuracy measures. Several recent research works however…

Information Retrieval · Computer Science 2022-05-16 Vito Walter Anelli, Alejandro Bellogín, Tommaso Di Noia, Dietmar Jannach, Claudio Pomo

Context: Software engineering has a problem in that when we empirically evaluate competing prediction systems we obtain conflicting results. Objective: To reduce the inconsistency amongst validation study results and provide a more formal…

Software Engineering · Computer Science 2021-01-15 Martin Shepperd, Stephen G. MacDonell

Reliable and robust evaluation methods are a necessary first step towards developing machine learning models that are themselves robust and reliable. Unfortunately, current evaluation protocols typically used to assess classifiers fail to…

Machine Learning · Computer Science 2025-05-26 Michael W. Spratling

The quest for precision in parameter estimation is a fundamental task in different scientific areas. The relevance of this problem thus provided the motivation to develop methods for the application of quantum resources to estimation…

Quantum Physics · Physics 2024-06-18 Valeria Cimini, Emanuele Polino, Mauro Valeri, Nicolò Spagnolo, Fabio Sciarrino

Artificial intelligence-based systems for player risk detection have become central to harm prevention efforts in the gambling industry. However, growing concerns around transparency and effectiveness have highlighted the absence of…

The world of empirical machine learning (ML) strongly relies on benchmarks in order to determine the relative effectiveness of different algorithms and methods. This paper proposes the notion of "a benchmark lottery" that describes the…

Machine Learning · Computer Science 2021-07-19 Mostafa Dehghani, Yi Tay, Alexey A. Gritsenko, Zhe Zhao, Neil Houlsby, Fernando Diaz, Donald Metzler, Oriol Vinyals

Deep learning algorithms have recently been shown to be a successful tool in estimating parameters of statistical models for which simulation is easy, but likelihood computation is challenging. But the success of these approaches depends on…

Machine Learning · Statistics 2024-02-20 Amanda Lenzi, Haavard Rue

Requirements driven search-based testing (also known as falsification) has proven to be a practical and effective method for discovering erroneous behaviors in Cyber-Physical Systems. Despite the constant improvements on the performance and…

In recent years, researchers in decision analysis and artificial intelligence (AI) have used Bayesian belief networks to build models of expert opinion. Using standard methods drawn from the theory of computational complexity, workers in…

Artificial Intelligence · Computer Science 2013-04-08 R. Martin Chavez, Gregory F. Cooper

Benchmarking has driven scientific progress in Evolutionary Computation, yet current practices fall short of real-world needs. Widely used synthetic suites such as BBOB and CEC isolate algorithmic phenomena but poorly reflect the structure,…

In scientific computing, it is common that a mathematical expression can be computed by many different algorithms (sometimes over hundreds), each identifying a specific sequence of library calls. Although mathematically equivalent, those…

Performance · Computer Science 2021-09-15 Aravind Sankaran, Paolo Bientinesi

Predicting the performance and energy consumption of computing hardware is critical for many modern applications. This will inform procurement decisions, deployment decisions, and autonomic scaling. Existing approaches to understanding the…

Machine Learning · Computer Science 2023-02-28 Mehmet Cengiz, Matthew Forshaw, Amir Atapour-Abarghouei, Andrew Stephen McGough

Quantum optimisation is emerging as a promising approach alongside classical heuristics and specialised hardware, yet its performance is often difficult to assess fairly. Traditional benchmarking methods, rooted in digital complexity…

Quantum Physics · Physics 2025-12-10 Frank Phillipson