Related papers: Benchmarking Simulation-Based Inference

WfBench: Automated Generation of Scientific Workflow Benchmarks

The prevalence of scientific workflows with high computational demands calls for their execution on various distributed computing platforms, including large-scale leadership-class high-performance computing (HPC) clusters. To handle the…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-02-10 Tainã Coleman, Henri Casanova, Ketan Maheshwari, Loïc Pottier, Sean R. Wilkinson, Justin Wozniak, Frédéric Suter, Mallikarjun Shankar, Rafael Ferreira da Silva

High Confidence Level Inference is Almost Free using Parallel Stochastic Optimization

Uncertainty quantification for estimation through stochastic optimization solutions in an online setting has gained popularity recently. This paper introduces a novel inference method focused on constructing confidence intervals with…

Machine Learning · Statistics 2026-03-24 Wanrong Zhu, Zhipeng Lou, Ziyang Wei, Wei Biao Wu

Amortised and provably-robust simulation-based inference

Complex simulator-based models are now routinely used to perform inference across the sciences and engineering, but existing inference methods are often unable to account for outliers and other extreme values in data which occur due to…

Machine Learning · Statistics 2026-02-18 Ayush Bharti, Charita Dellaporta, Yuga Hikida, François-Xavier Briol

GraphBench: Next-generation graph learning benchmarking

Machine learning on graphs has made substantial progress across domains such as molecular property prediction and chip design. Yet benchmarking practices remain fragmented, often relying on narrow, task-specific datasets and inconsistent…

Machine Learning · Computer Science 2026-05-12 Timo Stoll, Chendi Qian, Ben Finkelshtein, Ali Parviz, Darius Weber, Fabrizio Frasca, Hadar Shavit, Antoine Siraudin, Arman Mielke, Marie Anastacio, Erik Müller, Maya Bechler-Speicher, Michael Bronstein, Mikhail Galkin, Holger Hoos, Mathias Niepert, Bryan Perozzi, Jan Tönshoff, Christopher Morris

Advancing Tools for Simulation-Based Inference

We study the benefit of modern simulation-based inference to constrain particle interactions at the LHC. We explore ways to incorporate known physics structures into likelihood estimation, specifically morphing-aware estimation and…

High Energy Physics - Phenomenology · Physics 2025-10-01 Henning Bahl, Victor Bresó, Giovanni De Crescenzo, Tilman Plehn

Bisimulation-based Approximate Lifted Inference

There has been a great deal of recent interest in methods for performing lifted inference; however, most of this work assumes that the first-order model is given as input to the system. Here, we describe lifted inference algorithms that…

Artificial Intelligence · Computer Science 2012-05-14 Prithviraj Sen, Amol Deshpande, Lise Getoor

Building a continuous benchmarking ecosystem in bioinformatics

Benchmarking, which involves collecting reference datasets and demonstrating method performances, is a requirement for the development of new computational tools, but also becomes a domain of its own to achieve neutral comparisons of…

Other Quantitative Biology · Quantitative Biology 2025-07-24 Izaskun Mallona, Charlotte Soneson, Ben Carrillo, Almut Luetge, Daniel Incicau, Reto Gerber, Anthony Sonrel, Mark D. Robinson

How NOT to benchmark your SITE metric: Beyond Static Leaderboards and Towards Realistic Evaluation

Transferability estimation metrics are used to find a high-performing pre-trained model for a given target task without fine-tuning models and without access to the source dataset. Despite the growing interest in developing such metrics,…

Machine Learning · Computer Science 2025-10-09 Prabhant Singh, Sibylle Hess, Joaquin Vanschoren

Towards Reliable Simulation-based Inference

Scientific knowledge expands by observing the world, hypothesizing some theories about it, and testing them against collected data. When those theories take the form of statistical models, statistical analyses are involved in the process of…

Machine Learning · Statistics 2026-03-11 Arnaud Delaunoy

Scalable AI Inference: Performance Analysis and Optimization of AI Model Serving

AI research often emphasizes model design and algorithmic performance, while deployment and inference remain comparatively underexplored despite being critical for real-world use. This study addresses that gap by investigating the…

Machine Learning · Computer Science 2026-04-23 Hung Cuong Pham, Fatih Gedikli

On the role of benchmarking data sets and simulations in method comparison studies

Method comparisons are essential to provide recommendations and guidance for applied researchers, who often have to choose from a plethora of available approaches. While many comparisons exist in the literature, these are often not neutral…

Methodology · Statistics 2022-12-07 Sarah Friedrich, Tim Friede

Statistical process discovery

Stochastic process discovery is concerned with deriving a model capable of reproducing the stochastic character of observed executions of a given process, stored in a log. This leads to an optimisation problem in which the model's parameter…

Formal Languages and Automata Theory · Computer Science 2025-05-01 Pierre Cry, Paolo Ballarini, András Horváth, Pascale Le Gall

Benchmarking AI Models in Software Engineering: A Review, Search Tool, and Unified Approach for Elevating Benchmark Quality

Benchmarks are essential for unified evaluation and reproducibility. The rapid rise of Artificial Intelligence for Software Engineering (AI4SE) has produced numerous benchmarks for tasks such as code generation and bug repair. However, this…

Software Engineering · Computer Science 2025-12-15 Roham Koohestani, Philippe de Bekker, Begüm Koç, Maliheh Izadi

Benchmark^2: Systematic Evaluation of LLM Benchmarks

The rapid proliferation of benchmarks for evaluating large language models (LLMs) has created an urgent need for systematic methods to assess benchmark quality itself. We propose Benchmark^2, a comprehensive framework comprising three…

Computation and Language · Computer Science 2026-01-08 Qi Qian, Chengsong Huang, Jingwen Xu, Changze Lv, Muling Wu, Wenhao Liu, Xiaohua Wang, Zhenghua Wang, Zisu Huang, Muzhao Tian, Jianhan Xu, Kun Hu, He-Da Wang, Yao Hu, Xuanjing Huang, Xiaoqing Zheng

MINIMALIST: Mutual INformatIon Maximization for Amortized Likelihood Inference from Sampled Trajectories

Simulation-based inference enables learning the parameters of a model even when its likelihood cannot be computed in practice. One class of methods uses data simulated with different parameters to infer models of the likelihood-to-evidence…

Machine Learning · Computer Science 2022-06-08 Giulio Isacchini, Natanael Spisak, Armita Nourmohammad, Thierry Mora, Aleksandra M. Walczak

Benchmarking Problems for Robust Discrete Optimization

Robust discrete optimization is a highly active field of research where a plenitude of combinations between decision criteria, uncertainty sets and underlying nominal problems are considered. Usually, a robust problem becomes harder to…

Optimization and Control · Mathematics 2022-01-14 Marc Goerigk, Mohammad Khosravi

Benchmarking the human brain against computational architectures

The human brain has inspired novel concepts complementary to classical and quantum computing architectures, such as artificial neural networks and neuromorphic computers, but it is not clear how their performances compare. Here we report a…

Neurons and Cognition · Quantitative Biology 2023-05-25 Céline van Valkenhoef, Catherine Schuman, Philip Walther

An implementation of neural simulation-based inference for parameter estimation in ATLAS

Neural simulation-based inference is a powerful class of machine-learning-based methods for statistical inference that naturally handles high-dimensional parameter estimation without the need to bin data into low-dimensional summary…

Data Analysis, Statistics and Probability · Physics 2025-06-16 ATLAS Collaboration

Continuous benchmarking: Keeping pace with an evolving ecosystem of models and technologies

Drawing on ideas from continuous integration, we present concepts of an automated benchmarking pipeline for high performance applications. Customization and collaboration have been key design goals owing to the requirements of…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-05-28 Jan Vogelsang, Melissa Lober, Catherine Mia Schöfmann, José Villamar, Dennis Terhorst, Johanna Senk, Hans Ekkehard Plesser, Markus Diesmann, Susanne Kunkel, Anno C. Kurth

Probabilistic Relational Model Benchmark Generation

The validation of any database mining methodology goes through an evaluation process where benchmarks availability is essential. In this paper, we aim to randomly generate relational database benchmarks that allow to check probabilistic…

Machine Learning · Computer Science 2016-03-03 Mouna Ben Ishak, Rajani Chulyadyo, Philippe Leray