Related papers: Benchmarking Simulation-Based Inference
The prevalence of scientific workflows with high computational demands calls for their execution on various distributed computing platforms, including large-scale leadership-class high-performance computing (HPC) clusters. To handle the…
Uncertainty quantification for estimation through stochastic optimization solutions in an online setting has gained popularity recently. This paper introduces a novel inference method focused on constructing confidence intervals with…
Complex simulator-based models are now routinely used to perform inference across the sciences and engineering, but existing inference methods are often unable to account for outliers and other extreme values in data which occur due to…
Machine learning on graphs has made substantial progress across domains such as molecular property prediction and chip design. Yet benchmarking practices remain fragmented, often relying on narrow, task-specific datasets and inconsistent…
We study the benefit of modern simulation-based inference to constrain particle interactions at the LHC. We explore ways to incorporate known physics structures into likelihood estimation, specifically morphing-aware estimation and…
There has been a great deal of recent interest in methods for performing lifted inference; however, most of this work assumes that the first-order model is given as input to the system. Here, we describe lifted inference algorithms that…
Benchmarking, which involves collecting reference datasets and demonstrating method performances, is a requirement for the development of new computational tools, but also becomes a domain of its own to achieve neutral comparisons of…
Transferability estimation metrics are used to find a high-performing pre-trained model for a given target task without fine-tuning models and without access to the source dataset. Despite the growing interest in developing such metrics,…
Scientific knowledge expands by observing the world, hypothesizing some theories about it, and testing them against collected data. When those theories take the form of statistical models, statistical analyses are involved in the process of…
AI research often emphasizes model design and algorithmic performance, while deployment and inference remain comparatively underexplored despite being critical for real-world use. This study addresses that gap by investigating the…
Method comparisons are essential to provide recommendations and guidance for applied researchers, who often have to choose from a plethora of available approaches. While many comparisons exist in the literature, these are often not neutral…
Stochastic process discovery is concerned with deriving a model capable of reproducing the stochastic character of observed executions of a given process, stored in a log. This leads to an optimisation problem in which the model's parameter…
Benchmarks are essential for unified evaluation and reproducibility. The rapid rise of Artificial Intelligence for Software Engineering (AI4SE) has produced numerous benchmarks for tasks such as code generation and bug repair. However, this…
The rapid proliferation of benchmarks for evaluating large language models (LLMs) has created an urgent need for systematic methods to assess benchmark quality itself. We propose Benchmark^2, a comprehensive framework comprising three…
Simulation-based inference enables learning the parameters of a model even when its likelihood cannot be computed in practice. One class of methods uses data simulated with different parameters to infer models of the likelihood-to-evidence…
Robust discrete optimization is a highly active field of research where a plenitude of combinations between decision criteria, uncertainty sets and underlying nominal problems are considered. Usually, a robust problem becomes harder to…
The human brain has inspired novel concepts complementary to classical and quantum computing architectures, such as artificial neural networks and neuromorphic computers, but it is not clear how their performances compare. Here we report a…
Neural simulation-based inference is a powerful class of machine-learning-based methods for statistical inference that naturally handles high-dimensional parameter estimation without the need to bin data into low-dimensional summary…
Drawing on ideas from continuous integration, we present concepts of an automated benchmarking pipeline for high performance applications. Customization and collaboration have been key design goals owing to the requirements of…
The validation of any database mining methodology goes through an evaluation process where benchmarks availability is essential. In this paper, we aim to randomly generate relational database benchmarks that allow to check probabilistic…