Related papers: Sequential Scoring Rule Evaluation for Forecast Me…
Forecasting and forecast evaluation are inherently sequential tasks. Predictions are often issued on a regular basis, such as every hour, day, or month, and their quality is monitored continuously. However, the classical statistical tools…
Probability forecasts for binary events play a central role in many applications. Their quality is commonly assessed with proper scoring rules, which assign forecasts a numerical score such that a correct forecast achieves a minimal…
The sequential analysis of series often requires nonparametric procedures, where the most powerful ones frequently use rank transformations. Re-ranking the data sequence after each new observation can become too intensive computationally.…
Modeling correlated or highly stratified multiple-response data becomes a common data analysis task due to modern data monitoring facilities and methods. Generalized estimating equations (GEE) is one of the popular statistical methods for…
In deploying artificial intelligence (AI) models, selective prediction offers the option to abstain from making a prediction when uncertain about model quality. To fulfill its promise, it is crucial to enforce strict and precise error…
The experimental evaluation of the methods and concepts covered in software engineering has been increasingly valued. This value indicates the constant search for new forms of assessment and validation of the results obtained in Software…
Sequential monitoring of randomized trials traditionally relies on parametric assumptions or asymptotic approximations. We discuss a family of nonparametric sequential tests - collectively called e-RT - for binary, event-only, and…
Strictly proper scoring rules (SPSR) are incentive compatible for eliciting information about random variables from strategic agents when the principal can reward agents after the realization of the random variables. They also quantify the…
When predicting future events, it is common to issue forecasts that are probabilistic, in the form of probability distributions over the range of possible outcomes. Such forecasts can be evaluated using proper scoring rules. Proper scoring…
Sequential analysis encompasses simulation theories and methods where the sample size is determined dynamically based on accumulating data. Since the conceptual inception, numerous sequential stopping rules have been introduced, and many…
Context: Software engineering has a problem in that when we empirically evaluate competing prediction systems we obtain conflicting results. Objective: To reduce the inconsistency amongst validation study results and provide a more formal…
Sequential decision making significantly speeds up research and is more cost-effective compared to fixed-n methods. We present a method for sequential decision making for stratified count data that retains Type-I error guarantee or false…
We introduce Sequential Neural Posterior Score Estimation (SNPSE), a score-based method for Bayesian inference in simulator-based models. Our method, inspired by the remarkable success of score-based methods in generative modelling,…
Simulation can evaluate a statistical method for properties such as Type I Error, FDR, or bias on a grid of hypothesized parameter values. But what about the gaps between the grid-points? Continuous Simulation Extension (CSE) is a…
This paper studies the problem of high-dimensional multiple testing and sparse recovery from the perspective of sequential analysis. In this setting, the probability of error is a function of the dimension of the problem. A simple…
Financial statement auditing is conducted under a risk-based evidence approach to obtain reasonable assurance. In practice, auditors often perform additional sampling or related procedures when an initial sample does not provide a…
Sequential Recommender Systems (SRSs) have emerged as a highly efficient approach to recommendation systems. By leveraging sequential data, SRSs can identify temporal patterns in user behaviour, significantly improving recommendation…
Adaptive clinical trials rely on interim analyses, flexible stopping, and data-dependent design modifications that complicate statistical guarantees when fixed-horizon test statistics are repeatedly inspected or reused after adaptations.…
We consider the problem of sequential evaluation, in which an evaluator observes candidates in a sequence and assigns scores to these candidates in an online, irrevocable fashion. Motivated by the psychology literature that has studied…
We propose an adaptive sequential framework for testing two simple hypotheses that analytically ensures finite exposure to the less effective treatment. Our proposed procedure employs a likelihood ratio-driven adaptive allocation rule,…