Related papers: Addressing the Impact of Data Truncation and Param…
Accurate modeling of operational risk is important for a bank and the finance industry as a whole to prepare for potentially catastrophic losses. One approach to modeling operational risk is the loss distribution approach, which requires a bank…
Many banks adopt the Loss Distribution Approach to quantify the operational risk capital charge under Basel II requirements. It is common practice to estimate the capital charge using the 0.999 quantile of the annual loss distribution,…
Typically, operational risk losses are reported above a threshold. Fitting data reported above a constant threshold is a well-known and well-studied problem. However, in practice, the losses are scaled for business and other factors before the…
In clinical and epidemiological research, doubly truncated data often appear. This is the case, for instance, when the data registry is formed by interval sampling. Double truncation generally induces a sampling bias on the target variable,…
The finite sensitivity of instruments or detection methods means that data sets in many areas of astronomy, for example cosmological or exoplanet surveys, are necessarily systematically incomplete. Such data sets, where the population being…
Missing data imputation, where a model is trained on observed data to estimate unobserved values, is a fundamental problem in machine learning. In this paper, we rigorously formulate imputation model learning as a mean-squared error risk…
Estimating average treatment effects from observational data is challenging under practical violations of the positivity assumption. Targeted Maximum Likelihood Estimators (TMLEs) are widely used because of their double robustness and…
For a sample of Exponentially distributed durations we aim at point estimation and a confidence interval for its parameter. A duration is only observed if it has ended within a certain time interval, determined by a Uniform distribution.…
Insurance loss data are usually in the form of left-truncation and right-censoring due to deductibles and policy limits respectively. This paper investigates the model uncertainty and selection procedure when various parametric models are…
Stratification in both the design and analysis of randomized clinical trials is common. Despite features in automated randomization systems to re-confirm the stratifying variables, incorrect values of these variables may be entered. These…
Data analyses typically rely upon assumptions about missingness mechanisms that lead to observed versus missing data. When the data are missing not at random, direct assumptions about the missingness mechanism, and indirect assumptions…
Off-Policy Evaluation (OPE) aims to estimate the value of a target policy using offline data collected from potentially different policies. In real-world applications, however, logged data often suffers from missingness. While OPE has been…
In recent times we hear increasingly often about cyber attacks on various commercial and strategic sites that manage to escape any defense. In this article, we model such attacks on networks via stochastic processes and predict the time of…
Missing Not At Random (MNAR) values lead to significant biases in the data, since the probability of missingness depends on the unobserved values. They are "not ignorable" in the sense that they often require defining a model for the…
Dynamical modelling lies at the heart of our understanding of physical systems. Its role in science is deeper than mere operational forecasting, in that it allows us to evaluate the adequacy of the mathematical structure of our models.…
Bayesian regression determines model parameters by minimizing the expected loss, an upper bound to the true generalization error. However, the loss ignores misspecification, where models are imperfect. Parameter uncertainties from Bayesian…
This paper explores the estimation of a panel data model with cross-sectional interaction that is flexible both in its approach to specifying the network of connections between cross-sectional units, and in controlling for unobserved…
Databases derived from electronic health records (EHRs) are commonly subject to left truncation, a type of selection bias induced due to patients needing to survive long enough to satisfy certain entry criteria. Standard methods to adjust…
Parameter estimation is a foundational step in statistical modeling, enabling us to extract knowledge from data and apply it effectively. Bayesian estimation of parameters incorporates prior beliefs with observed data to infer distribution…
To quantify an operational risk capital charge under Basel II, many banks adopt a Loss Distribution Approach. Under this approach, quantification of the frequency and severity distributions of operational risk involves the bank's internal…