Related papers: A note on Influence diagnostics in nonlinear mixed…
Quantifying the influence of infinitesimal changes in training data on model performance is crucial for understanding and improving machine learning models. In this work, we reformulate this problem as a weighted empirical risk minimization…
We consider the issue of assessing influence of observations in the class of Birnbaum-Saunders nonlinear regression models, which is useful in lifetime data analysis. Our results generalize those in Galea et al. [2004, Influence diagnostics…
In this paper, we propose a simplex regression model in which both the mean and the dispersion parameters are related to covariates by nonlinear predictors. We provide closed-form expressions for the score function, for Fisher's information…
Linear mixed models are widely used to analyze non-independent data, but inference for fixed effects can be unreliable under misspecification of the random-effects distribution, inaccurate Fisher information estimation, or convergence…
The precision matrix that encodes conditional linear dependency relations among a set of variables forms an important object of interest in multivariate analysis. Sparse estimation procedures for precision matrices such as the graphical…
Despite the risk of misspecification they are tied to, parametric models continue to be used in statistical practice because they are accessible to all. In particular, efficient estimation procedures in parametric models are simple to…
Multivariate data occurs in a wide range of fields, with ever more flexible model specifications being proposed, often within a multivariate generalised linear mixed effects (MGLME) framework. In this article, we describe an extended…
Beta coefficients for linear regression models represent the ideal form of an interpretable feature effect. However, for non-linear models and especially generalized linear models, the estimated coefficients cannot be interpreted as a…
We propose and analyze estimators for statistical functionals of one or more distributions under nonparametric assumptions. Our estimators are based on the theory of influence functions, which appear in the semiparametric statistics…
Many useful parameters depend on nonparametric first steps. Examples include games, dynamic discrete choice, average exact consumer surplus, and treatment effects. Often estimators of these parameters are asymptotically equivalent to a…
The simple product formulae for derivatives of scalar functions raised to different powers are generalized for functions which take values in the set of symmetric positive definite matrices. These formulae are fundamental in derivation of…
Latent factor models (LFMs) such as matrix factorization achieve the state-of-the-art performance among various Collaborative Filtering (CF) approaches for recommendation. Despite the high recommendation accuracy of LFMs, a critical issue…
The science of cause and effect is extremely sophisticated and extremely hard to scale. Using a controlled experiment, scientists get rich insights by analyzing global effects, effects in different segments, and trends in effects over time.…
For latent class models where the class weights depend on individual covariates, we derive a simple expression for computing the score vector and a convenient hybrid between the observed and the expected information matrices which is always…
On the one hand, a large class of inequality measures, which includes the generalized entropy, the Atkinson, and the Gini measures, among others, was introduced in Mergane and Lo (2013). On the other hand, the influence function of statistics is…
In this work, we focus on the use of influence functions to identify relevant training examples that one might hope "explain" the predictions of a machine learning model. One shortcoming of influence functions is that the training examples…
Diffusion models have led to significant advancements in generative modelling. Yet their widespread adoption poses challenges regarding data attribution and interpretability. In this paper, we aim to help address such challenges in…
Influence functions (IFs) elucidate how training data changes model behavior. However, the increasing size and non-convexity in large-scale models make IFs inaccurate. We suspect that the fragility comes from the first-order approximation…
Influence functions estimate the effect of removing a training point on a model without the need to retrain. They are based on a first-order Taylor approximation that is guaranteed to be accurate for sufficiently small changes to the model,…
The efficient modeling of disorder in a phenomenon depends on the chosen score and objective functions. The main parameters in modeling are location, scale and shape. The exponential power distribution known as generalized Gaussian is…