Related papers: Perturbation selection and influence measures in l…

Influence diagnostics in Birnbaum-Saunders nonlinear regression models

We consider the issue of assessing influence of observations in the class of Birnbaum-Saunders nonlinear regression models, which is useful in lifetime data analysis. Our results generalize those in Galea et al. [2004, Influence diagnostics…

Methodology · Statistics 2011-11-22 Artur J. Lemonte

Perturbation-based Effect Measures for Compositional Data

Existing effect measures for compositional features are inadequate for many modern applications, for example, in microbiome research, since they display traits such as high-dimensionality and sparsity that can be poorly modelled with…

Methodology · Statistics 2025-06-02 Anton Rask Lundborg, Niklas Pfister

Demystifying statistical learning based on efficient influence functions

Evaluation of treatment effects and more general estimands is typically achieved via parametric modelling, which is unsatisfactory since model misspecification is likely. Data-adaptive model building (e.g. statistical/machine learning) is…

Statistics Theory · Mathematics 2022-01-14 Oliver Hines, Oliver Dukes, Karla Diaz-Ordaz, Stijn Vansteelandt

Active Learning of Spin Network Models

The inverse statistical problem of finding direct interactions in complex networks is difficult. In the natural sciences, well-controlled perturbation experiments are widely used to probe the structure of complex networks. However, our…

Disordered Systems and Neural Networks · Physics 2019-10-24 Jialong Jiang, David A. Sivak, Matt Thomson

Global sensitivity analysis for optimization with variable selection

The optimization of high dimensional functions is a key issue in engineering problems but it frequently comes at a cost that is not acceptable since it usually involves a complex and expensive computer code. Engineers often overcome this…

Machine Learning · Statistics 2019-06-18 Adrien Spagnol, Rodolphe Le Riche, Sebastien Da Veiga

Choosing good subsamples for regression modelling

A common problem in health research is that we have a large database with many variables measured on a large number of individuals. We are interested in measuring additional variables on a subsample; these measurements may be newly…

Methodology · Statistics 2022-03-22 Thomas Lumley, Tong Chen

Supervising Feature Influence

Causal influence measures for machine learnt classifiers shed light on the reasons behind classification, and aid in identifying influential input features and revealing their biases. However, such analyses involve evaluating the classifier…

Machine Learning · Computer Science 2018-04-10 Shayak Sen, Piotr Mardziel, Anupam Datta, Matthew Fredrikson

Learning Local Metrics and Influential Regions for Classification

The performance of distance-based classifiers heavily depends on the underlying distance metric, so it is valuable to learn a suitable metric from the data. To address the problem of multimodality, it is desirable to learn local metrics. In…

Machine Learning · Computer Science 2018-02-13 Mingzhi Dong, Yujiang Wang, Xiaochen Yang, Jing-Hao Xue

Difference-in-Differences under Local Dependence on Networks

Estimating causal effects under interference, where the stable unit treatment value assumption is violated, is critical in fields such as regional and public economics. Much of the existing research on causal inference under interference…

Methodology · Statistics 2026-02-03 Akihiro Sato, Shonosuke Sugasawa

High-dimensional influence measure

Influence diagnosis is important since the presence of influential observations could lead to distorted analysis and misleading interpretations. For high-dimensional data, it is particularly so, as the increased dimensionality and complexity…

Statistics Theory · Mathematics 2013-11-27 Junlong Zhao, Chenlei Leng, Lexin Li, Hansheng Wang

"Influence Sketching": Finding Influential Samples In Large-Scale Regressions

There is an especially strong need in modern large-scale data analysis to prioritize samples for manual inspection. For example, the inspection could target important mislabeled samples or key vulnerabilities exploitable by an adversarial…

Machine Learning · Statistics 2017-05-11 Mike Wojnowicz, Ben Cruz, Xuan Zhao, Brian Wallace, Matt Wolff, Jay Luan, Caleb Crable

Curvature as a tool for evaluating dimensionality reduction and estimating intrinsic dimension

Utilizing recently developed abstract notions of sectional curvature, we introduce a method for constructing a curvature-based geometric profile of discrete metric spaces. The curvature concept that we use here captures the metric relations…

Computer Vision and Pattern Recognition · Computer Science 2025-09-18 Charlotte Beylier, Parvaneh Joharinad, Jürgen Jost, Nahid Torbati

Causal Mediation Analysis: Selection with Asymptotically Valid Inference

Researchers are often interested in learning not only the effect of treatments on outcomes, but also the pathways through which these effects operate. A mediator is a variable that is affected by treatment and subsequently affects outcome.…

Methodology · Statistics 2021-12-22 Jeremiah Jones, Ashkan Ertefaie, Robert L. Strawderman

Designing Observables for Measurements with Deep Learning

Many analyses in particle and nuclear physics use simulations to infer fundamental, effective, or phenomenological parameters of the underlying physics models. When the inference is performed with unfolded cross sections, the observables…

Data Analysis, Statistics and Probability · Physics 2024-09-19 Owen Long, Benjamin Nachman

Optimal Sub-sampling with Influence Functions

Sub-sampling is a common and often effective method to deal with the computational challenges of large datasets. However, for most statistical models, there is no well-motivated approach for drawing a non-uniform subsample. We show that the…

Machine Learning · Statistics 2017-09-07 Daniel Ting, Eric Brochu

Perturbation-based inference for diffusion processes: Obtaining effective models from multiscale data

We consider the inference problem for parameters in stochastic differential equation models from discrete time observations (e.g. experimental or simulation data). Specifically, we study the case where one does not have access to…

Numerical Analysis · Mathematics 2018-04-10 Sebastian Krumscheid

Long Time Influence of Small Perturbations and Motion on the Simplex of Invariant Probability Measures

A general approach to a broad class of asymptotic problems related to long-time influence of small perturbations, of both deterministic and stochastic type, is presented in the paper. The main characteristic of this influence is a limiting…

Probability · Mathematics 2020-10-06 Mark Freidlin

A robust approach to model-based classification based on trimming and constraints

In a standard classification framework a set of trustworthy learning data are employed to build a decision rule, with the final aim of classifying unlabelled units belonging to the test set. Therefore, unreliable labelled observations,…

Applications · Statistics 2019-11-20 Andrea Cappozzo, Francesca Greselin, Thomas Brendan Murphy

Why did the shape of your network change? (On detecting network anomalies via non-local curvatures)

Anomaly detection problems (also called change-point detection problems) have been studied in data mining, statistics and computer science over the last several decades in applications such as medical condition monitoring and…

Data Structures and Algorithms · Computer Science 2019-12-23 Bhaskar DasGupta, Mano Vikash Janardhanan, Farzane Yahyanejad

Adjusting for Unmeasured Confounding in Marginal Structural Models with Propensity-Score Fixed Effects

Marginal structural models are a popular tool for investigating the effects of time-varying treatments, but they require an assumption of no unobserved confounders between the treatment and outcome. With observational data, this assumption…

Methodology · Statistics 2021-06-10 Matthew Blackwell, Soichiro Yamauchi