Related papers: Multi-environment Invariance Learning with Missing…

Equivariance and Invariance Inductive Bias for Learning from Insufficient Data

We are interested in learning robust models from insufficient data, without the need for any externally pre-trained checkpoints. First, compared to sufficient data, we show why insufficient data renders the model more easily biased to the…

Computer Vision and Pattern Recognition · Computer Science · 2022-09-07 · Tan Wang, Qianru Sun, Sugiri Pranata, Karlekar Jayashree, Hanwang Zhang

Environment Invariant Linear Least Squares

This paper considers a multi-environment linear regression model in which data from multiple experimental settings are collected. The joint distribution of the response variable and covariates may vary across different environments, yet the…

Statistics Theory · Mathematics · 2024-12-03 · Jianqing Fan, Cong Fang, Yihong Gu, Tong Zhang

Learning Optimal Features via Partial Invariance

Learning models that are robust to distribution shifts is a key concern in the context of their real-life applicability. Invariant Risk Minimization (IRM) is a popular framework that aims to learn robust models from multiple environments.…

Machine Learning · Computer Science · 2023-04-04 · Moulik Choraria, Ibtihal Ferwana, Ankur Mani, Lav R. Varshney

Mining Invariance from Nonlinear Multi-Environment Data: Binary Classification

Making predictions in an unseen environment given data from multiple training environments is a challenging task. We approach this problem from an invariance perspective, focusing on binary classification to shed light on general nonlinear…

Methodology · Statistics · 2024-07-08 · Austin Goddard, Kang Du, Yu Xiang

Invariant Causal Mechanisms through Distribution Matching

Learning representations that capture the underlying data generating process is a key problem for data efficient and robust use of neural networks. One key property for robustness which the learned representation should capture and which…

Machine Learning · Computer Science · 2022-06-24 · Mathieu Chevalley, Charlotte Bunne, Andreas Krause, Stefan Bauer

Robust Representation Learning through Explicit Environment Modeling

We consider learning from labeled data collected across multiple environments, where the data distribution may vary across these environments. This problem is commonly approached from a causal perspective, seeking invariant representations…

Machine Learning · Statistics · 2026-04-30 · Yuli Slavutsky, David M. Blei

On the Relation between Prediction and Imputation Accuracy under Missing Covariates

Missing covariates in regression or classification problems can prohibit the direct use of advanced tools for further analysis. Recent research has realized an increasing trend towards the usage of modern Machine Learning algorithms for…

Machine Learning · Statistics · 2022-03-23 · Burim Ramosaj, Justus Tulowietzki, Markus Pauly

ZIN: When and How to Learn Invariance Without Environment Partition?

It is commonplace to encounter heterogeneous data, of which some aspects of the data distribution may vary but the underlying causal mechanisms remain constant. When data are divided into distinct environments according to the…

Machine Learning · Computer Science · 2022-10-12 · Yong Lin, Shengyu Zhu, Lu Tan, Peng Cui

Balancing Fairness and Robustness via Partial Invariance

The Invariant Risk Minimization (IRM) framework aims to learn invariant features from a set of environments for solving the out-of-distribution (OOD) generalization problem. The underlying assumption is that the causal components of the…

Machine Learning · Computer Science · 2021-12-28 · Moulik Choraria, Ibtihal Ferwana, Ankur Mani, Lav R. Varshney

Learning Invariant Representations under General Interventions on the Response

It has become increasingly common nowadays to collect observations of feature and response pairs from different environments. As a consequence, one has to apply learned predictors to data with a different distribution due to distribution…

Methodology · Statistics · 2023-10-31 · Kang Du, Yu Xiang

On the consistency of supervised learning with missing values

In many application settings, the data have missing entries which make analysis challenging. An abundant literature addresses missing values in an inferential framework: estimating parameters and their variance from incomplete tables. Here,…

Machine Learning · Statistics · 2024-03-22 · Julie Josse, Jacob M. Chen, Nicolas Prost, Erwan Scornet, Gaël Varoquaux

Learning Causal Structures Using Regression Invariance

We study causal inference in a multi-environment setting, in which the functional relations for producing the variables from their direct causes remain the same across environments, while the distribution of exogenous noises may vary. We…

Machine Learning · Computer Science · 2017-05-29 · AmirEmad Ghassami, Saber Salehkaleybar, Negar Kiyavash, Kun Zhang

Robust semiparametric inference with missing data

Classical semiparametric inference with missing outcome data is not robust to contamination of the observed data and a single observation can have arbitrarily large influence on estimation of a parameter of interest. This sensitivity is…

Methodology · Statistics · 2021-03-02 · Eva Cantoni, Xavier de Luna

Causal Invariance Learning via Efficient Nonconvex Optimization

Identifying the causal relationship among variables from observational data is an important yet challenging task. This work focuses on identifying the direct causes of an outcome and estimating their magnitude, i.e., learning the causal…

Methodology · Statistics · 2026-01-08 · Zhenyu Wang, Yifan Hu, Peter Bühlmann, Zijian Guo

Bayesian Environment Invariant Regression

The availability of data from multiple heterogeneous environments has motivated methods that remain reliable under distributional shifts. When the joint distribution of response and predictors varies across environments, the response may…

Methodology · Statistics · 2026-04-29 · Ruqian Zhang, Juan Shen, Yijiao Zhang

Heterogeneity, Uncertainty and Learning: Semiparametric Identification and Estimation

We provide identification results for a broad class of learning models in which continuous outcomes depend on three types of unobservables: known heterogeneity, initially unknown heterogeneity that may be revealed over time, and transitory…

Econometrics · Economics · 2025-06-25 · Jackson Bunting, Paul Diegert, Arnaud Maurel

Repeated Environment Inference for Invariant Learning

We study the problem of invariant learning when the environment labels are unknown. We focus on the invariant representation notion when the Bayes optimal conditional label distribution is the same across different environments. Previous…

Machine Learning · Computer Science · 2022-08-09 · Aayush Mishra, Anqi Liu

Weighted Risk Invariance: Domain Generalization under Invariant Feature Shift

Learning models whose predictions are invariant under multiple environments is a promising approach for out-of-distribution generalization. Such models are trained to extract features $X_{\text{inv}}$ where the conditional distribution $Y…

Machine Learning · Computer Science · 2024-07-29 · Gina Wong, Joshua Gleason, Rama Chellappa, Yoav Wald, Anqi Liu

Efficient nonparametric causal inference with missing exposure information

Missing exposure information is a very common feature of many observational studies. Here we study identifiability and efficient estimation of causal effects on vector outcomes, in such cases where treatment is unconfounded but partially…

Methodology · Statistics · 2020-02-04 · Edward H. Kennedy

An efficient and doubly robust empirical likelihood approach for estimating equations with missing data

This paper considers an empirical likelihood inference for parameters defined by general estimating equations, when data are missing at random. The efficiency of existing estimators depends critically on correctly specifying the conditional…

Methodology · Statistics · 2016-12-06 · Tianqing Liu, Xiaohui Yuan, Zhaohai Li, Aiyi Liu