Related papers: Assessing Model Generalization in Vicinity

200 papers

The key to out-of-distribution (OOD) generalization is to generalize invariance from training domains to target domains. Variance risk extrapolation (V-REx) is a practical OOD method that depends on a domain-level regularization…

Machine Learning · Computer Science 2021-04-12 Chuanlong Xie, Haotian Ye, Fei Chen, Yue Liu, Rui Sun, Zhenguo Li

Reliable generalization metrics are fundamental to the evaluation of machine learning models. Especially in high-stakes applications where labeled target data are scarce, evaluation of models' generalization performance under distribution…

Machine Learning · Computer Science 2026-04-10 Yunxiang Peng, Mengmeng Ma, Ziyu Yao, Xi Peng

Generalization to new samples is a fundamental rationale for statistical modeling. For this purpose, model validation is particularly important, but recent work in survey inference has suggested that simple aggregation of individual…

Methodology · Statistics 2024-04-15 Lauren Kennedy, Aki Vehtari, Andrew Gelman

Since out-of-distribution generalization is a generally ill-posed problem, various proxy targets (e.g., calibration, adversarial robustness, algorithmic corruptions, invariance across shifts) were studied across different research programs…

The NLP community typically relies on performance of a model on a held-out test set to assess generalization. Performance drops observed in datasets outside of official test sets are generally attributed to "out-of-distribution" effects.…

Computation and Language · Computer Science 2024-04-03 Aparna Elangovan, Jiayuan He, Yuan Li, Karin Verspoor

The growing prevalence of large language models (LLMs) and vision-language models (VLMs) has heightened the need for reliable techniques to determine whether a model has been fine-tuned from or is even identical to another. Existing…

Machine Learning · Computer Science 2025-09-30 Ruibo Chen, Sheng Zhang, Yihan Wu, Tong Zheng, Peihua Mai, Heng Huang

The vicinal risk minimization (VRM) principle, first proposed by Vapnik (1999), is an empirical risk minimization (ERM) variant that replaces Dirac masses with vicinal functions. Although there is strong numerical evidence…

Machine Learning · Computer Science 2018-11-13 Chao Zhang, Min-Hsiu Hsieh, Dacheng Tao

Assessing model generalization under distribution shift is essential for real-world deployment, particularly when labeled test data is unavailable. This paper presents a unified and practical framework for unsupervised model evaluation and…

Machine Learning · Computer Science 2025-10-06 Weijian Deng, Weijie Tu, Ibrahim Radwan, Mohammad Abu Alsheikh, Stephen Gould, Liang Zheng

Psychology research focuses on interactions, and this has deep implications for inference from non-representative samples. For the goal of estimating average treatment effects, we propose to fit a model allowing treatment to interact with…

Applications · Statistics 2020-04-15 Lauren Kennedy, Andrew Gelman

The vicinal risk minimization (VRM) principle is an empirical risk minimization (ERM) variant that replaces Dirac masses with vicinal functions. There is strong numerical and theoretical evidence showing that VRM outperforms ERM in terms of…

Machine Learning · Computer Science 2021-10-19 Puneet Mangla, Vedant Singh, Shreyas Jayant Havaldar, Vineeth N Balasubramanian

With the recent rapid development in deep learning, deep neural networks have been widely adopted in many real-life applications. However, deep neural networks are also known to have very little control over their uncertainty for unseen…

Machine Learning · Computer Science 2019-04-23 Wenhu Chen, Yilin Shen, Hongxia Jin, William Wang

This paper strives for domain generalization, where models are trained exclusively on source domains before being deployed on unseen target domains. We follow the strict separation of source training and target testing, but exploit the…

Machine Learning · Computer Science 2024-07-02 Sameer Ambekar, Zehao Xiao, Jiayi Shen, Xiantong Zhen, Cees G. M. Snoek

Recent work has shown that deep generative models can assign higher likelihood to out-of-distribution data sets than to their training data (Nalisnick et al., 2019; Choi et al., 2019). We posit that this phenomenon is caused by a mismatch…

Machine Learning · Statistics 2019-10-17 Eric Nalisnick, Akihiro Matsukawa, Yee Whye Teh, Balaji Lakshminarayanan

We present the first method for assessing the relevance of a model-based clustering result in a general framework. Standard validation criteria, like the adjusted Rand index, rely on external labels to assess partition accuracy;…

Statistics Theory · Mathematics 2026-03-30 Salima El Kolei, Matthieu Marbac

When machine learning models are deployed on a test distribution different from the training distribution, they can perform poorly, but overestimate their performance. In this work, we aim to better estimate a model's performance under…

Machine Learning · Computer Science 2020-07-08 Ching-Yao Chuang, Antonio Torralba, Stefanie Jegelka

Estimating the generalization performance is practically challenging on out-of-distribution (OOD) data without ground-truth labels. While previous methods emphasize the connection between distribution difference and OOD accuracy, we show…

Machine Learning · Computer Science 2023-10-24 Renchunzi Xie, Hongxin Wei, Lei Feng, Yuzhou Cao, Bo An

Generalization is the ability of a model to predict on unseen domains and is a fundamental task in machine learning. Several generalization bounds, both theoretical and empirical, have been proposed, but they do not provide tight bounds. In…

Machine Learning · Computer Science 2021-01-19 Sumukh Aithal K, Dhruva Kashyap, Natarajan Subramanyam

Generalization is a central problem in machine learning. Indeed, most prediction methods require careful calibration of hyperparameters, usually carried out on a hold-out validation dataset, to achieve generalization. The main goal of…

Machine Learning · Statistics 2021-02-18 Karim Lounici, Katia Meziani, Benjamin Riu

The generalization performance of a risk prediction model can be evaluated by its calibration, which measures the agreement between predicted and observed outcomes on external validation data. Here, methods for assessing the calibration of…

Methodology · Statistics 2020-01-31 Moritz Berger, Matthias Schmid

Recent years have seen substantial advances in our understanding of high-dimensional ridge regression, but existing theories assume that training examples are independent. By leveraging techniques from random matrix theory and free…

Machine Learning · Statistics 2025-11-06 Alexander Atanasov, Jacob A. Zavatone-Veth, Cengiz Pehlevan