Related papers: ModelPred: A Framework for Predicting Trained Mode…

Datamodels: Predicting Predictions from Training Data

We present a conceptual framework, datamodeling, for analyzing the behavior of a model class in terms of the training data. For any fixed "target" example $x$, training set $S$, and learning algorithm, a datamodel is a parameterized…

Machine Learning · Statistics 2022-02-02 Andrew Ilyas, Sung Min Park, Logan Engstrom, Guillaume Leclerc, Aleksander Madry

Data Complexity-aware Deep Model Performance Forecasting

Deep learning models are widely used across computer vision and other domains. When working on the model induction, selecting the right architecture for a given dataset often relies on repetitive trial-and-error procedures. This procedure…

Machine Learning · Computer Science 2026-01-06 Yen-Chia Chen, Hsing-Kuo Pao, Hanjuan Huang

On the Generalization Ability of Unsupervised Pretraining

Recent advances in unsupervised learning have shown that unsupervised pre-training, followed by fine-tuning, can improve model generalization. However, a rigorous understanding of how the representation function learned on an unlabeled…

Machine Learning · Computer Science 2024-03-12 Yuyang Deng, Junyuan Hong, Jiayu Zhou, Mehrdad Mahdavi

ModelDiff: A Framework for Comparing Learning Algorithms

We study the problem of (learning) algorithm comparison, where the goal is to find differences between models trained with two different learning algorithms. We begin by formalizing this goal as one of finding distinguishing feature…

Machine Learning · Computer Science 2022-11-23 Harshay Shah, Sung Min Park, Andrew Ilyas, Aleksander Madry

Generating Samples to Probe Trained Models

There is a growing need for investigating how machine learning models operate. With this work, we aim to understand trained machine learning models by questioning their data preferences. We propose a mathematical framework that allows us to…

Machine Learning · Computer Science 2025-12-22 Eren Mehmet Kıral, Nurşen Aydın, Ş. İlker Birbil

A Model-Based Approach to Imitation Learning through Multi-Step Predictions

Imitation learning is a widely used approach for training agents to replicate expert behavior in complex decision-making tasks. However, existing methods often struggle with compounding errors and limited generalization, due to the inherent…

Machine Learning · Computer Science 2025-04-21 Haldun Balim, Yang Hu, Yuyang Zhang, Na Li

Learning Sample Difficulty from Pre-trained Models for Reliable Prediction

Large-scale pre-trained models have achieved remarkable success in many applications, but how to leverage them to improve the prediction reliability of downstream models is undesirably under-explored. Moreover, modern neural networks have…

Machine Learning · Computer Science 2023-10-31 Peng Cui, Dan Zhang, Zhijie Deng, Yinpeng Dong, Jun Zhu

Hub-Pathway: Transfer Learning from A Hub of Pre-trained Models

Transfer learning aims to leverage knowledge from pre-trained models to benefit the target task. Prior transfer learning work mainly transfers from a single model. However, with the emergence of deep models pre-trained from different…

Machine Learning · Computer Science 2022-11-07 Yang Shu, Zhangjie Cao, Ziyang Zhang, Jianmin Wang, Mingsheng Long

Has Your Pretrained Model Improved? A Multi-head Posterior Based Approach

The emergence of pre-trained models has significantly impacted fields ranging from Natural Language Processing (NLP) and Computer Vision to relational datasets. Traditionally, these models are assessed through fine-tuned downstream tasks. However, this raises…

Computation and Language · Computer Science 2024-02-16 Prince Aboagye, Yan Zheng, Junpeng Wang, Uday Singh Saini, Xin Dai, Michael Yeh, Yujie Fan, Zhongfang Zhuang, Shubham Jain, Liang Wang, Wei Zhang

Statistical Foundations of Prior-Data Fitted Networks

Prior-data fitted networks (PFNs) were recently proposed as a new paradigm for machine learning. Instead of training the network to an observed training set, a fixed model is pre-trained offline on small, simulated training sets from a…

Machine Learning · Statistics 2023-05-19 Thomas Nagler

Preventing Failures Due to Dataset Shift: Learning Predictive Models That Transport

Classical supervised learning produces unreliable models when training and target distributions differ, with most existing solutions requiring samples from the target domain. We propose a proactive approach which learns a relationship in…

Machine Learning · Statistics 2019-03-01 Adarsh Subbaswamy, Peter Schulam, Suchi Saria

Pretraining and the Lasso

Pretraining is a popular and powerful paradigm in machine learning to pass information from one model to another. As an example, suppose one has a modest-sized dataset of images of cats and dogs, and plans to fit a deep neural network to…

Methodology · Statistics 2024-10-31 Erin Craig, Mert Pilanci, Thomas Le Menestrel, Balasubramanian Narasimhan, Manuel Rivas, Stein-Erik Gullaksen, Roozbeh Dehghannasiri, Julia Salzman, Jonathan Taylor, Robert Tibshirani

Pretraining Federated Text Models for Next Word Prediction

Federated learning is a decentralized approach for training models on distributed devices, by summarizing local changes and sending aggregate parameters from local models to the cloud rather than the data itself. In this research we employ…

Machine Learning · Computer Science 2020-08-19 Joel Stremmel, Arjun Singh

Supervised Pretraining for Molecular Force Fields and Properties Prediction

Machine learning approaches have become popular for molecular modeling tasks, including molecular force fields and properties prediction. Traditional supervised learning methods suffer from scarcity of labeled data for particular tasks,…

Chemical Physics · Physics 2022-11-29 Xiang Gao, Weihao Gao, Wenzhi Xiao, Zhirui Wang, Chong Wang, Liang Xiang

Selective Prediction via Training Dynamics

Selective Prediction is the task of rejecting inputs a model would predict incorrectly on. This involves a trade-off between input space coverage (how many data points are accepted) and model utility (how good is the performance on accepted…

Machine Learning · Computer Science 2025-07-08 Stephan Rabanser, Anvith Thudi, Kimia Hamidieh, Adam Dziedzic, Israfil Bahceci, Akram Bin Sediq, Hamza Sokun, Nicolas Papernot

Examining the Effect of Pre-training on Time Series Classification

Although the pre-training followed by fine-tuning paradigm is used extensively in many fields, there is still some controversy surrounding the impact of pre-training on the fine-tuning process. Currently, experimental findings based on text…

Machine Learning · Computer Science 2023-09-12 Jiashu Pu, Shiwei Zhao, Ling Cheng, Yongzhu Chang, Runze Wu, Tangjie Lv, Rongsheng Zhang

A Novel DNN Training Framework via Data Sampling and Multi-Task Optimization

Conventional DNN training paradigms typically rely on one training set and one validation set, obtained by partitioning an annotated dataset used for training, namely gross training set, in a certain way. The training set is used for…

Neural and Evolutionary Computing · Computer Science 2020-07-03 Boyu Zhang, A. K. Qin, Hong Pan, Timos Sellis

Why pre-training is beneficial for downstream classification tasks?

Pre-training has exhibited notable benefits to downstream tasks by boosting accuracy and speeding up convergence, but the exact reasons for these benefits still remain unclear. To this end, we propose to quantitatively and explicitly…

Machine Learning · Computer Science 2024-10-14 Xin Jiang, Xu Cheng, Zechao Li

These Are Not All the Features You Are Looking For: A Fundamental Bottleneck in Supervised Pretraining

Transfer learning is widely used to adapt large pretrained models to new tasks with only a small amount of new data. However, a challenge persists -- the features from the original task often do not fully cover what is needed for unseen…

Machine Learning · Computer Science 2026-02-10 Xingyu Alice Yang, Jianyu Zhang, Léon Bottou

Meta-learning autoencoders for few-shot prediction

Compared to humans, machine learning models generally require significantly more training examples and fail to extrapolate from experience to solve previously unseen challenges. To help close this performance gap, we augment single-task…

Machine Learning · Computer Science 2018-07-27 Tailin Wu, John Peurifoy, Isaac L. Chuang, Max Tegmark