Related papers: Changing Model Behavior at Test-Time Using Reinfor…

Reinforcement Learning with an Abrupt Model Change

The problem of reinforcement learning is considered where the environment or the model undergoes a change. An algorithm is proposed that an agent can apply in such a problem to achieve the optimal long-time discounted reward. The algorithm…

Systems and Control · Electrical Eng. & Systems 2023-04-25 Wuxia Chen, Taposh Banerjee, Jemin George, Carl Busart

Mixup for Test-Time Training

Test-time training provides a new approach to solving the problem of domain shift. In its framework, a test-time training phase is inserted between the training phase and the test phase. During the test-time training phase, usually parts of the model are…

Machine Learning · Computer Science 2022-10-05 Bochao Zhang, Rui Shao, Jingda Du, PC Yuen

RLTutor: Reinforcement Learning Based Adaptive Tutoring System by Modeling Virtual Student with Fewer Interactions

A major challenge in the field of education is providing review schedules that present learned items at appropriate intervals to each student so that memory is retained over time. In recent years, attempts have been made to formulate item…

Artificial Intelligence · Computer Science 2021-08-03 Yoshiki Kubotani, Yoshihiro Fukuhara, Shigeo Morishima

Optimizing Language Models for Inference Time Objectives using Reinforcement Learning

In this work, we investigate the merits of explicitly optimizing for inference time algorithmic performance during model training. We show how optimizing for inference time performance can improve overall model efficacy. We consider generic…

Machine Learning · Computer Science 2025-08-19 Yunhao Tang, Kunhao Zheng, Gabriel Synnaeve, Rémi Munos

Model-Based Reinforcement Learning for Control under Time-Varying Dynamics

Learning-based control methods typically assume stationary system dynamics, an assumption often violated in real-world systems due to drift, wear, or changing operating conditions. We study reinforcement learning for control under…

Machine Learning · Computer Science 2026-04-03 Klemens Iten, Bruce Lee, Chenhao Li, Lenart Treven, Andreas Krause, Bhavya Sukhija

Time Limits in Reinforcement Learning

In reinforcement learning, it is common to let an agent interact for a fixed amount of time with its environment before resetting it and repeating the process in a series of episodes. The task that the agent has to learn can either be to…

Machine Learning · Computer Science 2022-01-28 Fabio Pardo, Arash Tavakoli, Vitaly Levdik, Petar Kormushev

Boosting Reinforcement Learning and Planning with Demonstrations: A Survey

Although reinforcement learning has seen tremendous success recently, this kind of trial-and-error learning can be impractical or inefficient in complex environments. The use of demonstrations, on the other hand, enables agents to benefit…

Machine Learning · Computer Science 2023-03-29 Tongzhou Mu, Hao Su

A Model-based Approach for Sample-efficient Multi-task Reinforcement Learning

The aim of multi-task reinforcement learning is two-fold: (1) efficiently learn by training against multiple tasks and (2) quickly adapt, using limited samples, to a variety of new tasks. In this work, the tasks correspond to reward…

Machine Learning · Computer Science 2019-11-05 Nicholas C. Landolfi, Garrett Thomas, Tengyu Ma

Reinforcement Learning for Machine Learning Model Deployment: Evaluating Multi-Armed Bandits in ML Ops Environments

In modern ML Ops environments, model deployment is a critical process that traditionally relies on static heuristics such as validation error comparisons and A/B testing. However, these methods require human intervention to adapt to…

Machine Learning · Computer Science 2025-03-31 S. Aaron McClendon, Vishaal Venkatesh, Juan Morinelli

Efficient Learning of Model Weights via Changing Features During Training

In this paper, we propose a machine learning model, which dynamically changes the features during training. Our main motivation is to update the model in a small content during the training process with replacing less descriptive features…

Machine Learning · Computer Science 2020-02-24 Marcell Beregi-Kovács, Ágnes Baran, András Hajdu

Accelerating Goal-Directed Reinforcement Learning by Model Characterization

We propose a hybrid approach aimed at improving the sample efficiency in goal-directed reinforcement learning. We do this via a two-step mechanism where firstly, we approximate a model from Model-Free reinforcement learning. Then, we…

Machine Learning · Computer Science 2019-01-09 Shoubhik Debnath, Gaurav Sukhatme, Lantao Liu

Model-based adaptation for sample efficient transfer in reinforcement learning control of parameter-varying systems

In this paper, we leverage ideas from model-based control to address the sample efficiency problem of reinforcement learning (RL) algorithms. Accelerating learning is an active field of RL highly relevant in the context of time-varying…

Systems and Control · Electrical Eng. & Systems 2023-05-23 Ibrahim Ahmed, Marcos Quinones-Grueiro, Gautam Biswas

On-Policy Model Errors in Reinforcement Learning

Model-free reinforcement learning algorithms can compute policy gradients given sampled environment transitions, but require large amounts of data. In contrast, model-based methods can use the learned model to generate new data, but model…

Machine Learning · Computer Science 2022-03-04 Lukas P. Fröhlich, Maksym Lefarov, Melanie N. Zeilinger, Felix Berkenkamp

Reinforcement Learning with Policy Mixture Model for Temporal Point Processes Clustering

Temporal point process is an expressive tool for modeling event sequences over time. In this paper, we take a reinforcement learning view whereby the observed sequences are assumed to be generated from a mixture of latent policies. The…

Machine Learning · Computer Science 2019-07-01 Weichang Wu, Junchi Yan, Xiaokang Yang, Hongyuan Zha

MEMO: Test Time Robustness via Adaptation and Augmentation

While deep neural networks can attain good accuracy on in-distribution test points, many applications require robustness even in the face of unexpected perturbations in the input, changes in the domain, or other sources of distribution…

Machine Learning · Computer Science 2022-10-12 Marvin Zhang, Sergey Levine, Chelsea Finn

Robust Constrained Reinforcement Learning

Constrained reinforcement learning is to maximize the expected reward subject to constraints on utilities/costs. However, the training environment may not be the same as the test one, due to, e.g., modeling error, adversarial attack,…

Machine Learning · Computer Science 2022-09-16 Yue Wang, Fei Miao, Shaofeng Zou

When to Trust Your Model: Model-Based Policy Optimization

Designing effective model-based reinforcement learning algorithms is difficult because the ease of data generation must be weighed against the bias of model-generated data. In this paper, we study the role of model usage in policy…

Machine Learning · Computer Science 2021-11-30 Michael Janner, Justin Fu, Marvin Zhang, Sergey Levine

Test-Time Training with Masked Autoencoders

Test-time training adapts to a new test distribution on the fly by optimizing a model for each test input using self-supervision. In this paper, we use masked autoencoders for this one-sample learning problem. Empirically, our simple method…

Computer Vision and Pattern Recognition · Computer Science 2022-09-16 Yossi Gandelsman, Yu Sun, Xinlei Chen, Alexei A. Efros

Model Imitation for Model-Based Reinforcement Learning

Model-based reinforcement learning (MBRL) aims to learn a dynamic model to reduce the number of interactions with real-world environments. However, due to estimation error, rollouts in the learned model, especially those of long horizons,…

Machine Learning · Computer Science 2020-03-17 Yueh-Hua Wu, Ting-Han Fan, Peter J. Ramadge, Hao Su

TeST: Test-time Self-Training under Distribution Shift

Despite their recent success, deep neural networks continue to perform poorly when they encounter distribution shifts at test time. Many recently proposed approaches try to counter this by aligning the model to the new distribution prior to…

Computer Vision and Pattern Recognition · Computer Science 2022-09-26 Samarth Sinha, Peter Gehler, Francesco Locatello, Bernt Schiele