Solving Parameter-Robust Avoid Problems with Unknown Feasibility using Reinforcement Learning

Machine Learning; Robotics; Optimization and Control (2026-02-18, v1)

Abstract

Recent advances in deep reinforcement learning (RL) have achieved strong results on high-dimensional control tasks, but applying RL to reachability problems raises a fundamental mismatch: reachability seeks to maximize the set of states from which a system remains safe indefinitely, while RL optimizes expected returns over a user-specified distribution. This mismatch can result in policies that perform poorly on low-probability states that are still within the safe set. A natural alternative is to frame the problem as a robust optimization over a set of initial conditions that specify the initial state, dynamics and safe set, but whether this problem has a solution depends on the feasibility of the specified set, which is unknown a priori. We propose Feasibility-Guided Exploration (FGE), a method that simultaneously identifies a subset of feasible initial conditions under which a safe policy exists, and learns a policy to solve the reachability problem over this set of initial conditions. Empirical results demonstrate that FGE learns policies with over 50% more coverage than the best existing method for challenging initial conditions across tasks in the MuJoCo simulator and the Kinetix simulator with pixel observations.
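
The mismatch described above can be illustrated with a small numerical toy. This is only a sketch under stated assumptions, not the FGE algorithm. The safety table, the sampling probabilities, and the function name `compare_objectives` are all hypothetical. A parameter is treated as feasible when at least one of the candidate policies stays safe from it, which is a stand-in for the true (unknown) feasible set.

```python
import numpy as np

def compare_objectives(safe, prob):
    """Contrast an expected-safety objective with worst-case objectives.

    safe[i, j] is True if hypothetical policy i stays safe from
    initial-condition parameter j; prob[j] is the sampling probability.
    """
    safe = np.asarray(safe, dtype=bool)
    prob = np.asarray(prob, dtype=float)
    # RL-style objective: safety rate in expectation over the distribution.
    expected = safe.astype(float) @ prob
    # Robust objective over ALL parameters: safe only if safe everywhere.
    worst_all = safe.all(axis=1)
    # Assumption: a parameter is feasible if some candidate policy is safe.
    feasible = safe.any(axis=0)
    # Robust objective restricted to the feasible parameters.
    worst_feasible = safe[:, feasible].all(axis=1)
    return expected, worst_all, feasible, worst_feasible

# Made-up data: parameter 3 is rare but feasible, parameter 4 is infeasible.
safe = [
    [1, 1, 1, 0, 0],  # policy A: fails on the rare feasible parameter
    [1, 1, 1, 1, 0],  # policy B: safe on every feasible parameter
]
prob = [0.4, 0.3, 0.2, 0.05, 0.05]

expected, worst_all, feasible, worst_feasible = compare_objectives(safe, prob)
print("expected safety rate:", expected)
print("robust over all parameters:", worst_all)
print("feasible parameters:", feasible)
print("robust over feasible parameters:", worst_feasible)
```

In this toy, the two policies differ only slightly in expected safety rate, and neither satisfies the robust objective over all parameters because one parameter admits no safe policy. Restricting the worst case to the feasible parameters is what separates them, which is why the feasible subset has to be identified alongside the policy.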

Cite

@article{arxiv.2602.15817,
  title  = {Solving Parameter-Robust Avoid Problems with Unknown Feasibility using Reinforcement Learning},
  author = {Oswin So and Eric Yang Yu and Songyuan Zhang and Matthew Cleaveland and Mitchell Black and Chuchu Fan},
  journal= {arXiv preprint arXiv:2602.15817},
  year   = {2026}
}

Comments

ICLR 2026. The project page can be found at https://oswinso.xyz/fge
