Related papers: Algorithms and Bounds for Rollout Sampling Approxi…

Rollout Sampling Approximate Policy Iteration

Several researchers have recently investigated the connection between reinforcement learning and classification. We are motivated by proposals of approximate policy iteration schemes without value functions which focus on policy…

Machine Learning · Computer Science 2008-07-06 Christos Dimitrakakis, Michail G. Lagoudakis

Policy Optimization Through Approximate Importance Sampling

Recent policy optimization approaches (Schulman et al., 2015a; 2017) have achieved substantial empirical successes by constructing new proxy optimization objectives. These proxy objectives allow stable and low variance policy learning, but…

Machine Learning · Computer Science 2020-02-24 Marcin B. Tomczak, Dongho Kim, Peter Vrancx, Kee-Eung Kim

Optimizing adaptive sampling via Policy Ranking

Efficient sampling in biomolecular simulations is critical for accurately capturing the complex dynamical behaviors of biological systems. Adaptive sampling techniques aim to improve efficiency by focusing computational resources on the…

Biomolecules · Quantitative Biology 2024-10-22 Hassan Nadeem, Diwakar Shukla

Classification-based Approximate Policy Iteration: Experiments and Extended Discussions

Tackling large approximate dynamic programming or reinforcement learning problems requires methods that can exploit regularities, or intrinsic structure, of the problem in hand. Most current methods are geared towards exploiting the…

Machine Learning · Computer Science 2014-07-03 Amir-massoud Farahmand, Doina Precup, André M. S. Barreto, Mohammad Ghavamzadeh

Is Your Imitation Learning Policy Better than Mine? Policy Comparison with Near-Optimal Stopping

Imitation learning has enabled robots to perform complex, long-horizon tasks in challenging dexterous manipulation settings. As new methods are developed, they must be rigorously evaluated and compared against corresponding baselines…

Robotics · Computer Science 2025-06-09 David Snyder, Asher James Hancock, Apurva Badithela, Emma Dixon, Patrick Miller, Rares Andrei Ambrus, Anirudha Majumdar, Masha Itkina, Haruki Nishimura

Sampling-guided exploration of active feature selection policies

Determining the most appropriate features for machine learning predictive models is challenging regarding performance and feature acquisition costs. In particular, global feature choice is limited given that some features will only benefit…

Machine Learning · Computer Science 2026-03-17 Gabriel Bernardino, Anders Jonsson, Patrick Clarysse, Nicolas Duchateau

Learning Implicit Sampling Distributions for Motion Planning

Sampling-based motion planners have experienced much success due to their ability to efficiently and evenly explore the state space. However, for many tasks, it may be more efficient to not uniformly explore the state space, especially when…

Robotics · Computer Science 2018-06-07 Clark Zhang, Jinwook Huh, Daniel D. Lee

Marginalized State Distribution Entropy Regularization in Policy Optimization

Entropy regularization is used to get improved optimization performance in reinforcement learning tasks. A common form of regularization is to maximize policy entropy to avoid premature convergence and lead to more stochastic policies for…

Machine Learning · Computer Science 2019-12-12 Riashat Islam, Zafarali Ahmed, Doina Precup

Efficient Sampling Policy for Selecting a Good Enough Subset

The note studies the problem of selecting a good enough subset out of a finite number of alternatives under a fixed simulation budget. Our work aims to maximize the posterior probability of correctly selecting a good subset. We formulate…

Optimization and Control · Mathematics 2023-05-09 Gongbo Zhang, Bin Chen, Qing-shan Jia, Yijie Peng

Entropy Regularization with Discounted Future State Distribution in Policy Gradient Methods

The policy gradient theorem is defined based on an objective with respect to the initial distribution over states. In the discounted case, this results in policies that are optimal for one distribution over initial states, but may not be…

Machine Learning · Computer Science 2019-12-12 Riashat Islam, Raihan Seraj, Pierre-Luc Bacon, Doina Precup

Enhancing Stratified Graph Sampling Algorithms based on Approximate Degree Distribution

Sampling technique has become one of the recent research focuses in the graph-related fields. Most of the existing graph sampling algorithms tend to sample the high degree or low degree nodes in the complex networks because of the…

Social and Information Networks · Computer Science 2018-02-02 Junpeng Zhu, Hui Li, Mei Chen, Zhenyu Dai, Ming Zhu

The Sample Complexity of Search over Multiple Populations

This paper studies the sample complexity of searching over multiple populations. We consider a large number of populations, each corresponding to either distribution P0 or P1. The goal of the search problem studied here is to find one…

Information Theory · Computer Science 2016-11-17 Matthew L. Malloy, Gongguo Tang, Robert D. Nowak

Optimal Single-Policy Sample Complexity and Transient Coverage for Average-Reward Offline RL

We study offline reinforcement learning in average-reward MDPs, which presents increased challenges from the perspectives of distribution shift and non-uniform coverage, and has been relatively underexamined from a theoretical perspective.…

Machine Learning · Computer Science 2026-04-23 Matthew Zurek, Guy Zamir, Yudong Chen

Generalized Proximal Policy Optimization with Sample Reuse

In real-world decision making tasks, it is critical for data-driven reinforcement learning methods to be both stable and sample efficient. On-policy methods typically generate reliable policy improvement throughout training, while…

Machine Learning · Computer Science 2021-11-02 James Queeney, Ioannis Ch. Paschalidis, Christos G. Cassandras

Improving optimal subsampling through stratification

Recent works have proposed optimal subsampling algorithms to improve computational efficiency in large datasets and to design validation studies in the presence of measurement error. Existing approaches generally fall into two categories:…

Methodology · Statistics 2025-12-25 Jasper B. Yang, Thomas Lumley, Bryan E. Shepherd, Pamela A. Shaw

On the Theory of Policy Gradient Methods: Optimality, Approximation, and Distribution Shift

Policy gradient methods are among the most effective methods in challenging reinforcement learning problems with large state and/or action spaces. However, little is known about even their most basic theoretical convergence properties,…

Machine Learning · Computer Science 2020-10-16 Alekh Agarwal, Sham M. Kakade, Jason D. Lee, Gaurav Mahajan

Compressed imitation learning

In analogy to compressed sensing, which allows sample-efficient signal reconstruction given prior knowledge of its sparsity in frequency domain, we propose to utilize policy simplicity (Occam's Razor) as a prior to enable sample-efficient…

Machine Learning · Computer Science 2020-09-25 Nathan Zhao, Beicheng Lou

Reaching Goals is Hard: Settling the Sample Complexity of the Stochastic Shortest Path

We study the sample complexity of learning an $\epsilon$-optimal policy in the Stochastic Shortest Path (SSP) problem. We first derive sample complexity bounds when the learner has access to a generative model. We show that there exists a…

Machine Learning · Computer Science 2022-10-12 Liyu Chen, Andrea Tirinzoni, Matteo Pirotta, Alessandro Lazaric

On the Sample Complexity of Reinforcement Learning with Policy Space Generalization

We study the optimal sample complexity in large-scale Reinforcement Learning (RL) problems with policy space generalization, i.e. the agent has a prior knowledge that the optimal policy lies in a known policy space. Existing results show…

Machine Learning · Computer Science 2020-08-18 Wenlong Mou, Zheng Wen, Xi Chen

Sample Complexity of Power System State Estimation using Matrix Completion

In this paper, we propose an analytical framework to quantify the amount of data samples needed to obtain accurate state estimation in a power system - a problem known as sample complexity analysis in computer science. Motivated by the…

Optimization and Control · Mathematics 2019-09-20 Joshua Comden, Marcello Colombino, Andrey Bernstein, Zhenhua Liu