Related papers: Self-guided Approximate Linear Programs

Partitioned Linear Programming Approximations for MDPs

Approximate linear programming (ALP) is an efficient approach to solving large factored Markov decision processes (MDPs). The main idea of the method is to approximate the optimal value function by a set of basis functions and optimize…

Artificial Intelligence · Computer Science 2012-06-18 Branislav Kveton, Milos Hauskrecht

Practical Linear Value-approximation Techniques for First-order MDPs

Recent work on approximate linear programming (ALP) techniques for first-order Markov Decision Processes (FOMDPs) represents the value function linearly w.r.t. a set of first-order basis functions and uses linear programming techniques to…

Artificial Intelligence · Computer Science 2012-07-02 Scott Sanner, Craig Boutilier

A Linearly Relaxed Approximate Linear Program for Markov Decision Processes

Approximate linear programming (ALP) and its variants have been widely applied to Markov Decision Processes (MDPs) with a large number of states. A serious limitation of ALP is that it has an intractable number of constraints, as a result…

Systems and Control · Computer Science 2017-04-11 Chandrashekar Lakshminarayanan, Shalabh Bhatnagar, Csaba Szepesvari

Approximate dynamic programming with $(\min,+)$ linear function approximation for Markov decision processes

Markov Decision Processes (MDPs) are a useful framework for casting optimal sequential decision-making problems. Given any MDP, the aim is to find the optimal action selection mechanism, i.e., the optimal policy. Typically, the optimal policy…

Systems and Control · Computer Science 2014-03-18 Chandrashekar Lakshminarayanan, Shalabh Bhatnagar

A Generalized Reduced Linear Program for Markov Decision Processes

Markov decision processes (MDPs) with a large number of states are of high practical interest. However, conventional algorithms for solving MDPs are computationally infeasible in this scenario. Approximate dynamic programming (ADP) methods tackle…

Systems and Control · Computer Science 2014-11-19 Chandrashekar Lakshminarayanan, Shalabh Bhatnagar

Efficient Planning in Large MDPs with Weak Linear Function Approximation

Large-scale Markov decision processes (MDPs) require planning algorithms with runtime independent of the number of states of the MDP. We consider the planning problem in MDPs using linear value function approximation with only weak…

Machine Learning · Computer Science 2020-07-14 Roshan Shariff, Csaba Szepesvári

An Analysis of State-Relevance Weights and Sampling Distributions on L1-Regularized Approximate Linear Programming Approximation Accuracy

Recent interest in $L_1$ regularization for value function approximation includes Petrik et al.'s introduction of $L_1$-Regularized Approximate Linear Programming (RALP). RALP is unique among $L_1$-regularized…

Artificial Intelligence · Computer Science 2014-04-25 Gavin Taylor, Connor Geer, David Piekut

Scalable Bilinear $\pi$ Learning Using State and Action Features

Approximate linear programming (ALP) represents one of the major algorithmic families to solve large-scale Markov decision processes (MDP). In this work, we study a primal-dual formulation of the ALP, and develop a scalable, model-free…

Machine Learning · Computer Science 2018-04-30 Yichen Chen, Lihong Li, Mengdi Wang

Approximate Dynamic Programming with Neural Networks in Linear Discrete Action Spaces

Real-world problems of operations research are typically high-dimensional and combinatorial. Linear programs are generally used to formulate and efficiently solve these large decision problems. However, in multi-period decision problems, we…

Machine Learning · Computer Science 2019-02-27 Wouter van Heeswijk, Han La Poutré

Approximate Linear Programming for First-order MDPs

We introduce a new approximate solution technique for first-order Markov decision processes (FOMDPs). Representing the value function linearly w.r.t. a set of first-order basis functions, we compute suitable weights by casting the…

Artificial Intelligence · Computer Science 2012-07-09 Scott Sanner, Craig Boutilier

Best Policy Identification in Linear MDPs

We investigate the problem of best policy identification in discounted linear Markov Decision Processes in the fixed confidence setting under a generative model. We first derive an instance-specific lower bound on the expected number of…

Machine Learning · Computer Science 2022-08-12 Jerome Taupin, Yassir Jedra, Alexandre Proutiere

Deep Amortized Inference for Probabilistic Programs

Probabilistic programming languages (PPLs) are a powerful modeling tool, able to represent any computable probability distribution. Unfortunately, probabilistic program inference is often intractable, and existing PPLs mostly rely on…

Artificial Intelligence · Computer Science 2016-10-19 Daniel Ritchie, Paul Horsfall, Noah D. Goodman

Linear Programming for Decision Processes with Partial Information

Markov Decision Processes (MDPs) are stochastic optimization problems that model situations where a decision maker controls a system based on its state. Partially observed Markov decision processes (POMDPs) are generalizations of MDPs where…

Optimization and Control · Mathematics 2019-03-26 Victor Cohen, Axel Parmentier

Relational Linear Programs

We propose relational linear programming, a simple framework for combining linear programs (LPs) and logic programs. A relational linear program (RLP) is a declarative LP template defining the objective and the constraints through the logical…

Artificial Intelligence · Computer Science 2014-10-14 Kristian Kersting, Martin Mladenov, Pavel Tokmakov

Approximate Linear Programming for Decentralized Policy Iteration in Cooperative Multi-agent Markov Decision Processes

In this work, we consider a cooperative multi-agent Markov decision process (MDP) involving m agents. At each decision epoch, all the m agents independently select actions in order to maximize a common long-term objective. In the policy…

Machine Learning · Computer Science 2024-05-01 Lakshmi Mandal, Chandrashekar Lakshminarayanan, Shalabh Bhatnagar

On Sample Complexity of Projection-Free Primal-Dual Methods for Learning Mixture Policies in Markov Decision Processes

We study the problem of learning a policy for an infinite-horizon, discounted-cost Markov decision process (MDP) with a large number of states. We compute the actions of a policy that is nearly as good as a policy chosen by a suitable oracle…

Machine Learning · Computer Science 2019-09-02 Masoud Badiei Khuzani, Varun Vasudevan, Hongyi Ren, Lei Xing

Finite-State Approximations to Discounted and Average Cost Constrained Markov Decision Processes

In this paper, we consider the finite-state approximation of a discrete-time constrained Markov decision process (MDP) under the discounted and average cost criteria. Using the linear programming formulation of the constrained discounted…

Optimization and Control · Mathematics 2018-07-10 Naci Saldi

Fine-Tuning Language Models with Advantage-Induced Policy Alignment

Reinforcement learning from human feedback (RLHF) has emerged as a reliable approach to aligning large language models (LLMs) to human preferences. Among the plethora of RLHF techniques, proximal policy optimization (PPO) is one of the most…

Computation and Language · Computer Science 2023-11-06 Banghua Zhu, Hiteshi Sharma, Felipe Vieira Frujeri, Shi Dong, Chenguang Zhu, Michael I. Jordan, Jiantao Jiao

Linear Programming for Large-Scale Markov Decision Problems

We consider the problem of controlling a Markov decision process (MDP) with a large state space, so as to minimize average cost. Since it is intractable to compete with the optimal policy for large scale problems, we pursue the more modest…

Optimization and Control · Mathematics 2014-02-28 Yasin Abbasi-Yadkori, Peter L. Bartlett, Alan Malek

Approximate Dynamic Programming For Linear Systems with State and Input Constraints

Enforcing state and input constraints during reinforcement learning (RL) in continuous state spaces is an open but crucial problem which remains a roadblock to using RL in safety-critical applications. This paper leverages invariant sets to…

Systems and Control · Electrical Eng. & Systems 2019-06-28 Ankush Chakrabarty, Rien Quirynen, Claus Danielson, Weinan Gao