Related papers: Optimizing Return Distributions with Distributiona…

200 papers

In dynamic programming (DP) and reinforcement learning (RL), an agent learns to act optimally in terms of expected long-term return by sequentially interacting with its environment modeled by a Markov decision process (MDP). More generally…

Machine Learning · Computer Science 2022-01-03 Mastane Achab, Gergely Neu

To date, distributional reinforcement learning (distributional RL) methods have exclusively focused on the discounted setting, where an agent aims to optimize a discounted sum of rewards over time. In this work, we extend distributional RL…

Machine Learning · Computer Science 2026-01-14 Juan Sebastian Rojas, Chi-Guhn Lee

What are the functionals of the reward that can be computed and optimized exactly in Markov Decision Processes? In the finite-horizon, undiscounted setting, Dynamic Programming (DP) can only handle these operations efficiently for certain…

Artificial Intelligence · Computer Science 2024-02-20 Alexandre Marthe, Aurélien Garivier, Claire Vernade

Differential Dynamic Programming (DDP) has become a well established method for unconstrained trajectory optimization. Despite its several applications in robotics and controls however, a widely successful constrained version of the…

Optimization and Control · Mathematics 2020-05-05 Yuichiro Aoyama, George Boutselis, Akash Patel, Evangelos A. Theodorou

We consider a general class of Dynamic Programming (DP) problems with non-separable objective functions. We show that for any problem in this class, there exists an augmented-state DP problem which satisfies the Principle of Optimality and…

Optimization and Control · Mathematics 2020-06-11 Morgan Jones, Matthew M. Peet

We introduce a novel class of algorithms to efficiently approximate the unknown return distributions in policy evaluation problems from distributional reinforcement learning (DRL). The proposed distributional dynamic programming algorithms…

Machine Learning · Statistics 2024-07-22 Julian Gerstenberg, Ralph Neininger, Denis Spiegel

With the development of deep learning, the Dynamic Portfolio Optimization (DPO) problem has received a lot of attention in recent years, not only in the field of finance but also in the field of deep learning. Some advanced research in recent…

Computational Engineering, Finance, and Science · Computer Science 2025-01-16 Runsheng Lin, Zihan Xing, Mingze Ma, Raymond S. T. Lee

We propose and study a simple model of dynamical redistribution of capital in a diversified portfolio. We consider a hypothetical situation of a portfolio composed of N uncorrelated stocks. Each stock price follows a multiplicative random…

Statistical Mechanics · Physics 2015-06-25 Matteo Marsili, Sergei Maslov, Yi-Cheng Zhang

This paper presents a deep reinforcement learning (DRL) framework for dynamic portfolio optimization under market uncertainty and risk. The proposed model integrates a Sharpe ratio-based reward function with direct risk control mechanisms,…

Portfolio Management · Quantitative Finance 2025-11-17 Emmanuel Lwele, Sabuni Emmanuel, Sitali Gabriel Sitali

This paper develops a dynamic programming (DP) approach for decentralized stochastic optimal control problems with delayed sharing information patterns, which exhibits the fundamental properties of classical DP of centralized partially…

Systems and Control · Electrical Eng. & Systems 2026-04-28 Charalambos D. Charalambous, Umarbek Guvercin, Seddik Djouadi

This article introduces a novel distributionally robust model predictive control (DRMPC) algorithm for a specific class of controlled dynamical systems where the disturbance multiplies the state and control variables. These classes of…

Optimization and Control · Mathematics 2024-10-04 Souvik Das, Siddhartha Ganguly, Ashwin Aravind, Debasish Chatterjee

Approximate dynamic programming is a popular method for solving large Markov decision processes. This paper describes a new class of approximate dynamic programming (ADP) methods, distributionally robust ADP, that address the curse of…

Machine Learning · Statistics 2012-05-22 Marek Petrik

Differential Dynamic Programming (DDP) is an efficient computational tool for solving nonlinear optimal control problems. It was originally designed as a single shooting method and thus is sensitive to the initial guess supplied. This work…

Robotics · Computer Science 2023-09-29 He Li, Wenhao Yu, Tingnan Zhang, Patrick M. Wensing

Distributionally robust optimization (DRO) problems are increasingly seen as a viable method to train machine learning models for improved model generalization. These min-max formulations, however, are more difficult to solve. We therefore…

Machine Learning · Statistics 2020-11-03 Soumyadip Ghosh, Mark Squillante, Ebisa Wollega

We present a unifying framework for designing and analysing distributional reinforcement learning (DRL) algorithms in terms of recursively estimating statistics of the return distribution. Our key insight is that DRL algorithms can be…

Machine Learning · Statistics 2019-02-22 Mark Rowland, Robert Dadashi, Saurabh Kumar, Rémi Munos, Marc G. Bellemare, Will Dabney

We consider non-standard Markov Decision Processes (MDPs) where the target function is not only a simple expectation of the accumulated reward. Instead, we consider rather general functionals of the joint distribution of terminal state and…

Optimization and Control · Mathematics 2025-10-16 Nicole Bäuerle, Tamara Göll, Anna Jaśkiewicz

Stochastic and (distributionally) robust optimization problems often become computationally challenging as the number of scenarios or data points increases. Scenario reduction is therefore a key technique for improving tractability. We…

Optimization and Control · Mathematics 2026-03-10 Kevin-Martin Aigner, Sebastian Denzler, Frauke Liers, Sebastian Pokutta, Kartikey Sharma

Deep Reinforcement Learning (DRL) algorithms can scale to previously intractable problems. The automation of profit generation in the stock market is possible using DRL, by combining the financial assets price "prediction" step and the…

Trading and Market Microstructure · Quantitative Finance 2022-09-20 Taylan Kabbani, Ekrem Duman

Differential Dynamic Programming (DDP) is one of the indirect methods for solving an optimal control problem. Several extensions to DDP have been proposed to add stagewise state and control constraints, which can mainly be classified as…

Optimization and Control · Mathematics 2024-09-19 Siddharth Prabhu, Srinivas Rangarajan, Mayuresh Kothare

Distributional reinforcement learning, which focuses on learning the entire return distribution instead of only its expectation in standard RL, has demonstrated remarkable success in enhancing performance. Despite these advancements, our…

Machine Learning · Computer Science 2024-09-24 Ke Sun, Bei Jiang, Linglong Kong