Related papers: Entropy Regularization under Bayesian Drift Uncert…
We study the Markowitz portfolio selection problem with unknown drift vector in the multidimensional framework. The prior belief on the uncertain expected rate of return is modeled by an arbitrary probability law, and a Bayesian approach…
This paper considers mean-variance optimization under uncertainty, specifically when one desires a sparsified set of optimal portfolio weights. From the standpoint of a Bayesian investor, our approach produces a small portfolio from many…
In this paper, we study the mean-variance portfolio selection problem under partial information with drift uncertainty. First, we show that the market model remains complete in this case even though the information is incomplete and the drift is…
Motivated by the trade-off between exploitation and exploration in reinforcement learning, we study a continuous-time entropy-regularized mean-variance portfolio selection problem in the presence of jumps. We propose an exploratory SDE for…
In the Bayesian reinforcement learning (RL) setting, a prior distribution over the unknown problem parameters -- the rewards and transitions -- is assumed, and a policy that optimizes the (posterior) expected return is sought. A common…
Entropy regularization is known to improve exploration in sequential decision-making problems. We show that this same mechanism can also lead to nearly unbiased and lower-variance estimates of the mean reward in the optimize-and-estimate…
This paper investigates optimal portfolio strategies in a financial market where the drift of the stock returns is driven by an unobserved Gaussian mean reverting process. Information on this process is obtained from observing stock returns…
The generalisation and robustness properties of policies learnt through Maximum-Entropy Reinforcement Learning are investigated on chaotic dynamical systems with Gaussian noise on the observable. First, the robustness under noise…
We approach the continuous-time mean-variance (MV) portfolio selection with reinforcement learning (RL). The problem is to achieve the best tradeoff between exploration and exploitation, and is formulated as an entropy-regularized, relaxed…
Addressing uncertainty is critical for autonomous systems to robustly adapt to the real world. We formulate the problem of model uncertainty as a continuous Bayes-Adaptive Markov Decision Process (BAMDP), where an agent maintains a…
We revisit Markowitz's mean-variance portfolio selection model by considering a distributionally robust version, where the region of distributional uncertainty is around the empirical measure and the discrepancy between probability measures…
We study the problem of optimal portfolio selection under stochastic volatility within a continuous time reinforcement learning framework with portfolio constraints. Exploration is modeled through entropy-regularized relaxed controls, where…
A large class of stochastic programs involves optimizing an expectation taken with respect to an underlying distribution that is unknown in practice. One popular approach to addressing the distributional uncertainty, known as the…
Sample-based trajectory optimisers are a promising tool for the control of robots with non-differentiable dynamics and cost functions. Contemporary approaches derive from a restricted subclass of stochastic optimal control where the…
In this paper, both dynamic mean-variance portfolio selection problems and dynamic variance hedging problems are discussed under a non-Markovian framework. Explicit closed-loop equilibrium strategies of these problems are respectively…
Classical mean-variance portfolio theory tells us how to construct a portfolio of assets which has the greatest expected return for a given level of return volatility. Utility theory then allows an investor to choose the point along this…
We consider the minimization over probability measures of the expected value of a random variable, regularized by relative entropy with respect to a given probability distribution. In the general setting we provide a complete…
The paper solves the problem of optimal portfolio choice when the parameters of the asset return distribution, such as the mean vector and the covariance matrix, are unknown and have to be estimated by using historical data of the asset…
Bayesian optimization is a coherent, ubiquitous approach to decision-making under uncertainty, with applications including multi-arm bandits, active learning, and black-box optimization. Bayesian optimization selects decisions (i.e.…
Entropy regularization is inspired by information entropy from machine learning and by the ideas of exploration and exploitation in reinforcement learning; it appears in the control problem as a means of designing an approximating algorithm for the…