Related papers: Generalized Maximum Entropy Differential Dynamic P…
We study the expected utility maximization problem with a constant relative risk aversion utility function in a complete market under the reinforcement learning framework. To induce exploration, we introduce the Tsallis entropy regularizer, which…
In this paper, we present a new class of Markov decision processes (MDPs), called Tsallis MDPs, with Tsallis entropy maximization, which generalizes existing maximum entropy reinforcement learning (RL). A Tsallis MDP provides a unified…
By using the maximum entropy principle with Tsallis entropy we obtain a fragment size distribution function which undergoes a transition to scaling. This distribution function reduces to those obtained by other authors using Shannon…
A new method is proposed for analyzing complexity and studying the information in random geometric networks using Tsallis entropy. Tsallis entropy of the ensemble of random geometric networks is calculated based on the components of…
In this paper, we provide a generalized framework for Variational Inference-Stochastic Optimal Control by using the non-extensive Tsallis divergence. By incorporating the deformed exponential function into the optimality likelihood function,…
We study optimal control in models with latent factors where the agent controls the distribution over actions, rather than actions themselves, in both discrete and continuous time. To encourage exploration of the state space, we reward…
In this paper, we consider the information content of maximum ranked set sampling procedure with unequal samples (MRSSU) in terms of Tsallis entropy which is a nonadditive generalization of Shannon entropy. We obtain several results of…
This study derived the vertical distribution of streamwise velocity in wide open channels by maximizing Tsallis entropy, in accordance with the maximum entropy principle, subject to the total probability rule and the conservation of mass,…
In this paper, we investigate new procedures for statistical testing based on Tsallis entropy, a parametric generalization of Shannon entropy. Focusing on multivariate generalized Gaussian and $q$-Gaussian distributions, we develop…
In this paper, we present a novel maximum entropy formulation of the Differential Dynamic Programming algorithm and derive two variants using unimodal and multimodal value function parameterizations. By combining the maximum entropy…
We introduce a variational algorithm based on Matrix Product States that is trained by minimizing a generalized free energy defined using Tsallis entropy instead of the standard Gibbs entropy. As a result, our model can generate the…
Within a framework of utmost generality, we show that the entropy maximization procedure with linear constraints uniquely leads to the Shannon-Boltzmann-Gibbs entropy. Therefore, the use of this procedure with linear constraints should not…
This paper studies continuous-time reinforcement learning in jump-diffusion models, featuring q-learning (the continuous-time counterpart of Q-learning) under Tsallis entropy regularization. Contrary to the Shannon entropy, the…
We present a novel second-order trajectory optimization algorithm based on Stein Variational Newton's Method and Maximum Entropy Differential Dynamic Programming. The proposed algorithm, called Stein Variational Differential Dynamic…
In this paper, inspired by our previous algorithm, which was based on the theory of Tsallis statistical mechanics, we develop a new evolving stochastic learning algorithm for neural networks. The new algorithm combines deterministic and…
The construction of efficient and effective decision trees remains a key topic in machine learning because of their simplicity and flexibility. A lot of heuristic algorithms have been proposed to construct near-optimal decision trees. ID3,…
Algorithmic entropy and Shannon entropy are two conceptually different information measures, as the former is based on the size of programs and the latter on probability distributions. However, it is known that, for any recursive probability…
Sample-based trajectory optimisers are a promising tool for the control of robotics with non-differentiable dynamics and cost functions. Contemporary approaches derive from a restricted subclass of stochastic optimal control where the…
An amended MaxEnt formulation for systems displaced from the conventional MaxEnt equilibrium is proposed. This formulation involves the minimization of the Kullback-Leibler divergence to a reference $Q$ (or maximization of Shannon…
Optimization of expensive computer models with the help of Gaussian process emulators is now commonplace. However, when several (competing) objectives are considered, choosing an appropriate sampling strategy remains an open question. We…