Related papers: Visualizing MuZero Models

Improving Model-Based Reinforcement Learning with Internal State Representations through Self-Supervision

Using a model of the environment, reinforcement learning agents can plan their future moves and achieve superhuman performance in board games like Chess, Shogi, and Go, while remaining relatively sample-efficient. As demonstrated by the…

Machine Learning · Computer Science 2022-01-19 Julien Scholz, Cornelius Weber, Muhammad Burhan Hafez, Stefan Wermter

What model does MuZero learn?

Model-based reinforcement learning (MBRL) has drawn considerable interest in recent years, given its promise to improve sample efficiency. Moreover, when using deep-learned models, it is possible to learn compact and generalizable models…

Machine Learning · Computer Science 2024-10-15 Jinke He, Thomas M. Moerland, Joery A. de Vries, Frans A. Oliehoek

Continuous Control for Searching and Planning with a Learned Model

Decision-making agents with planning capabilities have achieved huge success in the challenging domain like Chess, Shogi, and Go. In an effort to generalize the planning ability to the more general tasks where the environment dynamics are…

Artificial Intelligence · Computer Science 2020-06-23 Xuxi Yang, Werner Duvaud, Peng Wei

Demystifying MuZero Planning: Interpreting the Learned Model

MuZero has achieved superhuman performance in various games by using a dynamics network to predict the environment dynamics for planning, without relying on simulators. However, the latent states learned by the dynamics network make its…

Artificial Intelligence · Computer Science 2025-07-18 Hung Guei, Yan-Ru Ju, Wei-Yu Chen, Ti-Rong Wu

Equivariant MuZero

Deep reinforcement learning repeatedly succeeds in closed, well-defined domains such as games (Chess, Go, StarCraft). The next frontier is real-world scenarios, where setups are numerous and varied. For this, agents need to learn the…

Machine Learning · Computer Science 2023-02-10 Andreea Deac, Théophane Weber, George Papamakarios

Calibrated Value-Aware Model Learning with Probabilistic Environment Models

The idea of value-aware model learning, that models should produce accurate value estimates, has gained prominence in model-based reinforcement learning. The MuZero loss, which penalizes a model's value function prediction compared to the…

Machine Learning · Computer Science 2025-06-10 Claas Voelcker, Anastasiia Pedan, Arash Ahmadian, Romina Abachi, Igor Gilitschenski, Amir-massoud Farahmand

Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model

Constructing agents with planning capabilities has long been one of the main challenges in the pursuit of artificial intelligence. Tree-based planning methods have enjoyed huge success in challenging domains, such as chess and Go, where a…

Machine Learning · Computer Science 2021-01-27 Julian Schrittwieser, Ioannis Antonoglou, Thomas Hubert, Karen Simonyan, Laurent Sifre, Simon Schmitt, Arthur Guez, Edward Lockhart, Demis Hassabis, Thore Graepel, Timothy Lillicrap, David Silver

Learning and Planning in Complex Action Spaces

Many important real-world problems have action spaces that are high-dimensional, continuous or both, making full enumeration of all possible actions infeasible. Instead, only small subsets of actions can be sampled for the purpose of policy…

Machine Learning · Computer Science 2021-04-14 Thomas Hubert, Julian Schrittwieser, Ioannis Antonoglou, Mohammadamin Barekatain, Simon Schmitt, David Silver

UniZero: Generalized and Efficient Planning with Scalable Latent World Models

Learning predictive world models is crucial for enhancing the planning capabilities of reinforcement learning (RL) agents. Recently, MuZero-style algorithms, leveraging the value equivalence principle and Monte Carlo Tree Search (MCTS),…

Machine Learning · Computer Science 2025-01-06 Yuan Pu, Yazhe Niu, Zhenjie Yang, Jiyuan Ren, Hongsheng Li, Yu Liu

Mastering the Game of Go with Self-play Experience Replay

The game of Go has long served as a benchmark for artificial intelligence, demanding sophisticated strategic reasoning and long-term planning. Previous approaches such as AlphaGo and its successors, have predominantly relied on model-based…

Artificial Intelligence · Computer Science 2026-01-08 Jingbin Liu, Xuechun Wang

On the role of planning in model-based deep reinforcement learning

Model-based planning is often thought to be necessary for deep, careful reasoning and generalization in artificial agents. While recent successes of model-based reinforcement learning (MBRL) with deep function approximation have…

Artificial Intelligence · Computer Science 2021-03-18 Jessica B. Hamrick, Abram L. Friesen, Feryal Behbahani, Arthur Guez, Fabio Viola, Sims Witherspoon, Thomas Anthony, Lars Buesing, Petar Veličković, Théophane Weber

$\lambda$-models: Effective Decision-Aware Reinforcement Learning with Latent Models

The idea of decision-aware model learning, that models should be accurate where it matters for decision-making, has gained prominence in model-based reinforcement learning. While promising theoretical results have been established, the…

Machine Learning · Computer Science 2024-03-04 Claas A Voelcker, Arash Ahmadian, Romina Abachi, Igor Gilitschenski, Amir-massoud Farahmand

OptionZero: Planning with Learned Options

Planning with options -- a sequence of primitive actions -- has been shown effective in reinforcement learning within complex environments. Previous studies have focused on planning with predefined options or learned options through expert…

Artificial Intelligence · Computer Science 2025-03-24 Po-Wei Huang, Pei-Chiun Peng, Hung Guei, Ti-Rong Wu

An AlphaZero-Inspired Approach to Solving Search Problems

AlphaZero and its extension MuZero are computer programs that use machine-learning techniques to play at a superhuman level in chess, go, and a few other games. They achieved this level of play solely with reinforcement learning from…

Artificial Intelligence · Computer Science 2022-07-05 Evgeny Dantsin, Vladik Kreinovich, Alexander Wolpert

DouZero: Mastering DouDizhu with Self-Play Deep Reinforcement Learning

Games are abstractions of the real world, where artificial agents learn to compete and cooperate with other agents. While significant achievements have been made in various perfect- and imperfect-information games, DouDizhu (a.k.a. Fighting…

Artificial Intelligence · Computer Science 2021-06-14 Daochen Zha, Jingru Xie, Wenye Ma, Sheng Zhang, Xiangru Lian, Xia Hu, Ji Liu

Representation Matters for Mastering Chess: Improved Feature Representation in AlphaZero Outperforms Switching to Transformers

While transformers have gained recognition as a versatile tool for artificial intelligence (AI), an unexplored challenge arises in the context of chess - a classical AI benchmark. Here, incorporating Vision Transformers (ViTs) into…

Artificial Intelligence · Computer Science 2024-08-21 Johannes Czech, Jannis Blüml, Kristian Kersting, Hedinn Steingrimsson

SkyNet: Belief-Aware Planning for Partially-Observable Stochastic Games

In 2019, Google DeepMind released MuZero, a model-based reinforcement learning method that achieves strong results in perfect-information games by combining learned dynamics models with Monte Carlo Tree Search (MCTS). However, comparatively…

Artificial Intelligence · Computer Science 2026-03-31 Adam Haile

Fuzzy Ensembles of Reinforcement Learning Policies for Robotic Systems with Varied Parameters

Reinforcement Learning (RL) is an emerging approach to control many dynamical systems for which classical control approaches are not applicable or insufficient. However, the resultant policies may not generalize to variations in the…

Robotics · Computer Science 2023-11-13 Abdel Gafoor Haddad, Mohammed B. Mohiuddin, Igor Boiko, Yahya Zweiri

Value-Consistent Representation Learning for Data-Efficient Reinforcement Learning

Deep reinforcement learning (RL) algorithms suffer severe performance degradation when the interaction data is scarce, which limits their real-world application. Recently, visual representation learning has been shown to be effective and…

Machine Learning · Computer Science 2022-08-17 Yang Yue, Bingyi Kang, Zhongwen Xu, Gao Huang, Shuicheng Yan

Value Driven Representation for Human-in-the-Loop Reinforcement Learning

Interactive adaptive systems powered by Reinforcement Learning (RL) have many potential applications, such as intelligent tutoring systems. In such systems there is typically an external human system designer that is creating, monitoring…

Artificial Intelligence · Computer Science 2020-04-06 Ramtin Keramati, Emma Brunskill