Related papers: MathZero, The Classification Problem, and Set-Theo…

An AlphaZero-Inspired Approach to Solving Search Problems

AlphaZero and its extension MuZero are computer programs that use machine-learning techniques to play at a superhuman level in chess, Go, and a few other games. They achieved this level of play solely with reinforcement learning from…

Artificial Intelligence · Computer Science 2022-07-05 Evgeny Dantsin, Vladik Kreinovich, Alexander Wolpert

Self-Play Learning Without a Reward Metric

The AlphaZero algorithm for the learning of strategy games via self-play, which has produced superhuman ability in the games of Go, chess, and shogi, uses a quantitative reward function for game outcomes, requiring the users of the…

Machine Learning · Computer Science 2019-12-17 Dan Schmidt, Nick Moran, Jonathan S. Rosenfeld, Jonathan Rosenthal, Jonathan Yedidia

Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model

Constructing agents with planning capabilities has long been one of the main challenges in the pursuit of artificial intelligence. Tree-based planning methods have enjoyed huge success in challenging domains, such as chess and Go, where a…

Machine Learning · Computer Science 2021-01-27 Julian Schrittwieser, Ioannis Antonoglou, Thomas Hubert, Karen Simonyan, Laurent Sifre, Simon Schmitt, Arthur Guez, Edward Lockhart, Demis Hassabis, Thore Graepel, Timothy Lillicrap, David Silver

Multiplayer AlphaZero

The AlphaZero algorithm has achieved superhuman performance in two-player, deterministic, zero-sum games where perfect information of the game state is available. This success has been demonstrated in Chess, Shogi, and Go where learning…

Artificial Intelligence · Computer Science 2019-12-10 Nick Petosa, Tucker Balch

Score vs. Winrate in Score-Based Games: which Reward for Reinforcement Learning?

In recent years, the DeepMind algorithm AlphaZero has become the state of the art to efficiently tackle perfect information two-player zero-sum games with a win/lose outcome. However, when the win/lose outcome is decided by a final score…

Artificial Intelligence · Computer Science 2023-01-10 Luca Pasqualini, Gianluca Amato, Marco Fantozzi, Rosa Gini, Alessandro Marchetti, Carlo Metta, Francesco Morandin, Maurizio Parton

Mastering the Game of Go with Self-play Experience Replay

The game of Go has long served as a benchmark for artificial intelligence, demanding sophisticated strategic reasoning and long-term planning. Previous approaches, such as AlphaGo and its successors, have predominantly relied on model-based…

Artificial Intelligence · Computer Science 2026-01-08 Jingbin Liu, Xuechun Wang

OptionZero: Planning with Learned Options

Planning with options -- a sequence of primitive actions -- has been shown to be effective in reinforcement learning within complex environments. Previous studies have focused on planning with predefined options or learned options through expert…

Artificial Intelligence · Computer Science 2025-03-24 Po-Wei Huang, Pei-Chiun Peng, Hung Guei, Ti-Rong Wu

Assessing Game Balance with AlphaZero: Exploring Alternative Rule Sets in Chess

It is non-trivial to design engaging and balanced sets of game rules. Modern chess has evolved over centuries, but without a similar recourse to history, the consequences of rule changes to game dynamics are difficult to predict. AlphaZero…

Artificial Intelligence · Computer Science 2020-09-16 Nenad Tomašev, Ulrich Paquet, Demis Hassabis, Vladimir Kramnik

From Gameplay to Symbolic Reasoning: Learning SAT Solver Heuristics in the Style of Alpha(Go) Zero

Despite the recent successes of deep neural networks in various fields such as image and speech recognition, natural language processing, and reinforcement learning, we still face big challenges in bringing the power of numeric optimization…

Artificial Intelligence · Computer Science 2018-02-16 Fei Wang, Tiark Rompf

Algebraic models of dependent type theory

The rules governing the essentially algebraic notion of a category with families have been observed (independently) by Steve Awodey and Marcelo Fiore to precisely match those of a representable natural transformation between presheaves.…

Category Theory · Mathematics 2021-03-11 Clive Newstead

Policy-Based Self-Competition for Planning Problems

AlphaZero-type algorithms may stop improving on single-player tasks in case the value network guiding the tree search is unable to approximate the outcome of an episode sufficiently well. One technique to address this problem is…

Machine Learning · Computer Science 2023-06-08 Jonathan Pirnay, Quirin Göttl, Jakob Burger, Dominik Gerhard Grimm

Are Dependent Types in Set Theory Feasible?

Following the types-as-sets paradigm, we present a mechanized embedding of dependent function types with a hierarchy of universes into schematic first-order logic with equality, with axiom schemas of Tarski-Grothendieck set theory. We carry…

Logic in Computer Science · Computer Science 2026-03-16 Yunsong Yang, Simon Guilloud, Viktor Kunčak

Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm

The game of chess is the most widely-studied domain in the history of artificial intelligence. The strongest programs are based on a combination of sophisticated search techniques, domain-specific adaptations, and handcrafted evaluation…

Artificial Intelligence · Computer Science 2017-12-06 David Silver, Thomas Hubert, Julian Schrittwieser, Ioannis Antonoglou, Matthew Lai, Arthur Guez, Marc Lanctot, Laurent Sifre, Dharshan Kumaran, Thore Graepel, Timothy Lillicrap, Karen Simonyan, Demis Hassabis

The Entropy of Artificial Intelligence and a Case Study of AlphaZero from Shannon's Perspective

The recently released AlphaZero algorithm achieves superhuman performance in the games of chess, shogi and Go, which raises two open questions. Firstly, as there is a finite number of possibilities in the game, is there a quantifiable…

Artificial Intelligence · Computer Science 2018-12-18 Bo Zhang, Bin Chen, Jin-lin Peng

Introduction to Homotopy Type Theory

This is an introductory textbook to univalent mathematics and homotopy type theory, a mathematical foundation that takes advantage of the structural nature of mathematical definitions and constructions. It is common in mathematical practice…

Logic · Mathematics 2022-12-22 Egbert Rijke

Classifying different criteria for learning algebraic structures

In recent years, there has been a growing interest in the study of learning problems associated with algebraic structures. The framework we use models the scenario in which a learner is given larger and larger fragments of a structure from…

Logic · Mathematics 2024-10-31 Nikolay Bazhenov, Vittorio Cipriani, Sanjay Jain, Luca San Mauro, Frank Stephan

Set Theory in the Foundation of Math; Internal Classes and External Sets

Usual math sets have special types: countable, compact, open, occasionally Borel, rarely projective, etc. Each such set is described by a single Set Theory formula with parameters unrelated to other formulas. Exotic expressions involving…

Logic in Computer Science · Computer Science 2026-04-01 Leonid A. Levin

Impartial Games: A Challenge for Reinforcement Learning

AlphaZero-style reinforcement learning (RL) algorithms have achieved superhuman performance in many complex board games such as Chess, Shogi, and Go. However, we showcase that these algorithms encounter significant and fundamental…

Machine Learning · Computer Science 2026-01-22 Bei Zhou, Søren Riis

Hyper-Parameter Sweep on AlphaZero General

Since AlphaGo and AlphaGo Zero have achieved groundbreaking successes in the game of Go, the programs have been generalized to solve other tasks. Subsequently, AlphaZero was developed to play Go, Chess and Shogi. In the literature, the…

Machine Learning · Computer Science 2019-03-20 Hui Wang, Michael Emmerich, Mike Preuss, Aske Plaat

AlphaBeta is not as good as you think: a simple class of synthetic games for a better analysis of deterministic game-solving algorithms

Deterministic game-solving algorithms are conventionally analyzed in the light of their average-case complexity against a distribution of random game-trees, where leaf values are independently sampled from a fixed distribution. This…

Artificial Intelligence · Computer Science 2026-02-06 Raphaël Boige, Amine Boumaza, Bruno Scherrer