Learning and Improving Backgammon Strategy

Machine Learning 2025-04-04 v1 Artificial Intelligence Neural and Evolutionary Computing

Abstract

A novel approach to learning is presented that combines features of on-line and off-line methods to achieve considerable performance in learning a backgammon value function, in a process that exploits the processing power of parallel supercomputers. The off-line methods comprise a set of techniques for parallelizing neural network training and TD(λ) reinforcement learning. Monte-Carlo "rollouts" are then introduced as a massively parallel on-line policy improvement technique, which applies resources to the decision points encountered during the search of the game tree to further augment the learned value function estimate. A level of play roughly as good as, or possibly better than, the current champion human and computer backgammon players has been achieved in a short period of learning.
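The rollout idea in the abstract can be illustrated with a minimal sketch: for each candidate position reachable from the current decision point, simulate many games to the end and pick the candidate with the best average outcome. Everything named here is an assumption made for illustration and is not taken from the paper: `choose_move_by_rollout`, `rollout_value`, and the toy one-sided dice race `toy_play_out` are hypothetical stand-ins. In a real system the play-out would presumably follow the learned value function rather than a fixed dice race.

```python
import random
from statistics import mean


def rollout_value(position, play_out, n_rollouts, rng):
    """Average outcome of n_rollouts simulated games started from position."""
    return mean(play_out(position, rng) for _ in range(n_rollouts))


def choose_move_by_rollout(candidates, play_out, n_rollouts=500, seed=0):
    """Score every candidate position by Monte-Carlo rollouts; return the best.

    candidates: positions reachable by the legal moves (hypothetical encoding).
    play_out:   function (position, rng) -> outcome, higher is better.
    """
    rng = random.Random(seed)
    # Each rollout is an independent simulation, so this loop could be
    # distributed across many processors; here it simply runs serially.
    scores = {
        pos: rollout_value(pos, play_out, n_rollouts, rng) for pos in candidates
    }
    best = max(scores, key=scores.get)
    return best, scores


def toy_play_out(pips_left, rng):
    """Toy stand-in for a game (NOT backgammon): a one-sided dice race.

    Roll two dice up to four times; the result is a win (1.0) if the
    remaining pip count reaches zero within those rolls, else a loss (0.0).
    """
    for _ in range(4):
        pips_left -= rng.randint(1, 6) + rng.randint(1, 6)
        if pips_left <= 0:
            return 1.0
    return 0.0


if __name__ == "__main__":
    # Three hypothetical candidate positions, encoded by pips left to bear off.
    best, scores = choose_move_by_rollout([20, 28, 36], toy_play_out)
    print("best candidate:", best)
    for pos, score in scores.items():
        print(f"  pips_left={pos}: estimated win rate {score:.2f}")
```

In this sketch the estimate for each candidate improves as `n_rollouts` grows, at a cost that is linear in the number of simulations, which is one reason the technique suits parallel hardware.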

Cite

@article{arxiv.2504.02221,
  title  = {Learning and Improving Backgammon Strategy},
  author = {Gregory R. Galperin},
  journal= {arXiv preprint arXiv:2504.02221},
  year   = {2025}
}

Comments

Accompanied by oral presentation by Gregory Galperin at the CBCL Learning Day 1994
