TextArena

Leon Guertler; Bobby Cheng; Simon Yu; Bo Liu; Leshem Choshen; Cheston Tan

Computation and Language; Artificial Intelligence; Machine Learning; Multiagent Systems. 2025-05-27 (v2)

Abstract

TextArena is an open-source collection of competitive text-based games for training and evaluating agentic behavior in Large Language Models (LLMs). It spans 57+ unique environments (single-player, two-player, and multi-player) and supports straightforward evaluation of model capabilities through an online-play system, in which models compete against humans and other submitted models and receive real-time TrueSkill scores. Traditional benchmarks rarely assess dynamic social skills such as negotiation, theory of mind, and deception; TextArena addresses this gap. Designed with research, community, and extensibility in mind, TextArena makes it easy to add new games, adapt the framework, test models, play against models, and train models. Detailed documentation of the environments, games, and leaderboard, along with examples, is available at https://github.com/LeonGuertler/TextArena and https://www.textarena.ai/.
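
The abstract mentions real-time TrueSkill scores without describing the update. As a rough illustration only, the following sketch applies the standard two-player, no-draw TrueSkill update after a single game. The function name `trueskill_update`, the default values (mu = 25, sigma = 25/3, beta = 25/6), and the omission of the dynamics term and draw margin are assumptions made here for illustration; the source does not state which parameters or implementation TextArena uses.

```python
import math


def _pdf(x: float) -> float:
    """Standard normal probability density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def _cdf(x: float) -> float:
    """Standard normal cumulative distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def trueskill_update(winner, loser, beta=25.0 / 6.0):
    """Hypothetical helper: two-player TrueSkill update with no draws.

    `winner` and `loser` are (mu, sigma) tuples. The dynamics term (tau)
    and the draw margin are left out to keep the sketch short, so the
    numbers differ slightly from a full implementation.
    """
    mu_w, sigma_w = winner
    mu_l, sigma_l = loser

    # Total uncertainty about the performance difference.
    c = math.sqrt(2.0 * beta ** 2 + sigma_w ** 2 + sigma_l ** 2)
    t = (mu_w - mu_l) / c

    # Mean and variance correction factors for a win with zero draw margin.
    v = _pdf(t) / _cdf(t)
    w = v * (v + t)

    new_winner = (
        mu_w + (sigma_w ** 2 / c) * v,
        sigma_w * math.sqrt(1.0 - (sigma_w ** 2 / c ** 2) * w),
    )
    new_loser = (
        mu_l - (sigma_l ** 2 / c) * v,
        sigma_l * math.sqrt(1.0 - (sigma_l ** 2 / c ** 2) * w),
    )
    return new_winner, new_loser


if __name__ == "__main__":
    # Two previously unrated players; the first one wins.
    prior = (25.0, 25.0 / 3.0)
    print(trueskill_update(prior, prior))
```

In this sketch the winner's mean rises, the loser's falls by the same amount when both start from the same prior, and both uncertainties shrink, which is why a rating can stabilize as a model plays more games.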

Keywords

benchmarking, large language model evaluation, procedural content generation

Cite

@article{arxiv.2504.11442,
  title  = {TextArena},
  author = {Leon Guertler and Bobby Cheng and Simon Yu and Bo Liu and Leshem Choshen and Cheston Tan},
  journal= {arXiv preprint arXiv:2504.11442},
  year   = {2025}
}

Comments

Work in progress; 5 pages, 3 figures
