Multiagent Systems · Computer Science
Arena: A General Evaluation Platform and Building Toolkit for Multi-Agent Intelligence
Yuhang Song, Andrzej Wojcicki, Thomas Lukasiewicz, Jianyi Wang +5
2019-12-02
Artificial Intelligence · Computer Science
WebArena: A Realistic Web Environment for Building Autonomous Agents
Shuyan Zhou, Frank F. Xu, Hao Zhu, Xuhui Zhou +8
2024-04-17
Robotics · Computer Science
RoboArena: Distributed Real-World Evaluation of Generalist Robot Policies
Pranav Atreya, Karl Pertsch, Tony Lee, Moo Jin Kim +28
2025-12-02
Robotics · Computer Science
GRAPPA: Generalizing and Adapting Robot Policies via Online Agentic Guidance
Arthur Bucker, Pablo Ortega-Kral, Jonathan Francis, Jean Oh
2025-04-09
Machine Learning · Computer Science
Arena: a toolkit for Multi-Agent Reinforcement Learning
Qing Wang, Jiechao Xiong, Lei Han, Meng Fang +4
2019-07-24
Cryptography and Security · Computer Science
DoomArena: A framework for Testing AI Agents Against Evolving Security Threats
Leo Boisvert, Mihir Bansal, Chandra Kiran Reddy Evuru, Gabriel Huang +8
2025-10-08
Computation and Language · Computer Science
DR-Arena: an Automated Evaluation Framework for Deep Research Agents
Yiwen Gao, Ruochen Zhao, Yang Deng, Wenxuan Zhang
2026-01-16
Computation and Language · Computer Science
AgentKernelArena: Generalization-Aware Benchmarking of GPU Kernel Optimization Agents
Sharareh Younesian, Wenwen Ouyang, Sina Rafati, Mehdi Rezagholizadeh +10
2026-05-19
Robotics · Computer Science
RobotArena ∞: Scalable Robot Benchmarking via Real-to-Sim Translation
Yash Jangir, Yidi Zhang, Pang-Chi Lo, Kashu Yamazaki +6
2026-03-23
Robotics · Computer Science
ManipArena: Comprehensive Real-world Evaluation of Reasoning-Oriented Generalist Robot Manipulation
Yu Sun, Meng Cao, Ping Yang, Rongtao Xu +14
2026-03-31
Robotics · Computer Science
Arena-Web -- A Web-based Development and Benchmarking Platform for Autonomous Navigation Approaches
Linh Kästner, Reyk Carstens, Christopher Liebig, Volodymyr Shcherbyna +3
2023-02-07
Artificial Intelligence · Computer Science
PaperArena: An Evaluation Benchmark for Tool-Augmented Agentic Reasoning on Scientific Literature
Daoyu Wang, Mingyue Cheng, Shuo Yu, Zirui Liu +3
2026-02-02
Machine Learning · Computer Science
Unity: A General Platform for Intelligent Agents
Arthur Juliani, Vincent-Pierre Berges, Ervin Teng, Andrew Cohen +7
2020-05-07
Artificial Intelligence · Computer Science
FieldWorkArena: Agentic AI Benchmark for Real Field Work Tasks
Jun Takahashi, Atsunori Moteki, Akiyoshi Uchida, Shoichi Masui +10
2026-04-16
Machine Learning · Computer Science
ClawArena: Benchmarking AI Agents in Evolving Information Environments
Haonian Ji, Kaiwen Xiong, Siwei Han, Peng Xia +8
2026-05-19
Robotics · Computer Science
Ark: An Open-source Python-based Framework for Robot Learning
Magnus Dierking, Christopher E. Mower, Sarthak Das, Huang Helong +9
2025-07-15
Artificial Intelligence · Computer Science
Cognitive Kernel-Pro: A Framework for Deep Research Agents and Agent Foundation Models Training
Tianqing Fang, Zhisong Zhang, Xiaoyang Wang, Rui Wang +15
2026-04-23
Robotics · Computer Science
Arena-Rosnav 2.0: A Development and Benchmarking Platform for Robot Navigation in Highly Dynamic Environments
Linh Kästner, Reyk Carstens, Huajian Zeng, Jacek Kmiecik +4
2023-08-01
Computation and Language · Computer Science
Auto-Arena: Automating LLM Evaluations with Agent Peer Battles and Committee Discussions
Ruochen Zhao, Wenxuan Zhang, Yew Ken Chia, Weiwen Xu +2
2024-10-08
Software Engineering · Computer Science
ProSoftArena: Benchmarking Hierarchical Capabilities of Multimodal Agents in Professional Software Environments
Jiaxin Ai, Yukang Feng, Fanrui Zhang, Jianwen Sun +7
2026-01-07
Networking and Internet Architecture · Computer Science
NetArena: Dynamic Benchmarks for AI Agents in Network Automation
Yajie Zhou, Jiajun Ruan, Eric S. Wang, Sadjad Fouladi +3
2026-03-17