CrowdAgent: Multi-Agent Managed Multi-Source Annotation System

Artificial Intelligence 2025-09-18 v1

Abstract

High-quality annotated data is a cornerstone of modern Natural Language Processing (NLP). While recent methods have begun to leverage diverse annotation sources, including Large Language Models (LLMs), Small Language Models (SLMs), and human experts, they often focus narrowly on the labeling step itself. A critical gap remains in the holistic process control required to manage these sources dynamically, addressing complex scheduling and quality-cost trade-offs in a unified manner. Inspired by real-world crowdsourcing companies, we introduce CrowdAgent, a multi-agent system that provides end-to-end process control by integrating task assignment, data annotation, and quality/cost management. It implements a novel methodology that rationally assigns tasks, enabling LLMs, SLMs, and human experts to advance synergistically in a collaborative annotation workflow. We demonstrate the effectiveness of CrowdAgent through extensive experiments on six diverse multimodal classification tasks. The source code and video demo are available at https://github.com/QMMMS/CrowdAgent.

Cite

@article{arxiv.2509.14030,
  title  = {CrowdAgent: Multi-Agent Managed Multi-Source Annotation System},
  author = {Maosheng Qin and Renyu Zhu and Mingxuan Xia and Chenkai Chen and Zhen Zhu and Minmin Lin and Junbo Zhao and Lu Xu and Changjie Fan and Runze Wu and Haobo Wang},
  journal = {arXiv preprint arXiv:2509.14030},
  year   = {2025}
}