Related papers: Beyond Quantity: Trajectory Diversity Scaling for …

Trace2Skill: Distill Trajectory-Local Lessons into Transferable Agent Skills

Equipping Large Language Model (LLM) agents with domain-specific skills is critical for tackling complex tasks. Yet, manual authoring creates a severe scalability bottleneck. Conversely, automated skill generation often yields fragile or…

Artificial Intelligence · Computer Science 2026-04-28 Jingwei Ni, Yihao Liu, Xinpeng Liu, Yutao Sun, Mengyu Zhou, Pengyu Cheng, Dexin Wang, Erchao Zhao, Xiaoxi Jiang, Guanjun Jiang

AgentBank: Towards Generalized LLM Agents via Fine-Tuning on 50000+ Interaction Trajectories

Fine-tuning on agent-environment interaction trajectory data holds significant promise for surfacing generalized agent capabilities in open-source large language models (LLMs). In this work, we introduce AgentBank, by far the largest…

Computation and Language · Computer Science 2024-10-11 Yifan Song, Weimin Xiong, Xiutian Zhao, Dawei Zhu, Wenhao Wu, Ke Wang, Cheng Li, Wei Peng, Sujian Li

SynthTools: A Framework for Scaling Synthetic Tools for Agent Development

For agentic systems to use external tools to solve complex, long-horizon tasks, we need a large set of diverse and controllable tool-use environments. We introduce SynthTools, a fully LLM-based pipeline spanning the entire lifecycle:…

Artificial Intelligence · Computer Science 2026-05-28 Tommaso Castellani, Naimeng Ye, Daksh Mittal, Thomson Yen, Emmanouil Koukoumidis, William Zeng, Hongseok Namkoong

Thinking Longer, Not Larger: Enhancing Software Engineering Agents via Scaling Test-Time Compute

Recent advancements in software engineering agents have demonstrated promising capabilities in automating program improvements. However, their reliance on closed-source or resource-intensive models introduces significant deployment…

Software Engineering · Computer Science 2025-04-09 Yingwei Ma, Yongbin Li, Yihong Dong, Xue Jiang, Rongyu Cao, Jue Chen, Fei Huang, Binhua Li

TDAG: A Multi-Agent Framework based on Dynamic Task Decomposition and Agent Generation

The emergence of Large Language Models (LLMs) like ChatGPT has inspired the development of LLM-based agents capable of addressing complex, real-world tasks. However, these agents often struggle during task execution due to methodological…

Computation and Language · Computer Science 2025-01-22 Yaoxiang Wang, Zhiyong Wu, Junfeng Yao, Jinsong Su

Parallelism Meets Adaptiveness: Scalable Documents Understanding in Multi-Agent LLM Systems

Large language model (LLM) agents have shown increasing promise for collaborative task completion. However, existing multi-agent frameworks often rely on static workflows, fixed roles, and limited inter-agent communication, reducing their…

Multiagent Systems · Computer Science 2026-02-13 Chengxuan Xia, Qianye Wu, Sixuan Tian, Yilun Hao

Understanding Agent Scaling in LLM-Based Multi-Agent Systems via Diversity

LLM-based multi-agent systems (MAS) have emerged as a promising approach to tackle complex tasks that are difficult for individual LLMs. A natural strategy is to scale performance by increasing the number of agents; however, we find that…

Artificial Intelligence · Computer Science 2026-02-04 Yingxuan Yang, Chengrui Qu, Muning Wen, Laixi Shi, Ying Wen, Weinan Zhang, Adam Wierman, Shangding Gu

Optimizing Large Language Models for Dynamic Constraints through Human-in-the-Loop Discriminators

Large Language Models (LLMs) have recently demonstrated impressive capabilities across various real-world applications. However, due to the current text-in-text-out paradigm, it remains challenging for LLMs to handle dynamic and complex…

Artificial Intelligence · Computer Science 2024-10-25 Timothy Wei, Annabelle Miin, Anastasia Miin

A Survey on Test-Time Scaling in Large Language Models: What, How, Where, and How Well?

As enthusiasm for scaling computation (data and parameters) in the pretraining era gradually diminished, test-time scaling (TTS), also referred to as "test-time computing", has emerged as a prominent research focus. Recent studies…

Computation and Language · Computer Science 2025-05-06 Qiyuan Zhang, Fuyuan Lyu, Zexu Sun, Lei Wang, Weixu Zhang, Wenyue Hua, Haolun Wu, Zhihan Guo, Yufei Wang, Niklas Muennighoff, Irwin King, Xue Liu, Chen Ma

Rethinking Scale: Deployment Trade-offs of Small Language Models under Agent Paradigms

Despite the impressive capabilities of large language models, their substantial computational costs, latency, and privacy risks hinder their widespread deployment in real-world applications. Small Language Models (SLMs) with fewer than 10…

Computation and Language · Computer Science 2026-04-22 Xinlin Wang, Mats Brorsson

Explorer: Scaling Exploration-driven Web Trajectory Synthesis for Multimodal Web Agents

Recent success in large multimodal models (LMMs) has sparked promising applications of agents capable of autonomously completing complex web tasks. While open-source LMM agents have made significant advances in offline evaluation…

Artificial Intelligence · Computer Science 2025-06-02 Vardaan Pahuja, Yadong Lu, Corby Rosset, Boyu Gou, Arindam Mitra, Spencer Whitehead, Yu Su, Ahmed Awadallah

Multi-Dimensional Summarization Agents with Context-Aware Reasoning over Enterprise Tables

We propose a novel framework for summarizing structured enterprise data across multiple dimensions using large language model (LLM)-based agents. Traditional table-to-text models often lack the capacity to reason across hierarchical…

Artificial Intelligence · Computer Science 2025-08-12 Amit Dhanda

Trajectory First: A Curriculum for Discovering Diverse Policies

Being able to solve a task in diverse ways makes agents more robust to task variations and less prone to local optima. In this context, constrained diversity optimization has become a useful reinforcement learning (RL) framework for…

Machine Learning · Computer Science 2026-05-13 Cornelius V. Braun, Sayantan Auddy, Marc Toussaint

ScaleMCP: Dynamic and Auto-Synchronizing Model Context Protocol Tools for LLM Agents

Recent advancements in Large Language Models (LLMs) and the introduction of the Model Context Protocol (MCP) have significantly expanded LLM agents' capability to interact dynamically with external tools and APIs. However, existing tool…

Computation and Language · Computer Science 2025-05-13 Elias Lumer, Anmol Gulati, Vamse Kumar Subbiah, Pradeep Honaganahalli Basavaraju, James A. Burke

Increasing LLM Coding Capabilities through Diverse Synthetic Coding Tasks

Large language models (LLMs) have shown impressive promise in code generation, yet their progress remains limited by the shortage of large-scale datasets that are both diverse and well-aligned with human reasoning. Most existing resources…

Machine Learning · Computer Science 2025-10-28 Amal Abed, Ivan Lukic, Jörg K. H. Franke, Frank Hutter

On the Diversity of Synthetic Data and its Impact on Training Large Language Models

The rise of Large Language Models (LLMs) has accentuated the need for diverse, high-quality pre-training data. Synthetic data emerges as a viable solution to the challenges of data scarcity and inaccessibility. While previous literature has…

Computation and Language · Computer Science 2024-10-24 Hao Chen, Abdul Waheed, Xiang Li, Yidong Wang, Jindong Wang, Bhiksha Raj, Marah I. Abdin

Small LLMs Are Weak Tool Learners: A Multi-LLM Agent

Large Language Model (LLM) agents significantly extend the capabilities of standalone LLMs, empowering them to interact with external tools (e.g., APIs, functions) and complete various tasks in a self-directed fashion. The challenge of tool…

Artificial Intelligence · Computer Science 2024-02-19 Weizhou Shen, Chenliang Li, Hongzhan Chen, Ming Yan, Xiaojun Quan, Hehong Chen, Ji Zhang, Fei Huang

Agent Skill Acquisition for Large Language Models via CycleQD

Training large language models to acquire specific skills remains a challenging endeavor. Conventional training approaches often struggle with data distribution imbalances and inadequacies in objective functions that do not align well with…

Computation and Language · Computer Science 2025-02-18 So Kuroki, Taishi Nakamura, Takuya Akiba, Yujin Tang

DFlow: Diverse Dialogue Flow Simulation with Large Language Models

Developing language model-based dialogue agents requires effective data to train models that can follow specific task logic. However, most existing data simulation methods focus on increasing diversity in language, topics, or dialogue acts…

Computation and Language · Computer Science 2025-03-04 Wanyu Du, Song Feng, James Gung, Lijia Sun, Yi Zhang, Saab Mansour, Yanjun Qi

TsLLM: Augmenting LLMs for General Time Series Understanding and Prediction

Time series data is fundamental to decision-making across many domains including healthcare, finance, power systems, and logistics. However, analyzing this data correctly often requires incorporating unstructured contextual information,…

Machine Learning · Computer Science 2026-03-17 Felix Parker, Nimeesha Chan, Chi Zhang, Kimia Ghobadi