Related papers: Optimizing Model Selection for Compound AI Systems

Mixture of Complementary Agents for Robust LLM Ensemble

Multi-AI collaboration, such as ensembling or debating large language models (LLMs), is a promising paradigm for aggregating information and boosting performance. A foundational step in these pipelines is to feed the responses of several…

Machine Learning · Computer Science 2026-05-26 Yichi Zhang, Kevin Lu, Yuang Zhang, Jie Gao, Lirong Xia, Fang-Yi Yu

Compound AI Systems Optimization: A Survey of Methods, Challenges, and Future Directions

Recent advancements in large language models (LLMs) and AI systems have led to a paradigm shift in the design and optimization of complex AI workflows. By integrating multiple components, compound AI systems have become increasingly adept…

Computation and Language · Computer Science 2025-10-08 Yu-Ang Lee, Guan-Ting Yi, Mei-Yi Liu, Jui-Chao Lu, Guan-Bo Yang, Yun-Nung Chen

Uncertainty-Aware Answer Selection for Improved Reasoning in Multi-LLM Systems

Large Language Models (LLMs) have demonstrated exceptional capabilities, yet selecting the most reliable response from multiple LLMs remains a challenge, particularly in resource-constrained settings. Existing approaches often depend on…

Computation and Language · Computer Science 2025-10-06 Aakriti Agrawal, Rohith Aralikatti, Anirudh Satheesh, Souradip Chakraborty, Amrit Singh Bedi, Furong Huang

Are More LLM Calls All You Need? Towards Scaling Laws of Compound Inference Systems

Many recent state-of-the-art results in language tasks were achieved using compound systems that perform multiple Language Model (LM) calls and aggregate their responses. However, there is little understanding of how the number of LM calls…

Machine Learning · Computer Science 2024-06-06 Lingjiao Chen, Jared Quincy Davis, Boris Hanin, Peter Bailis, Ion Stoica, Matei Zaharia, James Zou

SelectLLM: Query-Aware Efficient Selection Algorithm for Large Language Models

Large language models (LLMs) have been widely adopted due to their remarkable performance across various applications, driving the accelerated development of a large number of diverse models. However, these individual LLMs show limitations…

Computation and Language · Computer Science 2025-06-13 Kaushal Kumar Maurya, KV Aditya Srivatsa, Ekaterina Kochmar

Don't Always Pick the Highest-Performing Model: An Information Theoretic View of LLM Ensemble Selection

Large language models (LLMs) are often ensembled together to improve overall reliability and robustness, but in practice models are strongly correlated. This raises a fundamental question: which models should be selected when forming an LLM…

Machine Learning · Computer Science 2026-02-10 Yigit Turkmen, Baturalp Buyukates, Melih Bastopcu

Cost-Aware Model Orchestration for LLM-based Systems

As modern artificial intelligence (AI) systems become more advanced and capable, they can leverage a wide range of tools and models to perform complex tasks. The task of orchestrating these models is increasingly performed by Large Language…

Artificial Intelligence · Computer Science 2026-04-20 Daria Smirnova, Hamid Nasiri, Marta Adamska, Zhengxin Yu, Peter Garraghan

Diversity of Thought Elicits Stronger Reasoning Capabilities in Multi-Agent Debate Frameworks

Large language models (LLMs) excel in natural language generation but often confidently produce incorrect responses, especially in tasks like mathematical reasoning. Chain-of-thought prompting, self-verification, and multi-agent debate are…

Computation and Language · Computer Science 2026-03-30 Mahmood Hegazy

Robust Planning with Compound LLM Architectures: An LLM-Modulo Approach

Previous work has attempted to boost Large Language Model (LLM) performance on planning and scheduling tasks through a variety of prompt engineering techniques. While these methods can work within the distributions tested, they are neither…

Computation and Language · Computer Science 2024-11-25 Atharva Gundawar, Karthik Valmeekam, Mudit Verma, Subbarao Kambhampati

Grammar Search for Multi-Agent Systems

Automatic search for Multi-Agent Systems has recently emerged as a key focus in agentic AI research. Several prior approaches have relied on LLM-based free-form search over the code space. In this work, we propose a more structured…

Artificial Intelligence · Computer Science 2025-12-17 Mayank Singh, Vikas Yadav, Shiva Krishna Reddy Malay, Shravan Nayak, Sai Rajeswar, Sathwik Tejaswi Madhusudhan, Eduardo Blanco

Leveraging LLMs as Meta-Judges: A Multi-Agent Framework for Evaluating LLM Judgments

Large language models (LLMs) are being widely applied across various fields, but as tasks become more complex, evaluating their responses is increasingly challenging. Compared to human evaluators, the use of LLMs to support performance…

Artificial Intelligence · Computer Science 2025-04-25 Yuran Li, Jama Hussein Mohamud, Chongren Sun, Di Wu, Benoit Boulet

TPS-Bench: Evaluating AI Agents' Tool Planning & Scheduling Abilities in Compounding Tasks

Large language model (LLM) agents have exhibited strong problem-solving competence across domains like research and coding. Yet, it remains underexplored whether LLM agents can tackle compounding real-world problems that require a diverse…

Artificial Intelligence · Computer Science 2025-11-04 Hanwen Xu, Xuyao Huang, Yuzhe Liu, Kai Yu, Zhijie Deng

Leveraging Large Language Models for Collective Decision-Making

In various work contexts, such as meeting scheduling, collaborating, and project planning, collective decision-making is essential but often challenging due to diverse individual preferences, varying work focuses, and power dynamics among…

Computation and Language · Computer Science 2025-08-13 Marios Papachristou, Longqi Yang, Chin-Chia Hsu

A Blueprint Architecture of Compound AI Systems for Enterprise

Large Language Models (LLMs) have showcased remarkable capabilities surpassing conventional NLP challenges, creating opportunities for use in production use cases. Towards this goal, there is a notable shift to building compound AI systems,…

Databases · Computer Science 2024-06-04 Eser Kandogan, Sajjadur Rahman, Nikita Bhutani, Dan Zhang, Rafael Li Chen, Kushan Mitra, Sairam Gurajada, Pouya Pezeshkpour, Hayate Iso, Yanlin Feng, Hannah Kim, Chen Shen, Jin Wang, Estevam Hruschka

Enhancing LLM Code Generation with Ensembles: A Similarity-Based Selection Approach

Ensemble learning has been widely used in machine learning to improve model robustness, accuracy, and generalization, but has not yet been applied to code generation tasks with large language models (LLMs). We propose an ensemble approach…

Software Engineering · Computer Science 2025-07-22 Tarek Mahmud, Bin Duan, Corina Pasareanu, Guowei Yang

LLM-TOPLA: Efficient LLM Ensemble by Maximising Diversity

Combining large language models during training or at inference time has shown substantial performance gain over component LLMs. This paper presents LLM-TOPLA, a diversity-optimized LLM ensemble method with three unique properties: (i) We…

Computation and Language · Computer Science 2024-10-08 Selim Furkan Tekin, Fatih Ilhan, Tiansheng Huang, Sihao Hu, Ling Liu

Improving Cooperation in Collaborative Embodied AI

The integration of Large Language Models (LLMs) into multiagent systems has opened new possibilities for collaborative reasoning and cooperation with AI agents. This paper explores different prompting methods and evaluates their…

Artificial Intelligence · Computer Science 2025-10-06 Hima Jacob Leven Suprabha, Laxmi Nag Laxminarayan Nagesh, Ajith Nair, Alvin Reuben Amal Selvaster, Ayan Khan, Raghuram Damarla, Sanju Hannah Samuel, Sreenithi Saravana Perumal, Titouan Puech, Venkataramireddy Marella, Vishal Sonar, Alessandro Suglia, Oliver Lemon

The Wisdom of Deliberating AI Crowds: Does Deliberation Improve LLM-Based Forecasting?

Structured deliberation has been found to improve the performance of human forecasters. This study investigates whether a similar intervention, i.e. allowing LLMs to review each other's forecasts before updating, can improve accuracy in…

Artificial Intelligence · Computer Science 2025-12-30 Paul Schneider, Amalie Schramm

Efficient Leave-one-out Approximation in LLM Multi-agent Debate Based on Introspection

Multi-agent systems based on large language models (LLMs) advance automatic task completion in various fields, where debate is a common cooperation form for agents to solve complicated problems with reasoning and cross-review to solidify…

Multiagent Systems · Computer Science 2025-05-29 Yue Cui, Liuyi Yao, Zitao Li, Yaliang Li, Bolin Ding, Xiaofang Zhou

A Query Optimization Method Utilizing Large Language Models

Query optimization is a critical task in database systems, focused on determining the most efficient way to execute a query from an enormous set of possible strategies. Traditional approaches rely on heuristic search methods and cost…

Databases · Computer Science 2025-03-11 Zhiming Yao, Haoyang Li, Jing Zhang, Cuiping Li, Hong Chen