Related papers: SelectLLM: Query-Aware Efficient Selection Algorit…

OptLLM: Optimal Assignment of Queries to Large Language Models

Large Language Models (LLMs) have garnered considerable attention owing to their remarkable capabilities, leading to an increasing number of companies offering LLMs as services. Different LLMs achieve different performance at different…

Software Engineering · Computer Science 2024-05-27 Yueyue Liu, Hongyu Zhang, Yuantian Miao, Van-Hoang Le, Zhiqiang Li

Large Language Model Selection with Limited Annotations

Choosing a Large Language Model (LLM) for a given task requires comparing many strong candidates, yet standard evaluation relies on costly annotations over fixed evaluation sets. To address this challenge, we develop SELECT-LLM, the first…

Computation and Language · Computer Science 2026-05-26 Yavuz Durmazkeser, Patrik Okanovic, Andreas Kirsch, Torsten Hoefler, Nezihe Merve Gürel

Efficient Sequential Decision Making with Large Language Models

This paper focuses on extending the success of large language models (LLMs) to sequential decision making. Existing efforts either (i) re-train or finetune LLMs for decision making, or (ii) design prompts for pretrained LLMs. The former…

Machine Learning · Computer Science 2025-06-17 Dingyang Chen, Qi Zhang, Yinglun Zhu

PickLLM: Context-Aware RL-Assisted Large Language Model Routing

Recently, the number of off-the-shelf Large Language Models (LLMs) has exploded with many open-source options. This creates a diverse landscape regarding both serving options (e.g., inference on local hardware vs remote LLM APIs) and model…

Machine Learning · Computer Science 2024-12-18 Dimitrios Sikeridis, Dennis Ramdass, Pranay Pareek

ScaleLLM: A Resource-Frugal LLM Serving Framework by Optimizing End-to-End Efficiency

Large language models (LLMs) have surged in popularity and are extensively used in commercial applications, where the efficiency of model serving is crucial for the user experience. Most current research focuses on optimizing individual…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-09-12 Yuhang Yao, Han Jin, Alay Dilipbhai Shah, Shanshan Han, Zijian Hu, Yide Ran, Dimitris Stripelis, Zhaozhuo Xu, Salman Avestimehr, Chaoyang He

Uncertainty-Aware Answer Selection for Improved Reasoning in Multi-LLM Systems

Large Language Models (LLMs) have demonstrated exceptional capabilities, yet selecting the most reliable response from multiple LLMs remains a challenge, particularly in resource-constrained settings. Existing approaches often depend on…

Computation and Language · Computer Science 2025-10-06 Aakriti Agrawal, Rohith Aralikatti, Anirudh Satheesh, Souradip Chakraborty, Amrit Singh Bedi, Furong Huang

Efficient Strategy for Improving Large Language Model (LLM) Capabilities

Large Language Models (LLMs) have become a milestone in the field of artificial intelligence and natural language processing. However, their large-scale deployment remains constrained by the need for significant computational resources.…

Computation and Language · Computer Science 2025-08-07 Julián Camilo Velandia Gutiérrez

MetaLLM: A High-performant and Cost-efficient Dynamic Framework for Wrapping LLMs

The rapid progress in machine learning (ML) has brought forth many large language models (LLMs) that excel in various tasks and areas. These LLMs come with different abilities and costs in terms of computation or pricing. Since the demand…

Machine Learning · Computer Science 2025-04-23 Quang H. Nguyen, Thinh Dao, Duy C. Hoang, Juliette Decugis, Saurav Manchanda, Nitesh V. Chawla, Khoa D. Doan

AE-LLM: Adaptive Efficiency Optimization for Large Language Models

Large Language Models (LLMs) have achieved remarkable success across diverse applications, yet their deployment remains challenging due to substantial computational costs, memory requirements, and energy consumption. Recent empirical…

Machine Learning · Computer Science 2026-03-24 Kaito Tanaka, Masato Ito, Yuji Nishimura, Keisuke Matsuda, Aya Nakayama

Large Language Models Can Automatically Engineer Features for Few-Shot Tabular Learning

Large Language Models (LLMs), with their remarkable ability to tackle challenging and unseen reasoning problems, hold immense potential for tabular learning, which is vital for many real-world applications. In this paper, we propose a novel…

Machine Learning · Computer Science 2024-05-07 Sungwon Han, Jinsung Yoon, Sercan O Arik, Tomas Pfister

RouteLLM: Learning to Route LLMs with Preference Data

Large language models (LLMs) exhibit impressive capabilities across a wide range of tasks, yet the choice of which model to use often involves a trade-off between performance and cost. More powerful models, though effective, come with…

Machine Learning · Computer Science 2025-02-25 Isaac Ong, Amjad Almahairi, Vincent Wu, Wei-Lin Chiang, Tianhao Wu, Joseph E. Gonzalez, M Waleed Kadous, Ion Stoica

Leveraging Large Language Models for Collective Decision-Making

In various work contexts, such as meeting scheduling, collaborating, and project planning, collective decision-making is essential but often challenging due to diverse individual preferences, varying work focuses, and power dynamics among…

Computation and Language · Computer Science 2025-08-13 Marios Papachristou, Longqi Yang, Chin-Chia Hsu

The Case for Instance-Optimized LLMs in OLAP Databases

Large Language Models (LLMs) can enhance analytics systems with powerful data summarization, cleaning, and semantic transformation capabilities. However, deploying LLMs at scale -- processing millions to billions of rows -- remains…

Databases · Computer Science 2025-07-08 Bardia Mohammadi, Laurent Bindschaedler

JudgeLM: Fine-tuned Large Language Models are Scalable Judges

Evaluating Large Language Models (LLMs) in open-ended scenarios is challenging because existing benchmarks and metrics cannot measure them comprehensively. To address this problem, we propose to fine-tune LLMs as scalable judges (JudgeLM)…

Computation and Language · Computer Science 2025-03-04 Lianghui Zhu, Xinggang Wang, Xinlong Wang

Achieving Peak Performance for Large Language Models: A Systematic Review

In recent years, large language models (LLMs) have achieved remarkable success in natural language processing (NLP). LLMs require an extremely large number of parameters to attain high performance. As models grow into the trillion-parameter range,…

Computation and Language · Computer Science 2024-09-10 Zhyar Rzgar K Rostam, Sándor Szénási, Gábor Kertész

DeLLMa: Decision Making Under Uncertainty with Large Language Models

The potential of large language models (LLMs) as decision support tools is increasingly being explored in fields such as business, engineering, and medicine, which often face challenging tasks of decision-making under uncertainty. In this…

Artificial Intelligence · Computer Science 2024-10-14 Ollie Liu, Deqing Fu, Dani Yogatama, Willie Neiswanger

xLLM Technical Report

We introduce xLLM, an intelligent and efficient Large Language Model (LLM) inference framework designed for high-performance, large-scale enterprise-grade serving, with deep optimizations for diverse AI accelerators. To address these…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-03-04 Tongxuan Liu, Tao Peng, Peijun Yang, Xiaoyang Zhao, Xiusheng Lu, Weizhe Huang, Zirui Liu, Xiaoyu Chen, Zhiwei Liang, Jun Xiong, Donghe Jin, Minchao Zhang, Jinrong Guo, Yingxu Deng, Xu Zhang, Xianzhe Dong, Siqi Wang, Siyu Wu, Yu Wu, Zihan Tang, Yuting Zeng, Yanshu Wang, Jinguang Liu, Meng Kang, Menxin Li, Yunlong Wang, Yiming Liu, Xiaolong Ma, Yifan Wang, Yichen Zhang, Jinrun Yin, Keyang Zheng, Jiawei Yin, Jun Zhang, Ziyue Wang, Xiaobo Lin, Liangyu Liu, Liwei Lan, Yang Liu, Chunhua Peng, Han Liu, Songcheng Ren, Xuezhu Wang, Yunheng Shen, Yi Wang, Guyue Liu, Yitao Hu, Hui Chen, Tong Yang, Hailong Yang, Jing Li, Guiguang Ding, Ke Zhang

Enabling Flexible Multi-LLM Integration for Scalable Knowledge Aggregation

Large language models (LLMs) have shown remarkable promise but remain challenging to continually improve through traditional finetuning, particularly when integrating capabilities from other specialized LLMs. Popular methods like ensemble…

Computation and Language · Computer Science 2025-06-02 Zhenglun Kong, Zheng Zhan, Shiyue Hou, Yifan Gong, Xin Meng, Pengwei Sui, Peiyan Dong, Xuan Shen, Zifeng Wang, Pu Zhao, Hao Tang, Stratis Ioannidis, Yanzhi Wang

A Comprehensive Survey of Small Language Models in the Era of Large Language Models: Techniques, Enhancements, Applications, Collaboration with LLMs, and Trustworthiness

Large language models (LLMs) have demonstrated emergent abilities in text generation, question answering, and reasoning, facilitating various tasks and domains. Despite their proficiency in various tasks, LLMs like PaLM 540B and Llama-3.1…

Computation and Language · Computer Science 2024-12-31 Fali Wang, Zhiwei Zhang, Xianren Zhang, Zongyu Wu, Tzuhao Mo, Qiuhao Lu, Wanjing Wang, Rui Li, Junjie Xu, Xianfeng Tang, Qi He, Yao Ma, Ming Huang, Suhang Wang

AdaptiveLLM: A Framework for Selecting Optimal Cost-Efficient LLM for Code-Generation Based on CoT Length

While Large Language Models (LLMs) have significantly advanced code generation efficiency, they face inherent challenges in balancing performance and inference costs across diverse programming tasks. Dynamically selecting the optimal LLM…

Software Engineering · Computer Science 2025-06-13 Junhang Cheng, Fang Liu, Chengru Wu, Li Zhang