Related papers: OptLLM: Optimal Assignment of Queries to Large Lan…

Solving General Natural-Language-Description Optimization Problems with Large Language Models

Optimization problems seek to find the best solution to an objective under a set of constraints, and have been widely investigated in real-world applications. Modeling and solving optimization problems in a specific domain typically require…

Optimization and Control · Mathematics 2024-07-12 Jihai Zhang, Wei Wang, Siyan Guo, Li Wang, Fangquan Lin, Cheng Yang, Wotao Yin

SelectLLM: Query-Aware Efficient Selection Algorithm for Large Language Models

Large language models (LLMs) have been widely adopted due to their remarkable performance across various applications, driving the accelerated development of a large number of diverse models. However, these individual LLMs show limitations…

Computation and Language · Computer Science 2025-06-13 Kaushal Kumar Maurya, KV Aditya Srivatsa, Ekaterina Kochmar

MetaLLM: A High-performant and Cost-efficient Dynamic Framework for Wrapping LLMs

The rapid progress in machine learning (ML) has brought forth many large language models (LLMs) that excel in various tasks and areas. These LLMs come with different abilities and costs in terms of computation or pricing. Since the demand…

Machine Learning · Computer Science 2025-04-23 Quang H. Nguyen, Thinh Dao, Duy C. Hoang, Juliette Decugis, Saurav Manchanda, Nitesh V. Chawla, Khoa D. Doan

PickLLM: Context-Aware RL-Assisted Large Language Model Routing

Recently, the number of off-the-shelf Large Language Models (LLMs) has exploded with many open-source options. This creates a diverse landscape regarding both serving options (e.g., inference on local hardware vs remote LLM APIs) and model…

Machine Learning · Computer Science 2024-12-18 Dimitrios Sikeridis, Dennis Ramdass, Pranay Pareek

ThriftLLM: On Cost-Effective Selection of Large Language Models for Classification Queries

In recent years, large language models (LLMs) have demonstrated remarkable capabilities in comprehending and generating natural language content, attracting widespread attention in both industry and academia. An increasing number of…

Databases · Computer Science 2026-01-08 Keke Huang, Yimin Shi, Dujian Ding, Yifei Li, Yang Fei, Laks Lakshmanan, Xiaokui Xiao

AdaptiveLLM: A Framework for Selecting Optimal Cost-Efficient LLM for Code-Generation Based on CoT Length

While Large Language Models (LLMs) have significantly advanced code generation efficiency, they face inherent challenges in balancing performance and inference costs across diverse programming tasks. Dynamically selecting the optimal LLM…

Software Engineering · Computer Science 2025-06-13 Junhang Cheng, Fang Liu, Chengru Wu, Li Zhang

A Query Optimization Method Utilizing Large Language Models

Query optimization is a critical task in database systems, focused on determining the most efficient way to execute a query from an enormous set of possible strategies. Traditional approaches rely on heuristic search methods and cost…

Databases · Computer Science 2025-03-11 Zhiming Yao, Haoyang Li, Jing Zhang, Cuiping Li, Hong Chen

OmniRouter: Budget and Performance Controllable Multi-LLM Routing

Large language models (LLMs) deliver superior performance but require substantial computational resources and operate with relatively low efficiency, while smaller models can efficiently handle simpler tasks with fewer resources. LLM…

Databases · Computer Science 2025-12-01 Kai Mei, Wujiang Xu, Minghao Guo, Shuhang Lin, Yongfeng Zhang

AE-LLM: Adaptive Efficiency Optimization for Large Language Models

Large Language Models (LLMs) have achieved remarkable success across diverse applications, yet their deployment remains challenging due to substantial computational costs, memory requirements, and energy consumption. Recent empirical…

Machine Learning · Computer Science 2026-03-24 Kaito Tanaka, Masato Ito, Yuji Nishimura, Keisuke Matsuda, Aya Nakayama

The Case for Instance-Optimized LLMs in OLAP Databases

Large Language Models (LLMs) can enhance analytics systems with powerful data summarization, cleaning, and semantic transformation capabilities. However, deploying LLMs at scale -- processing millions to billions of rows -- remains…

Databases · Computer Science 2025-07-08 Bardia Mohammadi, Laurent Bindschaedler

ScaleLLM: A Resource-Frugal LLM Serving Framework by Optimizing End-to-End Efficiency

Large language models (LLMs) have surged in popularity and are extensively used in commercial applications, where the efficiency of model serving is crucial for the user experience. Most current research focuses on optimizing individual…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-09-12 Yuhang Yao, Han Jin, Alay Dilipbhai Shah, Shanshan Han, Zijian Hu, Yide Ran, Dimitris Stripelis, Zhaozhuo Xu, Salman Avestimehr, Chaoyang He

OPT-BENCH: Evaluating LLM Agent on Large-Scale Search Spaces Optimization Problems

Large Language Models (LLMs) have shown remarkable capabilities in solving diverse tasks. However, their proficiency in iteratively optimizing complex solutions through learning from previous feedback remains insufficiently explored. To…

Artificial Intelligence · Computer Science 2025-06-13 Xiaozhe Li, Jixuan Chen, Xinyu Fang, Shengyuan Ding, Haodong Duan, Qingwen Liu, Kai Chen

BEST-Route: Adaptive LLM Routing with Test-Time Optimal Compute

Large language models (LLMs) are powerful tools but are often expensive to deploy at scale. LLM query routing mitigates this by dynamically assigning queries to models of varying cost and quality to obtain a desired trade-off. Prior query…

Machine Learning · Computer Science 2025-07-01 Dujian Ding, Ankur Mallick, Shaokun Zhang, Chi Wang, Daniel Madrigal, Mirian Del Carmen Hipolito Garcia, Menglin Xia, Laks V. S. Lakshmanan, Qingyun Wu, Victor Rühle

A Survey on Inference Engines for Large Language Models: Perspectives on Optimization and Efficiency

Large language models (LLMs) are widely applied in chatbots, code generators, and search engines. Workloads such as chain-of-thought, complex reasoning, and agent services significantly increase the inference cost by invoking the model…

Computation and Language · Computer Science 2025-11-27 Sihyeong Park, Sungryeol Jeon, Chaelyn Lee, Seokhun Jeon, Byung-Soo Kim, Jemin Lee

Large Language Model Selection with Limited Annotations

Choosing a Large Language Model (LLM) for a given task requires comparing many strong candidates, yet standard evaluation relies on costly annotations over fixed evaluation sets. To address this challenge, we develop SELECT-LLM, the first…

Computation and Language · Computer Science 2026-05-26 Yavuz Durmazkeser, Patrik Okanovic, Andreas Kirsch, Torsten Hoefler, Nezihe Merve Gürel

New Solutions on LLM Acceleration, Optimization, and Application

Large Language Models (LLMs) have become extremely potent instruments with exceptional capacities for comprehending and producing human-like text in a wide range of applications. However, the increasing size and complexity of LLMs present…

Machine Learning · Computer Science 2024-06-18 Yingbing Huang, Lily Jiaxin Wan, Hanchen Ye, Manvi Jha, Jinghua Wang, Yuhong Li, Xiaofan Zhang, Deming Chen

Optimal Decision Making Through Scenario Simulations Using Large Language Models

The rapid evolution of Large Language Models (LLMs) has markedly expanded their application across diverse domains, transforming how complex problems are approached and solved. Initially conceived to predict subsequent words in texts, these…

Artificial Intelligence · Computer Science 2024-07-11 Sumedh Rasal, E. J. Hauer

Adaptive Budget Allocation in LLM-Augmented Surveys

Large language models (LLMs) can generate survey responses at low cost, but their reliability varies substantially across questions and is unknown before data collection. Deploying LLMs in surveys still requires costly human responses for…

Machine Learning · Computer Science 2026-04-15 Zikun Ye, Jiameng Lyu, Rui Tao

OWL: A Large Language Model for IT Operations

With the rapid development of IT operations, it has become increasingly crucial to efficiently manage and analyze large volumes of data for practical applications. The techniques of Natural Language Processing (NLP) have shown remarkable…

Computation and Language · Computer Science 2024-09-30 Hongcheng Guo, Jian Yang, Jiaheng Liu, Liqun Yang, Linzheng Chai, Jiaqi Bai, Junran Peng, Xiaorong Hu, Chao Chen, Dongfeng Zhang, Xu Shi, Tieqiao Zheng, Liangfan Zheng, Bo Zhang, Ke Xu, Zhoujun Li

Large Language Models for Supply Chain Optimization

Supply chain operations traditionally involve a variety of complex decision making problems. Over the last few decades, supply chains greatly benefited from advances in computation, which allowed the transition from manual processing to…

Artificial Intelligence · Computer Science 2023-07-14 Beibin Li, Konstantina Mellou, Bo Zhang, Jeevan Pathuri, Ishai Menache