Related papers: Confidence-Driven Multi-Scale Model Selection for …

Confidence in the Reasoning of Large Language Models

There is a growing literature on reasoning by large language models (LLMs), but the discussion on the uncertainty in their responses is still lacking. Our aim is to assess the extent of confidence that LLMs have in their answers and how it…

Computation and Language · Computer Science 2024-12-23 Yudi Pawitan, Chris Holmes

Can Unconfident LLM Annotations Be Used for Confident Conclusions?

Large language models (LLMs) have shown high agreement with human raters across a variety of tasks, demonstrating potential to ease the challenges of human data collection. In computational social science (CSS), researchers are increasingly…

Computation and Language · Computer Science 2025-02-11 Kristina Gligorić, Tijana Zrnic, Cinoo Lee, Emmanuel J. Candès, Dan Jurafsky

Decision-Making with Lightweight Confidence-Aware Language Model for Autonomous Driving

Large Language Models (LLMs) and Multimodal LLMs (MLLMs) have demonstrated immense potential in autonomous driving (AD) by offering human-like reasoning and open-world generalization. However, the excessive computational overhead and high…

Robotics · Computer Science 2026-05-26 Ruoyu Yao, Ruiguo Zhong, Pei Liu, Mingxing Peng, Rui Yang, Jun Ma

A Survey on Inference Engines for Large Language Models: Perspectives on Optimization and Efficiency

Large language models (LLMs) are widely applied in chatbots, code generators, and search engines. Workloads such as chain-of-thought, complex reasoning, and agent services significantly increase the inference cost by invoking the model…

Computation and Language · Computer Science 2025-11-27 Sihyeong Park, Sungryeol Jeon, Chaelyn Lee, Seokhun Jeon, Byung-Soo Kim, Jemin Lee

FrugalGPT: How to Use Large Language Models While Reducing Cost and Improving Performance

There is a rapidly growing number of large language models (LLMs) that users can query for a fee. We review the cost associated with querying popular LLM APIs, e.g. GPT-4, ChatGPT, J1-Jumbo, and find that these models have heterogeneous…

Machine Learning · Computer Science 2023-05-10 Lingjiao Chen, Matei Zaharia, James Zou

SelectLLM: Query-Aware Efficient Selection Algorithm for Large Language Models

Large language models (LLMs) have been widely adopted due to their remarkable performance across various applications, driving the accelerated development of a large number of diverse models. However, these individual LLMs show limitations…

Computation and Language · Computer Science 2025-06-13 Kaushal Kumar Maurya, KV Aditya Srivatsa, Ekaterina Kochmar

Efficient Hybrid Inference for LLMs: Reward-Based Token Modelling with Selective Cloud Assistance

Large language models (LLMs) are known for their exceptional performance across a range of natural language processing tasks, but their deployment comes at a high computational and financial cost. On the other hand, smaller language models…

Computation and Language · Computer Science 2024-09-24 Adarsh MS, Jithin VG, Ditto PS

A Survey on Efficient Inference for Large Language Models

Large Language Models (LLMs) have attracted extensive attention due to their remarkable performance across various tasks. However, the substantial computational and memory requirements of LLM inference pose challenges for deployment in…

Computation and Language · Computer Science 2024-07-22 Zixuan Zhou, Xuefei Ning, Ke Hong, Tianyu Fu, Jiaming Xu, Shiyao Li, Yuming Lou, Luning Wang, Zhihang Yuan, Xiuhong Li, Shengen Yan, Guohao Dai, Xiao-Ping Zhang, Yuhan Dong, Yu Wang

Hybrid LLM: Cost-Efficient and Quality-Aware Query Routing

Large language models (LLMs) excel in most NLP tasks but also require expensive cloud servers for deployment due to their size, while smaller models that can be deployed on lower cost (e.g., edge) devices, tend to lag behind in terms of…

Machine Learning · Computer Science 2024-04-24 Dujian Ding, Ankur Mallick, Chi Wang, Robert Sim, Subhabrata Mukherjee, Victor Ruhle, Laks V. S. Lakshmanan, Ahmed Hassan Awadallah

Method-Based Reasoning for Large Language Models: Extraction, Reuse, and Continuous Improvement

Large language models (LLMs) have shown impressive capabilities across a wide range of language tasks. However, their reasoning process is primarily guided by statistical patterns in training data, which limits their ability to handle novel…

Artificial Intelligence · Computer Science 2025-08-21 Hong Su

Fact-Checking with Large Language Models via Probabilistic Certainty and Consistency

Large language models (LLMs) are increasingly used in applications requiring factual accuracy, yet their outputs often contain hallucinated responses. While fact-checking can mitigate these errors, existing methods typically retrieve…

Computation and Language · Computer Science 2026-01-07 Haoran Wang, Maryam Khalid, Qiong Wu, Jian Gao, Cheng Cao

Towards Sustainable NLP: Insights from Benchmarking Inference Energy in Large Language Models

Large language models (LLMs) are increasingly recognized for their exceptional generative capabilities and versatility across various tasks. However, the high inference costs associated with these models have not received adequate…

Computation and Language · Computer Science 2025-03-18 Soham Poddar, Paramita Koley, Janardan Misra, Sanjay Podder, Niloy Ganguly, Saptarshi Ghosh

Uncertainty-Aware Answer Selection for Improved Reasoning in Multi-LLM Systems

Large Language Models (LLMs) have demonstrated exceptional capabilities, yet selecting the most reliable response from multiple LLMs remains a challenge, particularly in resource-constrained settings. Existing approaches often depend on…

Computation and Language · Computer Science 2025-10-06 Aakriti Agrawal, Rohith Aralikatti, Anirudh Satheesh, Souradip Chakraborty, Amrit Singh Bedi, Furong Huang

LLM4FS: Leveraging Large Language Models for Feature Selection

Recent advances in large language models (LLMs) have provided new opportunities for decision-making, particularly in the task of automated feature selection. In this paper, we first comprehensively evaluate LLM-based feature selection…

Machine Learning · Computer Science 2025-12-12 Jianhao Li, Xianchao Xiu

Task Scheduling for Efficient Inference of Large Language Models on Single Moderate GPU Systems

Large language models (LLMs) are known for their high demand on computing resources and memory due to their substantial model size, which leads to inefficient inference on moderate GPU systems. Techniques like quantization or pruning can…

Computational Engineering, Finance, and Science · Computer Science 2024-11-26 Wenxiang Lin, Xinglin Pan, Shaohuai Shi, Xuan Wang, Xiaowen Chu

Learning to Trust the Crowd: A Multi-Model Consensus Reasoning Engine for Large Language Models

Large language models (LLMs) achieve strong average performance yet remain unreliable at the instance level, with frequent hallucinations, brittle failures, and poorly calibrated confidence. We study reliability through the lens of…

Artificial Intelligence · Computer Science 2026-01-13 Pranav Kallem

A Survey of Confidence Estimation and Calibration in Large Language Models

Large language models (LLMs) have demonstrated remarkable capabilities across a wide range of tasks in various domains. Despite their impressive performance, they can be unreliable due to factual errors in their generations. Assessing their…

Computation and Language · Computer Science 2024-03-26 Jiahui Geng, Fengyu Cai, Yuxia Wang, Heinz Koeppl, Preslav Nakov, Iryna Gurevych

MetaLLM: A High-performant and Cost-efficient Dynamic Framework for Wrapping LLMs

The rapid progress in machine learning (ML) has brought forth many large language models (LLMs) that excel in various tasks and areas. These LLMs come with different abilities and costs in terms of computation or pricing. Since the demand…

Machine Learning · Computer Science 2025-04-23 Quang H. Nguyen, Thinh Dao, Duy C. Hoang, Juliette Decugis, Saurav Manchanda, Nitesh V. Chawla, Khoa D. Doan

Leveraging Large Language Models for Predicting Cost and Duration in Software Engineering Projects

Accurate estimation of project costs and durations remains a pivotal challenge in software engineering, directly impacting budgeting and resource management. Traditional estimation techniques, although widely utilized, often fall short due…

Software Engineering · Computer Science 2024-09-17 Justin Carpenter, Chia-Ying Wu, Nasir U. Eisty

Towards Efficient Multi-LLM Inference: Characterization and Analysis of LLM Routing and Hierarchical Techniques

Recent progress in Language Models (LMs) has dramatically advanced the field of natural language processing (NLP), excelling at tasks like text generation, summarization, and question answering. However, their inference remains…

Machine Learning · Computer Science 2025-06-10 Adarsh Prasad Behera, Jaya Prakash Champati, Roberto Morabito, Sasu Tarkoma, James Gross