Related papers: Cost-Effective Hyperparameter Optimization for Lar…

Optimizing Large Language Model Hyperparameters for Code Generation

Large Language Models (LLMs), such as GPT models, are increasingly used in software engineering for various tasks, such as code generation, requirements management, and debugging. While automating these tasks has garnered significant…

Software Engineering · Computer Science 2024-08-21 Chetan Arora, Ahnaf Ibn Sayeed, Sherlock Licorish, Fanyu Wang, Christoph Treude

Energy Considerations of Large Language Model Inference and Efficiency Optimizations

As large language models (LLMs) scale in size and adoption, their computational and environmental costs continue to rise. Prior benchmarking efforts have primarily focused on latency reduction in idealized settings, often overlooking the…

Computation and Language · Computer Science 2025-04-25 Jared Fernandez, Clara Na, Vashisth Tiwari, Yonatan Bisk, Sasha Luccioni, Emma Strubell

Cost-Performance Optimization for Processing Low-Resource Language Tasks Using Commercial LLMs

Large Language Models (LLMs) exhibit impressive zero/few-shot inference and generation quality for high-resource languages (HRLs). A few of them have been trained on low-resource languages (LRLs) and give decent performance. Owing to the…

Computation and Language · Computer Science 2024-04-22 Arijit Nag, Animesh Mukherjee, Niloy Ganguly, Soumen Chakrabarti

LLM-Guided Runtime Parameter Optimization for Energy-Efficient Model Inference

Large Language Models (LLMs) have become an integral part of many real-world workflows. However, LLMs consume a lot of energy, which becomes a major concern at the scale of demand for these tools. As LLMs become integrated into…

Software Engineering · Computer Science 2026-05-01 Katelyn Crumpacker, Dimitrios Nikolopoulos

AdaptiveLLM: A Framework for Selecting Optimal Cost-Efficient LLM for Code-Generation Based on CoT Length

While Large Language Models (LLMs) have significantly advanced code generation efficiency, they face inherent challenges in balancing performance and inference costs across diverse programming tasks. Dynamically selecting the optimal LLM…

Software Engineering · Computer Science 2025-06-13 Junhang Cheng, Fang Liu, Chengru Wu, Li Zhang

System-performance and cost modeling of Large Language Model training and inference

Large language models (LLMs), based on transformer architectures, have revolutionized numerous domains within artificial intelligence, science, and engineering due to their exceptional scalability and adaptability. However, the exponential…

Hardware Architecture · Computer Science 2025-07-04 Wenzhe Guo, Joyjit Kundu, Uras Tos, Weijiang Kong, Giuliano Sisto, Timon Evenblij, Manu Perumkunnil

Using Large Language Models for Hyperparameter Optimization

This paper explores the use of foundational large language models (LLMs) in hyperparameter optimization (HPO). Hyperparameters are critical in determining the effectiveness of machine learning models, yet their optimization often relies on…

Machine Learning · Computer Science 2024-11-12 Michael R. Zhang, Nishkrit Desai, Juhan Bae, Jonathan Lorraine, Jimmy Ba

A Survey on Inference Engines for Large Language Models: Perspectives on Optimization and Efficiency

Large language models (LLMs) are widely applied in chatbots, code generators, and search engines. Workloads such as chain-of-thought, complex reasoning, and agent services significantly increase the inference cost by invoking the model…

Computation and Language · Computer Science 2025-11-27 Sihyeong Park, Sungryeol Jeon, Chaelyn Lee, Seokhun Jeon, Byung-Soo Kim, Jemin Lee

The Impact of Hyperparameters on Large Language Model Inference Performance: An Evaluation of vLLM and HuggingFace Pipelines

The recent surge of open-source large language models (LLMs) enables developers to create AI-based solutions while maintaining control over aspects such as privacy and compliance, thereby providing governance and ownership of the model…

Software Engineering · Computer Science 2024-08-05 Matias Martinez

Cheaply Evaluating Inference Efficiency Metrics for Autoregressive Transformer APIs

Large language models (LLMs) power many state-of-the-art systems in natural language processing. However, these models are extremely computationally expensive, even at inference time, raising the natural question: when is the extra cost of…

Machine Learning · Computer Science 2023-05-05 Deepak Narayanan, Keshav Santhanam, Peter Henderson, Rishi Bommasani, Tony Lee, Percy Liang

FrugalGPT: How to Use Large Language Models While Reducing Cost and Improving Performance

There is a rapidly growing number of large language models (LLMs) that users can query for a fee. We review the cost associated with querying popular LLM APIs, e.g. GPT-4, ChatGPT, J1-Jumbo, and find that these models have heterogeneous…

Machine Learning · Computer Science 2023-05-10 Lingjiao Chen, Matei Zaharia, James Zou

FlexGen: High-Throughput Generative Inference of Large Language Models with a Single GPU

The high computational and memory requirements of large language model (LLM) inference make it feasible only with multiple high-end accelerators. Motivated by the emerging demand for latency-insensitive tasks with batched processing, this…

Machine Learning · Computer Science 2023-06-13 Ying Sheng, Lianmin Zheng, Binhang Yuan, Zhuohan Li, Max Ryabinin, Daniel Y. Fu, Zhiqiang Xie, Beidi Chen, Clark Barrett, Joseph E. Gonzalez, Percy Liang, Christopher Ré, Ion Stoica, Ce Zhang

An investigation on the use of Large Language Models for hyperparameter tuning in Evolutionary Algorithms

Hyperparameter optimization is a crucial problem in Evolutionary Computation. In fact, the values of the hyperparameters directly impact the trajectory taken by the optimization process, and their choice requires extensive reasoning by…

Neural and Evolutionary Computing · Computer Science 2024-08-06 Leonardo Lucio Custode, Fabio Caraffini, Anil Yaman, Giovanni Iacca

Generating Data for Symbolic Language with Large Language Models

While large language models (LLMs) bring not only performance but also complexity, recent work has started to turn LLMs into data generators rather than task inferencers, where another affordable task model is trained for efficient…

Computation and Language · Computer Science 2023-05-24 Jiacheng Ye, Chengzu Li, Lingpeng Kong, Tao Yu

Towards Sustainable NLP: Insights from Benchmarking Inference Energy in Large Language Models

Large language models (LLMs) are increasingly recognized for their exceptional generative capabilities and versatility across various tasks. However, the high inference costs associated with these models have not received adequate…

Computation and Language · Computer Science 2025-03-18 Soham Poddar, Paramita Koley, Janardan Misra, Sanjay Podder, Niloy Ganguly, Saptarshi Ghosh

From Decoding to Meta-Generation: Inference-time Algorithms for Large Language Models

One of the most striking findings in modern research on large language models (LLMs) is that scaling up compute during training leads to better results. However, less attention has been given to the benefits of scaling compute during…

Computation and Language · Computer Science 2024-11-21 Sean Welleck, Amanda Bertsch, Matthew Finlayson, Hailey Schoelkopf, Alex Xie, Graham Neubig, Ilia Kulikov, Zaid Harchaoui

LLM in a flash: Efficient Large Language Model Inference with Limited Memory

Large language models (LLMs) are central to modern natural language processing, delivering exceptional performance in various tasks. However, their substantial computational and memory requirements present challenges, especially for devices…

Computation and Language · Computer Science 2024-08-01 Keivan Alizadeh, Iman Mirzadeh, Dmitry Belenko, Karen Khatamifard, Minsik Cho, Carlo C Del Mundo, Mohammad Rastegari, Mehrdad Farajtabar

From Words to Watts: Benchmarking the Energy Costs of Large Language Model Inference

Large language models (LLMs) have exploded in popularity due to their new generative capabilities that go far beyond prior state-of-the-art. These technologies are increasingly being leveraged in various domains such as law, finance, and…

Computation and Language · Computer Science 2023-10-05 Siddharth Samsi, Dan Zhao, Joseph McDonald, Baolin Li, Adam Michaleas, Michael Jones, William Bergeron, Jeremy Kepner, Devesh Tiwari, Vijay Gadepally

Strategic Optimization and Challenges of Large Language Models in Object-Oriented Programming

In the area of code generation research, the emphasis has transitioned from crafting individual functions to developing class-level method code that integrates contextual information. This shift has brought several benchmarks such as…

Software Engineering · Computer Science 2024-08-28 Zinan Wang

SMART: Automatically Scaling Down Language Models with Accuracy Guarantees for Reduced Processing Fees

The advancement of Large Language Models (LLMs) has significantly boosted performance in natural language processing (NLP) tasks. However, the deployment of high-performance LLMs incurs substantial costs, primarily due to the increased…

Machine Learning · Computer Science 2024-03-22 Saehan Jo, Immanuel Trummer