Related papers: Cost-Effective Hyperparameter Optimization for Lar…

200 papers

Large Language Models (LLMs), such as GPT models, are increasingly used in software engineering for various tasks, such as code generation, requirements management, and debugging. While automating these tasks has garnered significant…

Software Engineering · Computer Science 2024-08-21 Chetan Arora, Ahnaf Ibn Sayeed, Sherlock Licorish, Fanyu Wang, Christoph Treude

As large language models (LLMs) scale in size and adoption, their computational and environmental costs continue to rise. Prior benchmarking efforts have primarily focused on latency reduction in idealized settings, often overlooking the…

Computation and Language · Computer Science 2025-04-25 Jared Fernandez, Clara Na, Vashisth Tiwari, Yonatan Bisk, Sasha Luccioni, Emma Strubell

Large Language Models (LLMs) exhibit impressive zero/few-shot inference and generation quality for high-resource languages (HRLs). A few of them have been trained on low-resource languages (LRLs) and give decent performance. Owing to the…

Computation and Language · Computer Science 2024-04-22 Arijit Nag, Animesh Mukherjee, Niloy Ganguly, Soumen Chakrabarti

Large Language Models (LLMs) have become an integral part of many real-world workflows. However, LLMs consume a lot of energy, which becomes a large concern in the scale of the demand for these tools. As LLMs become integrated into…

Software Engineering · Computer Science 2026-05-01 Katelyn Crumpacker, Dimitrios Nikolopoulos

While Large Language Models (LLMs) have significantly advanced code generation efficiency, they face inherent challenges in balancing performance and inference costs across diverse programming tasks. Dynamically selecting the optimal LLM…

Software Engineering · Computer Science 2025-06-13 Junhang Cheng, Fang Liu, Chengru Wu, Li Zhang

Large language models (LLMs), based on transformer architectures, have revolutionized numerous domains within artificial intelligence, science, and engineering due to their exceptional scalability and adaptability. However, the exponential…

Hardware Architecture · Computer Science 2025-07-04 Wenzhe Guo, Joyjit Kundu, Uras Tos, Weijiang Kong, Giuliano Sisto, Timon Evenblij, Manu Perumkunnil

This paper explores the use of foundational large language models (LLMs) in hyperparameter optimization (HPO). Hyperparameters are critical in determining the effectiveness of machine learning models, yet their optimization often relies on…

Machine Learning · Computer Science 2024-11-12 Michael R. Zhang, Nishkrit Desai, Juhan Bae, Jonathan Lorraine, Jimmy Ba

Large language models (LLMs) are widely applied in chatbots, code generators, and search engines. Workloads such as chain-of-thought, complex reasoning, and agent services significantly increase the inference cost by invoking the model…

Computation and Language · Computer Science 2025-11-27 Sihyeong Park, Sungryeol Jeon, Chaelyn Lee, Seokhun Jeon, Byung-Soo Kim, Jemin Lee

The recent surge of open-source large language models (LLMs) enables developers to create AI-based solutions while maintaining control over aspects such as privacy and compliance, thereby providing governance and ownership of the model…

Software Engineering · Computer Science 2024-08-05 Matias Martinez

Large language models (LLMs) power many state-of-the-art systems in natural language processing. However, these models are extremely computationally expensive, even at inference time, raising the natural question: when is the extra cost of…

Machine Learning · Computer Science 2023-05-05 Deepak Narayanan, Keshav Santhanam, Peter Henderson, Rishi Bommasani, Tony Lee, Percy Liang

There is a rapidly growing number of large language models (LLMs) that users can query for a fee. We review the cost associated with querying popular LLM APIs, e.g. GPT-4, ChatGPT, J1-Jumbo, and find that these models have heterogeneous…

Machine Learning · Computer Science 2023-05-10 Lingjiao Chen, Matei Zaharia, James Zou

The high computational and memory requirements of large language model (LLM) inference make it feasible only with multiple high-end accelerators. Motivated by the emerging demand for latency-insensitive tasks with batched processing, this…

Hyperparameter optimization is a crucial problem in Evolutionary Computation. In fact, the values of the hyperparameters directly impact the trajectory taken by the optimization process, and their choice requires extensive reasoning by…

Neural and Evolutionary Computing · Computer Science 2024-08-06 Leonardo Lucio Custode, Fabio Caraffini, Anil Yaman, Giovanni Iacca

While large language models (LLMs) bring not only performance but also complexity, recent work has started to turn LLMs into data generators rather than task inferencers, where another affordable task model is trained for efficient…

Computation and Language · Computer Science 2023-05-24 Jiacheng Ye, Chengzu Li, Lingpeng Kong, Tao Yu

Large language models (LLMs) are increasingly recognized for their exceptional generative capabilities and versatility across various tasks. However, the high inference costs associated with these models have not received adequate…

Computation and Language · Computer Science 2025-03-18 Soham Poddar, Paramita Koley, Janardan Misra, Sanjay Podder, Niloy Ganguly, Saptarshi Ghosh

One of the most striking findings in modern research on large language models (LLMs) is that scaling up compute during training leads to better results. However, less attention has been given to the benefits of scaling compute during…

Computation and Language · Computer Science 2024-11-21 Sean Welleck, Amanda Bertsch, Matthew Finlayson, Hailey Schoelkopf, Alex Xie, Graham Neubig, Ilia Kulikov, Zaid Harchaoui

Large language models (LLMs) are central to modern natural language processing, delivering exceptional performance in various tasks. However, their substantial computational and memory requirements present challenges, especially for devices…

Large language models (LLMs) have exploded in popularity due to their new generative capabilities that go far beyond prior state-of-the-art. These technologies are increasingly being leveraged in various domains such as law, finance, and…

In the area of code generation research, the emphasis has transitioned from crafting individual functions to developing class-level method code that integrates contextual information. This shift has brought several benchmarks such as…

Software Engineering · Computer Science 2024-08-28 Zinan Wang

The advancement of Large Language Models (LLMs) has significantly boosted performance in natural language processing (NLP) tasks. However, the deployment of high-performance LLMs incurs substantial costs, primarily due to the increased…

Machine Learning · Computer Science 2024-03-22 Saehan Jo, Immanuel Trummer