English
Related papers

Related papers: Queueing, Predictions, and LLMs: Challenges and Op…

200 papers

Software systems usually provide numerous configuration options that can affect performance metrics such as execution time, memory usage, binary size, or bitrate. On the one hand, making informed decisions is challenging and requires domain…

Large language model (LLM) serving is becoming an increasingly critical workload for cloud providers. Existing LLM serving systems focus on interactive requests, such as chatbots and coding assistants, with tight latency SLO requirements.…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-02-26 Archit Patke , Dhemath Reddy , Saurabh Jha , Haoran Qiu , Christian Pinto , Chandra Narayanaswami , Zbigniew Kalbarczyk , Ravishankar Iyer

Machine learning can provide deep insights into data, allowing machines to make high-quality predictions and having been widely used in real-world applications, such as text mining, visual classification, and recommender systems. However,…

Machine Learning · Computer Science 2020-08-11 Meng Wang , Weijie Fu , Xiangnan He , Shijie Hao , Xindong Wu

We study the problem of optimizing Large Language Model (LLM) inference scheduling to minimize total latency. LLM inference is an online and multi-task service process and also heavily energy consuming by which a pre-trained LLM processes…

Machine Learning · Computer Science 2025-09-03 Zixi Chen , Yinyu Ye , Zijie Zhou

We investigate the predictability of large language model (LLM) capabilities: given records of past experiments using different model families, numbers of parameters, tasks, and numbers of in-context examples, can we accurately predict LLM…

Computation and Language · Computer Science 2023-11-01 Qinyuan Ye , Harvey Yiyun Fu , Xiang Ren , Robin Jia

Automated planning is concerned with developing efficient algorithms to generate plans or sequences of actions to achieve a specific goal in a given environment. Emerging Large Language Models (LLMs) can answer questions, write high-quality…

Pre-trained large language models (LLM) have emerged as a powerful tool for simulating various scenarios and generating output given specific instructions and multimodal input. In this work, we analyze the specific use of LLM to enhance a…

Machine Learning · Computer Science 2024-05-10 Yuhang Wu , Yingfei Wang , Chu Wang , Zeyu Zheng

In Large Language Model (LLM) inference, the output length of an LLM request is typically regarded as not known a priori. Consequently, most LLM serving systems employ a simple First-come-first-serve (FCFS) scheduling strategy, leading to…

Machine Learning · Computer Science 2024-08-29 Yichao Fu , Siqi Zhu , Runlong Su , Aurick Qiao , Ion Stoica , Hao Zhang

Recent work exploring the capabilities of pre-trained large language models (LLMs) has demonstrated their ability to act as general pattern machines by completing complex token sequences representing a wide array of tasks, including…

Computers and Society · Computer Science 2024-03-25 Seyed Parsa Neshaei , Richard Lee Davis , Adam Hazimeh , Bojan Lazarevski , Pierre Dillenbourg , Tanja Käser

Large language models (LLMs) are widely applied in chatbots, code generators, and search engines. Workload such as chain-of-throught, complex reasoning, agent services significantly increase the inference cost by invoke the model…

Computation and Language · Computer Science 2025-11-27 Sihyeong Park , Sungryeol Jeon , Chaelyn Lee , Seokhun Jeon , Byung-Soo Kim , Jemin Lee

Large language models (LLMs) have been applied in many fields and have developed rapidly in recent years. As a classic machine learning task, time series forecasting has recently been boosted by LLMs. Recent works treat large language…

Computation and Language · Computer Science 2024-12-31 Hua Tang , Chong Zhang , Mingyu Jin , Qinkai Yu , Zhenting Wang , Xiaobo Jin , Yongfeng Zhang , Mengnan Du

The job shop scheduling problem (JSSP) remains a significant hurdle in optimizing production processes. This challenge involves efficiently allocating jobs to a limited number of machines while minimizing factors like total processing time…

Artificial Intelligence · Computer Science 2024-08-14 Henrik Abgaryan , Ararat Harutyunyan , Tristan Cazenave

The rapid adoption of large language models (LLMs) has created significant challenges for efficient inference at scale. Unlike traditional workloads, LLM inference is constrained by both computation and the memory overhead of key-value (KV)…

Machine Learning · Computer Science 2026-05-07 Chengyi Nie , Nian Si , Zijie Zhou

Objective: Large language models (LLMs) are attracting increasing interest in healthcare. This commentary evaluates the potential of LLMs to improve clinical prediction models (CPMs) for diagnostic and prognostic tasks, with a focus on…

Computers and Society · Computer Science 2025-11-07 Yusuf Yildiz , Goran Nenadic , Meghna Jani , David A. Jenkins

Mobility analysis is a crucial element in the research area of transportation systems. Forecasting traffic information offers a viable solution to address the conflict between increasing transportation demands and the limitations of…

Machine Learning · Computer Science 2025-02-24 Zijian Zhang , Yujie Sun , Zepu Wang , Yuqi Nie , Xiaobo Ma , Ruolin Li , Peng Sun , Xuegang Ban

Large Language Model (LLM) inference, where a trained model generates text one word at a time in response to user prompts, is a computationally intensive process requiring efficient scheduling to optimize latency and resource utilization. A…

Machine Learning · Computer Science 2026-01-16 Patrick Jaillet , Jiashuo Jiang , Konstantina Mellou , Marco Molinaro , Chara Podimata , Zijie Zhou

The planning ability of Large Language Models (LLMs) has garnered increasing attention in recent years due to their remarkable capacity for multi-step reasoning and their ability to generalize across a wide range of domains. While some…

Artificial Intelligence · Computer Science 2025-02-19 Mohamed Aghzal , Erion Plaku , Gregory J. Stein , Ziyu Yao

This is the first work to look at the application of large language models (LLMs) for the purpose of model space edits in automated planning tasks. To set the stage for this union, we explore two different flavors of model space problems…

Artificial Intelligence · Computer Science 2024-03-06 Turgay Caglar , Sirine Belhaj , Tathagata Chakraborti , Michael Katz , Sarath Sreedharan

Accurate estimation of project costs and durations remains a pivotal challenge in software engineering, directly impacting budgeting and resource management. Traditional estimation techniques, although widely utilized, often fall short due…

Software Engineering · Computer Science 2024-09-17 Justin Carpenter , Chia-Ying Wu , Nasir U. Eisty

Forecasting future events is important for policy and decision making. In this work, we study whether language models (LMs) can forecast at the level of competitive human forecasters. Towards this goal, we develop a retrieval-augmented LM…

Machine Learning · Computer Science 2024-02-29 Danny Halawi , Fred Zhang , Chen Yueh-Han , Jacob Steinhardt
‹ Prev 1 2 3 10 Next ›