Related papers: PROMPT: A Fast and Extensible Memory Profiling Fra…

200 papers

Profiling tools (also known as profilers) play an important role in understanding program performance at runtime, such as hotspots, bottlenecks, and inefficiencies. While profilers have been proven to be useful, they give extra burden to…

Software Engineering · Computer Science 2025-08-06 Zhuoran Liu

Profile Guided Optimization (PGO) uses runtime profiling to direct compiler optimization decisions, effectively combining static analysis with actual execution behavior to enhance performance. Runtime profiles, collected through…

Performance · Computer Science 2025-07-23 Bingxin Liu, Yinghui Huang, Jianhua Gao, Jianjun Shi, Yongpeng Liu, Yipin Sun, Weixing Ji

In many modern LLM applications, such as retrieval augmented generation, prompts have become programs themselves. In these settings, prompt programs are repeatedly called with different user queries or data instances. A big practical…

Computation and Language · Computer Science 2024-07-01 Tobias Schnabel, Jennifer Neville

Prompt optimization aims to find the best prompt to a large language model (LLM) for a given task. LLMs have been successfully used to help find and improve prompt candidates for single-step tasks. However, realistic tasks for agents are…

Computation and Language · Computer Science 2024-10-04 Yongchao Chen, Jacob Arkin, Yilun Hao, Yang Zhang, Nicholas Roy, Chuchu Fan

A growing number of service providers are exploring methods to improve server utilization and reduce power consumption by co-scheduling high-priority latency-critical workloads with best-effort workloads. This practice requires strict…

Machine Learning · Computer Science 2023-03-28 Drew Penney, Bin Li, Jaroslaw Sydir, Lizhong Chen, Charlie Tai, Stefan Lee, Eoin Walsh, Thomas Long

With the wide adoption of language models for IR, and specifically RAG systems, the latency of the underlying LLM becomes a crucial bottleneck, since the long contexts of retrieved passages lead to large prompts and therefore, compute…

Information Retrieval · Computer Science 2026-04-06 Cornelius Kummer, Lena Jurkschat, Michael Färber, Sahar Vahdati

Large language models (LLMs) have become increasingly capable of following instructions and complex reasoning, making prompting a flexible interface for adapting models without parameter updates. Yet prompt design remains labor-intensive…

Computation and Language · Computer Science 2026-05-22 Farima Fatahi Bayat, Moin Aminnaseri, Pouya Pezeshkpour, Estevam Hruschka

Deploying large language model (LLM)-driven conversational agents in enterprise settings requires prompts that are simultaneously correct at launch and resilient to the non-deterministic behavioral drift that characterizes production LLM…

Artificial Intelligence · Computer Science 2026-05-18 Keshava Chaitanya, Jahnavi Gundakaram

Foundation models face growing compute and memory bottlenecks, hindering deployment on resource-limited platforms. While compression techniques such as pruning and quantization are widely used, most rely on uniform heuristics that ignore…

Machine Learning · Computer Science 2025-09-09 Sadegh Jafari, Aishwarya Sarkar, Mohiuddin Bilwal, Ali Jannesari

Prompts are the interface for eliciting the capabilities of large language models (LLMs). Understanding their structure and components is critical for analyzing LLM behavior and optimizing performance. However, the field lacks a…

Computation and Language · Computer Science 2026-01-27 Sullam Jeoung, Yueyan Chen, Yi Zhang, Shuai Wang, Haibo Ding, Lin Lee Cheong

Prompt-learning has become a new paradigm in modern natural language processing, which directly adapts pre-trained language models (PLMs) to cloze-style prediction, autoregressive modeling, or sequence to sequence generation, resulting in…

Computation and Language · Computer Science 2021-11-04 Ning Ding, Shengding Hu, Weilin Zhao, Yulin Chen, Zhiyuan Liu, Hai-Tao Zheng, Maosong Sun

Prompt optimization has become crucial for enhancing the performance of large language models (LLMs) across a broad range of tasks. Although many research papers demonstrate its effectiveness, practical adoption is hindered because existing…

Computation and Language · Computer Science 2026-02-24 Tom Zehle, Timo Heiß, Moritz Schlager, Matthias Aßenmacher, Matthias Feurer

Embedded systems have proliferated in various consumer and industrial applications with the evolution of Cyber-Physical Systems and the Internet of Things. These systems are subjected to stringent constraints so that embedded software must…

Improving performance is a central concern for software developers. To locate optimization opportunities, developers rely on software profilers. However, these profilers only report where programs spent their time: optimizing that code may…

Performance · Computer Science 2016-08-15 Charlie Curtsinger, Emery D. Berger

Profiling techniques are used extensively at different parts of the computing stack to achieve many goals. One major goal is to make a piece of software execute more efficiently on a specific hardware platform, where efficiency spans…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-11-07 Chris Quackenbush, Mohamed Zahran

Large reasoning models (LRMs) excel at complex reasoning tasks but typically generate lengthy sequential chains-of-thought, resulting in long inference times before arriving at the final answer. To address this challenge, we introduce…

Artificial Intelligence · Computer Science 2025-12-04 Emil Biju, Shayan Talaei, Zhemin Huang, Mohammadreza Pourreza, Azalia Mirhoseini, Amin Saberi

In large language model (LLM)-based recommendation systems (LLM-RSs), accurately predicting user preferences by leveraging the general knowledge of LLMs is possible without requiring extensive training data. By converting recommendation…

Information Retrieval · Computer Science 2024-12-20 Genki Kusano, Kosuke Akimoto, Kunihiro Takeoka

Embedded Systems combine one or more processor cores with dedicated logic running on an ASIC or FPGA to meet design goals at reasonable cost. This is achieved by profiling the application across a variety of aspects such as performance, memory…

Performance · Computer Science 2013-12-12 Rajendra Patel, Arvind Rajwat

Large Language Models (LLMs) exhibit remarkable proficiency in addressing a diverse array of tasks within the Natural Language Processing (NLP) domain, with various prompt design strategies significantly augmenting their capabilities.…

Computation and Language · Computer Science 2024-08-05 Xiangyu Zhao, Chengqian Ma

Inspired by the dual-process theory of human cognition from Thinking, Fast and Slow, we introduce PRIME (Planning and Retrieval-Integrated Memory for Enhanced Reasoning), a multi-agent reasoning framework that dynamically…

Artificial Intelligence · Computer Science 2025-11-12 Hieu Tran, Zonghai Yao, Nguyen Luong Tran, Zhichao Yang, Feiyun Ouyang, Shuo Han, Razieh Rahimi, Hong Yu