Related papers: MobileKernelBench: Can LLMs Write Efficient Kernel…

MultiKernelBench: A Multi-Platform Benchmark for Kernel Generation

The automatic generation of deep learning (DL) kernels using large language models (LLMs) has emerged as a promising approach to reduce the manual effort and hardware-specific expertise required for writing high-performance operator…

Distributed, Parallel, and Cluster Computing · Computer Science · 2025-07-29 · Zhongzhen Wen, Yinghui Zhang, Zhong Li, Zhongxin Liu, Linna Xie, Tian Zhang

MobileAIBench: Benchmarking LLMs and LMMs for On-Device Use Cases

The deployment of Large Language Models (LLMs) and Large Multimodal Models (LMMs) on mobile devices has gained significant attention due to the benefits of enhanced privacy, stability, and personalization. However, the hardware constraints…

Computation and Language · Computer Science · 2024-06-18 · Rithesh Murthy, Liangwei Yang, Juntao Tan, Tulika Manoj Awalgaonkar, Yilun Zhou, Shelby Heinecke, Sachin Desai, Jason Wu, Ran Xu, Sarah Tan, Jianguo Zhang, Zhiwei Liu, Shirley Kokane, Zuxin Liu, Ming Zhu, Huan Wang, Caiming Xiong, Silvio Savarese

KernelBench: Can LLMs Write Efficient GPU Kernels?

Efficient GPU kernels are crucial for building performant machine learning architectures, but writing them is a time-consuming challenge that requires significant expertise; therefore, we explore using language models (LMs) to automate…

Machine Learning · Computer Science · 2025-02-18 · Anne Ouyang, Simon Guo, Simran Arora, Alex L. Zhang, William Hu, Christopher Ré, Azalia Mirhoseini

MobileAgentBench: An Efficient and User-Friendly Benchmark for Mobile LLM Agents

Large language model (LLM)-based mobile agents are increasingly popular due to their capability to interact directly with mobile phone Graphic User Interfaces (GUIs) and their potential to autonomously manage daily tasks. Despite their…

Artificial Intelligence · Computer Science · 2024-06-13 · Luyuan Wang, Yongyu Deng, Yiwei Zha, Guodong Mao, Qinmin Wang, Tianchen Min, Wei Chen, Shoufa Chen

PalmBench: A Comprehensive Benchmark of Compressed Large Language Models on Mobile Platforms

Deploying large language models (LLMs) locally on mobile devices is advantageous in scenarios where transmitting data to remote cloud servers is either undesirable due to privacy concerns or impractical due to network connection. Recent…

Machine Learning · Computer Science · 2025-01-10 · Yilong Li, Jingyu Liu, Hao Zhang, M Badri Narayanan, Utkarsh Sharma, Shuai Zhang, Pan Hu, Yijing Zeng, Jayaram Raghuram, Suman Banerjee

Mobile-Bench: An Evaluation Benchmark for LLM-based Mobile Agents

With the remarkable advancements of large language models (LLMs), LLM-based agents have become a research hotspot in human-computer interaction. However, there is a scarcity of benchmarks available for LLM-based mobile agents. Benchmarking…

Artificial Intelligence · Computer Science · 2024-07-02 · Shihan Deng, Weikai Xu, Hongda Sun, Wei Liu, Tao Tan, Jianfeng Liu, Ang Li, Jian Luan, Bin Wang, Rui Yan, Shuo Shang

MELTing point: Mobile Evaluation of Language Transformers

Transformers have revolutionized the machine learning landscape, gradually making their way into everyday tasks and equipping our computers with "sparks of intelligence". However, their runtime requirements have prevented them from being…

Machine Learning · Computer Science · 2024-07-29 · Stefanos Laskaridis, Kleomenis Katevas, Lorenzo Minto, Hamed Haddadi

Towards Automated Kernel Generation in the Era of LLMs

The performance of modern AI systems is fundamentally constrained by the quality of their underlying kernels, which translate high-level algorithmic semantics into low-level hardware operations. Achieving near-optimal kernels requires…

Machine Learning · Computer Science · 2026-01-27 · Yang Yu, Peiyu Zang, Chi Hsu Tsai, Haiming Wu, Yixin Shen, Jialing Zhang, Haoyu Wang, Zhiyou Xiao, Jingze Shi, Yuyu Luo, Wentao Zhang, Chunlei Men, Guang Liu, Yonghua Lin

MNN-LLM: A Generic Inference Engine for Fast Large Language Model Deployment on Mobile Devices

Large language models (LLMs) have demonstrated exceptional performance across a variety of tasks. However, their substantial scale leads to significant computational resource consumption during inference, resulting in high costs.…

Machine Learning · Computer Science · 2025-06-13 · Zhaode Wang, Jingbang Yang, Xinyu Qian, Shiwen Xing, Xiaotang Jiang, Chengfei Lv, Shengyu Zhang

MobileLLM-Pro Technical Report

Efficient on-device language models around 1 billion parameters are essential for powering low-latency AI applications on mobile and wearable devices. However, achieving strong performance in this model class, while supporting long context…

Machine Learning · Computer Science · 2025-11-11 · Patrick Huber, Ernie Chang, Wei Wen, Igor Fedorov, Tarek Elgamal, Hanxian Huang, Naveen Suda, Chinnadhurai Sankar, Vish Vogeti, Yanghan Wang, Alex Gladkov, Kai Sheng Tai, Abdelrahman Elogeel, Tarek Hefny, Vikas Chandra, Ahmed Aly, Anuj Kumar, Raghuraman Krishnamoorthi, Adithya Sagar

MobileQuant: Mobile-friendly Quantization for On-device Language Models

Large language models (LLMs) have revolutionized language processing, delivering outstanding results across multiple applications. However, deploying LLMs on edge devices poses several challenges with respect to memory, energy, and compute…

Computation and Language · Computer Science · 2024-10-07 · Fuwen Tan, Royson Lee, Łukasz Dudziak, Shell Xu Hu, Sourav Bhattacharya, Timothy Hospedales, Georgios Tzimiropoulos, Brais Martinez

MobileExperts: A Dynamic Tool-Enabled Agent Team in Mobile Devices

The attainment of autonomous operations in mobile computing devices has consistently been a goal of human pursuit. With the development of Large Language Models (LLMs) and Visual Language Models (VLMs), this aspiration is progressively…

Artificial Intelligence · Computer Science · 2024-07-08 · Jiayi Zhang, Chuang Zhao, Yihan Zhao, Zhaoyang Yu, Ming He, Jianping Fan

AscendKernelGen: A Systematic Study of LLM-Based Kernel Generation for Neural Processing Units

To meet the ever-increasing demand for computational efficiency, Neural Processing Units (NPUs) have become critical in modern AI infrastructure. However, unlocking their full potential requires developing high-performance compute kernels…

Artificial Intelligence · Computer Science · 2026-04-20 · Xinzi Cao, Jianyang Zhai, Pengfei Li, Zhiheng Hu, Cen Yan, Bingxu Mu, Guanghuan Fang, Bin She, Jiayu Li, Yihan Su, Dongyang Tao, Xiansong Huang, Fan Xu, Feidiao Yang, Yao Lu, Chang-Dong Wang, Yutong Lu, Weicheng Xue, Bin Zhou, Yonghong Tian

HammerBench: Fine-Grained Function-Calling Evaluation in Real Mobile Device Scenarios

Evaluating the performance of LLMs in multi-turn human-agent interactions presents significant challenges, particularly due to the complexity and variability of user behavior. In this paper, we introduce HammerBench, a novel benchmark…

Computation and Language · Computer Science · 2025-02-18 · Jun Wang, Jiamu Zhou, Muning Wen, Xiaoyun Mo, Haoyu Zhang, Qiqiang Lin, Cheng Jin, Xihuai Wang, Weinan Zhang, Qiuying Peng, Jun Wang

SmartBench: Is Your LLM Truly a Good Chinese Smartphone Assistant?

Large Language Models (LLMs) have become integral to daily life, especially advancing as intelligent assistants through on-device deployment on smartphones. However, existing LLM evaluation benchmarks predominantly focus on objective tasks…

Computation and Language · Computer Science · 2025-08-27 · Xudong Lu, Haohao Gao, Renshou Wu, Shuai Ren, Xiaoxin Chen, Hongsheng Li, Fangyuan Li

Mobile-MMLU: A Mobile Intelligence Language Understanding Benchmark

Rapid advancements in large language models (LLMs) have increased interest in deploying them on mobile devices for on-device AI applications. Mobile users interact differently with LLMs compared to desktop users, creating unique…

Computation and Language · Computer Science · 2025-03-27 · Sondos Mahmoud Bsharat, Mukul Ranjan, Aidar Myrzakhan, Jiacheng Liu, Bowei Guo, Shengkun Tang, Zhuang Liu, Yuanzhi Li, Zhiqiang Shen

LLMCad: Fast and Scalable On-device Large Language Model Inference

Generative tasks, such as text generation and question answering, hold a crucial position in the realm of mobile applications. Due to their sensitivity to privacy concerns, there is a growing demand for their execution directly on mobile…

Networking and Internet Architecture · Computer Science · 2023-09-11 · Daliang Xu, Wangsong Yin, Xin Jin, Ying Zhang, Shiyun Wei, Mengwei Xu, Xuanzhe Liu

Understanding the Weakness of Large Language Model Agents within a Complex Android Environment

Large language models (LLMs) have empowered intelligent agents to execute intricate tasks within domain-specific software such as browsers and games. However, when applied to general-purpose software systems like operating systems, LLM…

Artificial Intelligence · Computer Science · 2024-02-12 · Mingzhe Xing, Rongkai Zhang, Hui Xue, Qi Chen, Fan Yang, Zhen Xiao

MobileSafetyBench: Evaluating Safety of Autonomous Agents in Mobile Device Control

Autonomous agents powered by large language models (LLMs) show promising potential in assistive tasks across various domains, including mobile device control. As these agents interact directly with personal information and device settings,…

Machine Learning · Computer Science · 2026-01-28 · Juyong Lee, Dongyoon Hahm, June Suk Choi, W. Bradley Knox, Kimin Lee

Large Language Models for Code Generation: A Comprehensive Survey of Challenges, Techniques, Evaluation, and Applications

Large Language Models (LLMs) have demonstrated their remarkable capabilities in numerous fields. This survey focuses on how LLMs empower users, regardless of their technical background, to use human languages to automatically generate…

Software Engineering · Computer Science · 2025-04-03 · Nam Huynh, Beiyu Lin