Related papers: AI Powered Compiler Techniques for DL Code Optimiz…

Towards an Achievable Performance for the Loop Nests

Numerous code optimization techniques, including loop nest optimizations, have been developed over the last four decades. Loop optimization techniques transform loop nests to improve the performance of the code on a target architecture,…

Performance · Computer Science 2019-11-27 Aniket Shivam , Neftali Watkinson , Alexandru Nicolau , David Padua , Alexander V. Veidenbaum

PolyDL: Polyhedral Optimizations for Creation of High Performance DL primitives

Deep Neural Networks (DNNs) have revolutionized many aspects of our lives. The use of DNNs is becoming ubiquitous including in softwares for image recognition, speech recognition, speech synthesis, language translation, to name a few. he…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-11-18 Sanket Tavarageri , Alexander Heinecke , Sasikanth Avancha , Gagandeep Goyal , Ramakrishna Upadrasta , Bharat Kaul

Cortex: A Compiler for Recursive Deep Learning Models

Optimizing deep learning models is generally performed in two steps: (i) high-level graph optimizations such as kernel fusion and (ii) low level kernel optimizations such as those found in vendor libraries. This approach often leaves…

Machine Learning · Computer Science 2021-03-08 Pratik Fegade , Tianqi Chen , Phillip B. Gibbons , Todd C. Mowry

Agentic Code Optimization via Compiler-LLM Cooperation

Generating performant executables from high level languages is critical to software performance across a wide range of domains. Modern compilers perform this task by passing code through a series of well-studied optimizations at…

Programming Languages · Computer Science 2026-04-07 Benjamin Mikek , Danylo Vashchilenko , Bryan Lu , Panpan Xu

PolyScientist: Automatic Loop Transformations Combined with Microkernels for Optimization of Deep Learning Primitives

At the heart of deep learning training and inferencing are computationally intensive primitives such as convolutions which form the building blocks of deep neural networks. Researchers have taken two distinct approaches to creating high…

Programming Languages · Computer Science 2020-02-07 Sanket Tavarageri , Alexander Heinecke , Sasikanth Avancha , Gagandeep Goyal , Ramakrishna Upadrasta , Bharat Kaul

Adaptive Neural Compilation

This paper proposes an adaptive neural-compilation framework to address the problem of efficient program learning. Traditional code optimisation strategies used in compilers are based on applying pre-specified set of transformations that…

Artificial Intelligence · Computer Science 2016-05-27 Rudy Bunel , Alban Desmaison , Pushmeet Kohli , Philip H. S. Torr , M. Pawan Kumar

Agentic Auto-Scheduling: An Experimental Study of LLM-Guided Loop Optimization

Automatic code optimization remains a difficult challenge, particularly for complex loop nests on modern hardware. This paper investigates a novel approach to code optimization where Large Language Models (LLMs) guide the process through a…

Programming Languages · Computer Science 2025-12-30 Massinissa Merouani , Islem Kara Bernou , Riyadh Baghdadi

Categorization of Program Regions for Agile Compilation using Machine Learning and Hardware Support

A compiler processes the code written in a high level language and produces machine executable code. The compiler writers often face the challenge of keeping the compilation times reasonable. That is because aggressive optimization passes…

Programming Languages · Computer Science 2019-05-30 Sanket Tavarageri

High Performance GPU Code Generation for Matrix-Matrix Multiplication using MLIR: Some Early Results

This report presents some early results on code generation targeting tensor cores on NVIDIA GPUs using the MLIR compiler infrastructure. The state-of-the-art in high-performance deep learning today is primarily driven by manually optimized…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-08-31 Navdeep Katel , Vivek Khandelwal , Uday Bondhugula

Learning to Make Compiler Optimizations More Effective

Because loops execute their body many times, compiler developers place much emphasis on their optimization. Nevertheless, in view of highly diverse source code and hardware, compilers still struggle to produce optimal target code. The sheer…

Programming Languages · Computer Science 2021-03-01 Rahim Mammadli , Marija Selakovic , Felix Wolf , Michael Pradel

Should AI Optimize Your Code? A Comparative Study of Classical Optimizing Compilers Versus Current Large Language Models

Traditional optimizing compilers have played an important role in adapting to the growing complexity of modern software systems. The need for efficient parallel programming in current architectures requires strong optimization techniques.…

Artificial Intelligence · Computer Science 2025-04-03 Miguel Romero Rosas , Miguel Torres Sanchez , Rudolf Eigenmann

DLFusion: An Auto-Tuning Compiler for Layer Fusion on Deep Neural Network Accelerator

Many hardware vendors have introduced specialized deep neural networks (DNN) accelerators owing to their superior performance and efficiency. As such, how to generate and optimize the code for the hardware accelerator becomes an important…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-11-12 Zihan Liu , Jingwen Leng , Quan Chen , Chao Li , Wenli Zheng , Li Li , Minyi Guo

TensorIR: An Abstraction for Automatic Tensorized Program Optimization

Deploying deep learning models on various devices has become an important topic. The wave of hardware specialization brings a diverse set of acceleration primitives for multi-dimensional tensor computations. These new acceleration…

Machine Learning · Computer Science 2022-10-31 Siyuan Feng , Bohan Hou , Hongyi Jin , Wuwei Lin , Junru Shao , Ruihang Lai , Zihao Ye , Lianmin Zheng , Cody Hao Yu , Yong Yu , Tianqi Chen

MCompiler: A Synergistic Compilation Framework

This paper presents a meta-compilation framework, the MCompiler. The main idea is that different segments of a program can be compiled with different compilers/optimizers and combined into a single executable. The MCompiler can be used in a…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-05-31 Aniket Shivam , Alexandru Nicolau , Alexander V. Veidenbaum

Progress Report: A Deep Learning Guided Exploration of Affine Unimodular Loop Transformations

In this paper, we present a work in progress about a deep learning based approach for automatic code optimization in polyhedral compilers. The proposed technique explores combinations of affine and non-affine loop transformations to find…

Programming Languages · Computer Science 2022-06-09 Massinissa Merouani , Khaled Afif Boudaoud , Iheb Nassim Aouadj , Nassim Tchoulak , Fatima Benbouzid-Sitayeb , Karima Benatchba , Hugh Leather , Riyadh Baghdadi

SimulatorCoder: DNN Accelerator Simulator Code Generation and Optimization via Large Language Models

This paper presents SimulatorCoder, an agent powered by large language models (LLMs), designed to generate and optimize deep neural network (DNN) accelerator simulators based on natural language descriptions. By integrating domain-specific…

Hardware Architecture · Computer Science 2026-02-20 Yuhuan Xia , Tun Li , Hongji Zhou , Xianfa Zhou , Chong Chen , Ruiyu Zhang

Pearl: Automatic Code Optimization Using Deep Reinforcement Learning

Compilers are crucial in optimizing programs and accelerating their execution. However, optimizing programs automatically using compilers is not trivial. Recent work has attempted to use reinforcement learning (RL) to solve this problem. It…

Programming Languages · Computer Science 2025-06-03 Djamel Rassem Lamouri , Iheb Nassim Aouadj , Smail Kourta , Riyadh Baghdadi

Learning Performance-Improving Code Edits

With the decline of Moore's law, optimizing program performance has become a major focus of software research. However, high-level optimizations such as API and algorithm changes remain elusive due to the difficulty of understanding the…

Software Engineering · Computer Science 2024-04-29 Alexander Shypula , Aman Madaan , Yimeng Zeng , Uri Alon , Jacob Gardner , Milad Hashemi , Graham Neubig , Parthasarathy Ranganathan , Osbert Bastani , Amir Yazdanbakhsh

PolyBlocks: A Compiler Infrastructure for AI Chips and Programming Frameworks

We present the design and implementation of PolyBlocks, a modular and reusable MLIR-based compiler infrastructure for AI programming frameworks and AI chips. PolyBlocks is based on pass pipelines that compose transformations on loop nests…

Programming Languages · Computer Science 2026-03-11 Uday Bondhugula , Akshay Baviskar , Navdeep Katel , Vimal Patel , Anoop JS , Arnab Dutta

Chameleon: Adaptive Code Optimization for Expedited Deep Neural Network Compilation

Achieving faster execution with shorter compilation time can foster further diversity and innovation in neural networks. However, the current paradigm of executing neural networks either relies on hand-optimized libraries, traditional…

Machine Learning · Computer Science 2020-01-27 Byung Hoon Ahn , Prannoy Pilligundla , Amir Yazdanbakhsh , Hadi Esmaeilzadeh