Related papers: LOOPer: A Learned Automatic Code Optimizer For Pol…

Progress Report: A Deep Learning Guided Exploration of Affine Unimodular Loop Transformations

In this paper, we present a work in progress about a deep learning based approach for automatic code optimization in polyhedral compilers. The proposed technique explores combinations of affine and non-affine loop transformations to find…

Programming Languages · Computer Science 2022-06-09 Massinissa Merouani, Khaled Afif Boudaoud, Iheb Nassim Aouadj, Nassim Tchoulak, Fatima Benbouzid-Sitayeb, Karima Benatchba, Hugh Leather, Riyadh Baghdadi

LOOPerSet: A Large-Scale Dataset for Data-Driven Polyhedral Compiler Optimization

The advancement of machine learning for compiler optimization, particularly within the polyhedral model, is constrained by the scarcity of large-scale, public performance datasets. This data bottleneck forces researchers to undertake costly…

Programming Languages · Computer Science 2025-12-30 Massinissa Merouani, Afif Boudaoud, Riyadh Baghdadi

Learning to Make Compiler Optimizations More Effective

Because loops execute their body many times, compiler developers place much emphasis on their optimization. Nevertheless, in view of highly diverse source code and hardware, compilers still struggle to produce optimal target code. The sheer…

Programming Languages · Computer Science 2021-03-01 Rahim Mammadli, Marija Selakovic, Felix Wolf, Michael Pradel

A Deep Learning Based Cost Model for Automatic Code Optimization

Enabling compilers to automatically optimize code has been a longstanding goal for the compiler community. Efficiently solving this problem requires using precise cost models. These models predict whether applying a sequence of code…

Programming Languages · Computer Science 2021-04-13 Riyadh Baghdadi, Massinissa Merouani, Mohamed-Hicham Leghettas, Kamel Abdous, Taha Arbaoui, Karima Benatchba, Saman Amarasinghe

An Approach for Finding Permutations Quickly: Fusion and Dimension matching

Polyhedral compilers can perform complex loop optimizations that improve parallelism and cache behaviour of loops in the input program. These transformations result in significant performance gains on modern processors which have large…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-03-29 Aravind Acharya, Uday Bondhugula, Albert Cohen

PolyTOPS: Reconfigurable and Flexible Polyhedral Scheduler

Polyhedral techniques have been widely used for automatic code optimization in low-level compilers and higher-level processes. Loop optimization is central to this technique, and several polyhedral schedulers like Feautrier, Pluto, isl and…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-01-15 Gianpietro Consolaro, Zhen Zhang, Harenome Razanajato, Nelson Lossing, Nassim Tchoulak, Adilla Susungi, Artur Cesar Araujo Alves, Renwei Zhang, Denis Barthou, Corinne Ancourt, Cedric Bastoul

ACPO: AI-Enabled Compiler Framework

The key to performance optimization of a program is to decide correctly when a certain transformation should be applied by a compiler. This is an ideal opportunity to apply machine-learning models to speed up the tuning process; while this…

Programming Languages · Computer Science 2025-01-15 Amir H. Ashouri, Muhammad Asif Manzoor, Duc Minh Vu, Raymond Zhang, Colin Toft, Ziwen Wang, Angel Zhang, Bryan Chan, Tomasz S. Czajkowski, Yaoqing Gao

LoopTune: Optimizing Tensor Computations with Reinforcement Learning

Advanced compiler technology is crucial for enabling machine learning applications to run on novel hardware, but traditional compilers fail to deliver performance, popular auto-tuners have long search times and expert-optimized libraries…

Machine Learning · Computer Science 2023-11-09 Dejan Grubisic, Bram Wasti, Chris Cummins, John Mellor-Crummey, Aleksandar Zlateski

A Performance Vocabulary for Affine Loop Transformations

Modern polyhedral compilers excel at aggressively optimizing codes with static control parts, but the state-of-practice to find high-performance polyhedral transformations especially for different hardware targets still largely involves…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-04-10 Martin Kong, Louis-Noël Pouchet

AI Powered Compiler Techniques for DL Code Optimization

Creating high performance implementations of deep learning primitives on CPUs is a challenging task. Multiple considerations including multi-level cache hierarchy, and wide SIMD units of CPU platforms influence the choice of program…

Programming Languages · Computer Science 2021-04-13 Sanket Tavarageri, Gagandeep Goyal, Sasikanth Avancha, Bharat Kaul, Ramakrishna Upadrasta

LOOPRAG: Enhancing Loop Transformation Optimization with Retrieval-Augmented Large Language Models

Loop transformations are semantics-preserving optimization techniques, widely used to maximize objectives such as parallelism. Despite decades of research, applying the optimal composition of loop transformations remains challenging due to…

Programming Languages · Computer Science 2025-12-19 Yijie Zhi, Yayu Cao, Jianhua Dai, Xiaoyang Han, Jingwen Pu, Qingran Wu, Sheng Cheng, Ming Cai

Agentic Auto-Scheduling: An Experimental Study of LLM-Guided Loop Optimization

Automatic code optimization remains a difficult challenge, particularly for complex loop nests on modern hardware. This paper investigates a novel approach to code optimization where Large Language Models (LLMs) guide the process through a…

Programming Languages · Computer Science 2025-12-30 Massinissa Merouani, Islem Kara Bernou, Riyadh Baghdadi

Tiramisu: A Polyhedral Compiler for Expressing Fast and Portable Code

This paper introduces Tiramisu, a polyhedral framework designed to generate high performance code for multiple platforms including multicores, GPUs, and distributed machines. Tiramisu introduces a scheduling language with novel extensions…

Programming Languages · Computer Science 2018-12-21 Riyadh Baghdadi, Jessica Ray, Malek Ben Romdhane, Emanuele Del Sozzo, Abdurrahman Akkas, Yunming Zhang, Patricia Suriana, Shoaib Kamil, Saman Amarasinghe

DeepCompile: A Compiler-Driven Approach to Optimizing Distributed Deep Learning Training

The rapid growth of deep learning models has increased the demand for efficient distributed training strategies. Fully sharded approaches like ZeRO-3 and FSDP partition model parameters across GPUs and apply optimizations such as…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-02-20 Masahiro Tanaka, Du Li, Umesh Chand, Ali Zafar, Haiying Shen, Olatunji Ruwase

Pearl: Automatic Code Optimization Using Deep Reinforcement Learning

Compilers are crucial in optimizing programs and accelerating their execution. However, optimizing programs automatically using compilers is not trivial. Recent work has attempted to use reinforcement learning (RL) to solve this problem. It…

Programming Languages · Computer Science 2025-06-03 Djamel Rassem Lamouri, Iheb Nassim Aouadj, Smail Kourta, Riyadh Baghdadi

FasterPy: An LLM-based Code Execution Efficiency Optimization Framework

Code often suffers from performance bugs. These bugs necessitate the research and practice of code optimization. Traditional rule-based methods rely on manually designing and maintaining rules for specific performance bugs (e.g., redundant…

Software Engineering · Computer Science 2025-12-30 Yue Wu, Minghao Han, Ruiyin Li, Peng Liang, Amjed Tahir, Zengyang Li, Qiong Feng, Mojtaba Shahin

Autocomp: A Powerful and Portable Code Optimizer for Tensor Accelerators

Hardware accelerators, especially those designed for tensor processing, have become ubiquitous in today's computing landscape. However, even with significant efforts in building compilers, programming these tensor accelerators remains…

Programming Languages · Computer Science 2025-11-07 Charles Hong, Sahil Bhatia, Alvin Cheung, Yakun Sophia Shao

Future Trends in the Design of Memetic Algorithms: the Case of the Linear Ordering Problem

The way heuristic optimizers are designed has evolved over the decades, as computing power has increased. Such has been the case for the Linear Ordering Problem (LOP), a field in which trajectory-based strategies led the way during the…

Neural and Evolutionary Computing · Computer Science 2024-10-15 Lázaro Lugo, Carlos Segura, Gara Miranda

Categorization of Program Regions for Agile Compilation using Machine Learning and Hardware Support

A compiler processes the code written in a high level language and produces machine executable code. The compiler writers often face the challenge of keeping the compilation times reasonable. That is because aggressive optimization passes…

Programming Languages · Computer Science 2019-05-30 Sanket Tavarageri

Deploying Customized Data Representation and Approximate Computing in Machine Learning Applications

Major advancements in building general-purpose and customized hardware have been one of the key enablers of versatility and pervasiveness of machine learning models such as deep neural networks. To sustain this ubiquitous deployment of…

Machine Learning · Computer Science 2018-06-05 Mahdi Nazemi, Massoud Pedram