Programming Languages · Computer Science
Compilation Forking: A Fast and Flexible Way of Generating Data for Compiler-Internal Machine Learning Tasks
Raphael Mosaner, David Leopoldseder, Wolfgang Kisling, Lukas Stadler +1
2022-06-29
Distributed, Parallel, and Cluster Computing · Computer Science
Benchmarking optimization algorithms for auto-tuning GPU kernels
Richard Schoonhoven, Ben van Werkhoven, Kees Joost Batenburg
2022-10-05
Distributed, Parallel, and Cluster Computing · Computer Science
Autotuning GPU Kernels via Static and Predictive Analysis
Robert V. Lim, Boyana Norris, Allen D. Malony
2017-06-30
Machine Learning · Computer Science
Hexcute: A Compiler Framework for Automating Layout Synthesis in GPU Programs
Xiao Zhang, Yaoyao Ding, Bolin Sun, Yang Hu +2
2026-01-30
Distributed, Parallel, and Cluster Computing · Computer Science
A Tool for Automatically Suggesting Source-Code Optimizations for Complex GPU Kernels
Saeed Taheri, Apan Qasem, Martin Burtscher
2019-10-18
Distributed, Parallel, and Cluster Computing · Computer Science
Theoretical Foundations of GPU-Native Compilation for Rapid Code Iteration
Adilet Metinov, Gulida M. Kudakeeva, Gulnara D. Kabaeva
2025-12-15
Distributed, Parallel, and Cluster Computing · Computer Science
Compilation Techniques for Graph Algorithms on GPUs
Ajay Brahmakshatriya, Yunming Zhang, Changwan Hong, Shoaib Kamil +2
2021-01-11
Performance · Computer Science
A Learned Performance Model for Tensor Processing Units
Samuel J. Kaufman, Phitchaya Mangpo Phothilimthana, Yanqi Zhou, Charith Mendis +3
2021-03-19
Distributed, Parallel, and Cluster Computing · Computer Science
Using hardware performance counters to speed up autotuning convergence on GPUs
Jiří Filipovič, Jana Hozzová, Amin Nezarat, Jaroslav Oľha +1
2021-09-20
Machine Learning · Computer Science
Learning Heuristics over Large Graphs via Deep Reinforcement Learning
Sahil Manchanda, Akash Mittal, Anuj Dhawan, Sourav Medya +2
2020-12-04
Programming Languages · Computer Science
GPU accelerated program synthesis: Enumerate semantics, not syntax!
Martin Berger, Nathanaël Fijalkow, Mojtaba Valizadeh
2025-04-29
Distributed, Parallel, and Cluster Computing · Computer Science
High Performance GPU Code Generation for Matrix-Matrix Multiplication using MLIR: Some Early Results
Navdeep Katel, Vivek Khandelwal, Uday Bondhugula
2021-08-31
Machine Learning · Computer Science
Reinforced Genetic Algorithm Learning for Optimizing Computation Graphs
Aditya Paliwal, Felix Gimeno, Vinod Nair, Yujia Li +3
2020-02-11
Machine Learning · Computer Science
Compiler-R1: Towards Agentic Compiler Auto-tuning with Reinforcement Learning
Haolin Pan, Hongyu Lin, Haoran Luo, Yang Liu +4
2025-06-23
Artificial Intelligence · Computer Science
Learning Improvement Heuristics for Solving Routing Problems
Yaoxin Wu, Wen Song, Zhiguang Cao, Jie Zhang +1
2020-05-12
Performance · Computer Science
GPA: A GPU Performance Advisor Based on Instruction Sampling
Keren Zhou, Xiaozhu Meng, Ryuichi Sai, John Mellor-Crummey
2020-11-25
Distributed, Parallel, and Cluster Computing · Computer Science
Auto-tuning Distributed Stream Processing Systems using Reinforcement Learning
Luis M. Vaquero, Felix Cuadrado
2018-09-17