English
Related papers

Related papers: ACC Saturator: Automatic Kernel Optimization for D…

200 papers

Generating high-performance code for diverse hardware and application domains is challenging. Functional array programming languages with patterns like map and reduce have been successfully combined with term rewriting to define and explore…

Programming Languages · Computer Science 2022-06-06 Thomas Koehler , Phil Trinder , Michel Steuwer

Future computing systems, from handhelds to supercomputers, will undoubtedly be more parallel and heterogeneous than todays systems to provide more performance and energy efficiency. Thus, GPUs are increasingly being used to accelerate…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-10-18 Saeed Taheri , Apan Qasem , Martin Burtscher

Optimizations in a traditional compiler are applied sequentially, with each optimization destructively modifying the program to produce a transformed program that is then passed to the next optimization. We present a new approach for…

Programming Languages · Computer Science 2015-07-01 Ross Tate , Michael Stepp , Zachary Tatlock , Sorin Lerner

Automatic code generation is frequently used to create implementations of algorithms specifically tuned to particular hardware and application parameters. The code generation process involves the selection of adequate code transformations,…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-08-08 Dominik Ernst , Markus Holzer , Georg Hager , Matthias Knorr , Gerhard Wellein

Optimizing the performance of GPU kernels is challenging for both human programmers and code generators. For example, CUDA programmers must set thread and block parameters for a kernel, but might not have the intuition to make a good…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-06-30 Robert V. Lim , Boyana Norris , Allen D. Malony

As the demand for computational power grows, optimizing code through compilers becomes increasingly crucial. In this context, we focus on fully automatic code optimization techniques that automate the process of selecting and applying code…

Programming Languages · Computer Science 2025-11-11 Yacine Hakimi , Riyadh Baghdadi

This work deals with the optimization of computer programs targeting Graphics Processing Units (GPUs). The goal is to lift, from programmers to optimizing compilers, the heavy burden of determining program details that are dependent on the…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-01-16 Xiaohui Chen , Marc Moreno-Maza , Jeeva Paudel , Ning Xie

GPU code optimization is a key performance bottleneck for HPC workloads as well as large-model training and inference. Although compiler optimizations and hand-written kernels can partially alleviate this issue, achieving…

Computation and Language · Computer Science 2026-01-26 Qiuyi Qu , Yicheng Sui , Yufei Sun , Rui Chen , Xiaofei Zhang , Yuzhi Zhang , Haofeng Wang , Ge Lan

Recent years have witnessed phenomenal growth in the application, and capabilities of Graphical Processing Units (GPUs) due to their high parallel computation power at relatively low cost. However, writing a computationally efficient GPU…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-10-05 Richard Schoonhoven , Ben van Werkhoven , Kees Joost Batenburg

One of the major optimizations employed in deep learning frameworks is graph rewriting. Production frameworks rely on heuristics to decide if rewrite rules should be applied and in which order. Prior research has shown that one can discover…

Artificial Intelligence · Computer Science 2021-03-18 Yichen Yang , Phitchaya Mangpo Phothilimtha , Yisu Remy Wang , Max Willsey , Sudip Roy , Jacques Pienaar

This paper proposes an adaptive neural-compilation framework to address the problem of efficient program learning. Traditional code optimisation strategies used in compilers are based on applying pre-specified set of transformations that…

Artificial Intelligence · Computer Science 2016-05-27 Rudy Bunel , Alban Desmaison , Pushmeet Kohli , Philip H. S. Torr , M. Pawan Kumar

We implement a quantum optimal control algorithm based on automatic differentiation and harness the acceleration afforded by graphics processing units (GPUs). Automatic differentiation allows us to specify advanced optimization criteria and…

Quantum Physics · Physics 2017-04-19 Nelson Leung , Mohamed Abdelhafez , Jens Koch , David I. Schuster

Generation of optimal codes is a well known problem in coding theory. Many computational approaches exist in the literature for finding record breaking codes. However generating codes with long lengths $n$ using serial algorithms is…

Information Theory · Computer Science 2015-07-21 Srajan Paliwal , Saurabh Tiwary , Bhaskar Chaudhury , Manish K. Gupta

Discrete optimization is a central problem in artificial intelligence. The optimization of the aggregated cost of a network of cost functions arises in a variety of problems including (W)CSP, DCOP, as well as optimization in stochastic…

Artificial Intelligence · Computer Science 2018-01-12 Ferdinando Fioretto , Enrico Pontelli , William Yeoh , Rina Dechter

As computing system become more complex, it is becoming harder for programmers to keep their codes optimized as the hardware gets updated. Autotuners try to alleviate this by hiding as many architecture-based optimization details as…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-03-17 Jacob O. Tørring , Ben van Werkhoven , Filip Petrovic , Floris-Jan Willemsen , Jirí Filipovic , Anne C. Elster

Optimizing the performance of computational fluid dynamics (CFD) applications accelerated by graphics processing units (GPUs) is crucial for efficient simulations. In this study, we employed a machine learning-based autotuning technique to…

Performance · Computer Science 2024-02-21 Weicheng Xue , Christohper John Roy

In this submission, we explore the use of equality saturation to optimize concurrent computations. A concurrent environment gives rise to new optimization opportunities, like extracting a common concurrent subcomputation. To our knowledge,…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-08-15 Henrich Lauko , Lukáš Korenčik , Peter Goodman

Achieving high performance for GPU codes requires developers to have significant knowledge in parallel programming and GPU architectures, and in-depth understanding of the application. This combination makes it challenging to find…

Software Engineering · Computer Science 2022-08-29 Jhe-Yu Liou , Muaaz Awan , Steven Hofmeyr , Stephanie Forrest , Carole-Jean Wu

We introduce a code generator that converts unoptimized C++ code operating on sparse data into vectorized and parallel CPU or GPU kernels. Our approach unrolls the computation into a massive expression graph, performs redundant expression…

Programming Languages · Computer Science 2022-03-15 Philipp Herholz , Xuan Tang , Teseo Schneider , Shoaib Kamil , Daniele Panozzo , Olga Sorkine-Hornung

Compilers are indispensable for transforming code written in high-level languages into performant machine code, but their general-purpose optimizations sometimes fall short. Domain experts might be aware of certain optimizations that the…

Programming Languages · Computer Science 2025-07-15 Jules Merckx , Tim Besard , Bjorn De Sutter
‹ Prev 1 2 3 10 Next ›