Related papers: TensorIR: An Abstraction for Automatic Tensorized …

Gensor: A Graph-based Construction Tensor Compilation Method for Deep Learning

High-performance deep learning depends on efficient tensor programs. In recent years, automatic tensor program optimization, also known as tensor compilation, has emerged as the primary approach to generating efficient tensor programs.…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-02-18 Hangda Liu, Boyu Diao, Yu Yang, Wenxin Chen, Xiaohui Peng, Yongjun Xu

Tenspiler: A Verified Lifting-Based Compiler for Tensor Operations (Extended Version)

Tensor processing infrastructures such as deep learning frameworks and specialized hardware accelerators have revolutionized how computationally intensive code from domains such as deep learning and image processing is executed and…

Programming Languages · Computer Science 2024-12-17 Jie Qiu, Colin Cai, Sahil Bhatia, Niranjan Hasabnis, Sanjit A. Seshia, Alvin Cheung

Learning to Optimize Tensor Programs

We introduce a learning-based framework to optimize tensor programs for deep learning workloads. Efficient implementations of tensor operators, such as matrix multiplication and high dimensional convolution, are key enablers of effective…

Machine Learning · Computer Science 2019-01-10 Tianqi Chen, Lianmin Zheng, Eddie Yan, Ziheng Jiang, Thierry Moreau, Luis Ceze, Carlos Guestrin, Arvind Krishnamurthy

The ITensor Software Library for Tensor Network Calculations

ITensor is a system for programming tensor network calculations with an interface modeled on tensor diagram notation, which allows users to focus on the connectivity of a tensor network without manually bookkeeping tensor indices. The…

Mathematical Software · Computer Science 2023-03-07 Matthew Fishman, Steven R. White, E. Miles Stoudenmire

Ansor: Generating High-Performance Tensor Programs for Deep Learning

High-performance tensor programs are crucial to guarantee efficient execution of deep neural networks. However, obtaining performant tensor programs for different operators on various hardware platforms is notoriously challenging.…

Machine Learning · Computer Science 2023-10-17 Lianmin Zheng, Chengfan Jia, Minmin Sun, Zhao Wu, Cody Hao Yu, Ameer Haj-Ali, Yida Wang, Jun Yang, Danyang Zhuo, Koushik Sen, Joseph E. Gonzalez, Ion Stoica

SparseTIR: Composable Abstractions for Sparse Compilation in Deep Learning

Sparse tensors are rapidly becoming critical components of modern deep learning workloads. However, developing high-performance sparse operators can be difficult and tedious, and existing vendor libraries cannot satisfy the escalating…

Machine Learning · Computer Science 2023-02-22 Zihao Ye, Ruihang Lai, Junru Shao, Tianqi Chen, Luis Ceze

SparseLNR: Accelerating Sparse Tensor Computations Using Loop Nest Restructuring

Sparse tensor algebra computations have become important in many real-world applications like machine learning, scientific simulations, and data mining. Hence, automated code generation and performance optimizations for tensor algebra…

Programming Languages · Computer Science 2022-05-25 Adhitha Dias, Kirshanthan Sundararajah, Charitha Saumya, Milind Kulkarni

Tensor Processing Primitives: A Programming Abstraction for Efficiency and Portability in Deep Learning & HPC Workloads

During the past decade, novel Deep Learning (DL) algorithms, workloads and hardware have been developed to tackle a wide range of problems. Despite the advances in workload and hardware ecosystems, the programming methodology of DL systems…

Artificial Intelligence · Computer Science 2021-12-02 Evangelos Georganas, Dhiraj Kalamkar, Sasikanth Avancha, Menachem Adelman, Deepti Aggarwal, Cristina Anderson, Alexander Breuer, Jeremy Bruestle, Narendra Chaudhary, Abhisek Kundu, Denise Kutnick, Frank Laub, Vasimuddin Md, Sanchit Misra, Ramanarayan Mohanty, Hans Pabst, Brian Retford, Barukh Ziv, Alexander Heinecke

PowerFusion: A Tensor Compiler with Explicit Data Movement Description and Instruction-level Graph IR

Deep neural networks (DNNs) are of critical use in different domains. To accelerate DNN computation, tensor compilers are proposed to generate efficient code on different domain-specific accelerators. Existing tensor compilers mainly focus…

Machine Learning · Computer Science 2023-07-12 Zixuan Ma, Haojie Wang, Jingze Xing, Liyan Zheng, Chen Zhang, Huanqi Cao, Kezhao Huang, Shizhi Tang, Penghan Wang, Jidong Zhai

ACT: Automatically Generating Compiler Backends from Tensor Accelerator ISA Descriptions

Tensor compilers play a key role in enabling high-performance implementations of deep learning workloads. These compilers rely on existing CPU and GPU code generation backends to generate device-specific code. Recently, many tensor…

Programming Languages · Computer Science 2025-10-14 Devansh Jain, Akash Pardeshi, Marco Frigo, Krut Patel, Kaustubh Khulbe, Jai Arora, Charith Mendis

TPU-MLIR: A Compiler For TPU Using MLIR

Multi-level intermediate representations (MLIR) show great promise for reducing the cost of building domain-specific compilers by providing a reusable and extensible compiler infrastructure. This work presents TPU-MLIR, an end-to-end…

Programming Languages · Computer Science 2023-02-10 Pengchao Hu, Man Lu, Lei Wang, Guoyue Jiang

AI Powered Compiler Techniques for DL Code Optimization

Creating high performance implementations of deep learning primitives on CPUs is a challenging task. Multiple considerations including multi-level cache hierarchy, and wide SIMD units of CPU platforms influence the choice of program…

Programming Languages · Computer Science 2021-04-13 Sanket Tavarageri, Gagandeep Goyal, Sasikanth Avancha, Bharat Kaul, Ramakrishna Upadrasta

oneDNN Graph Compiler: A Hybrid Approach for High-Performance Deep Learning Compilation

With the rapid development of deep learning models and hardware support for dense computing, the deep learning workload characteristics changed significantly from a few hot spots on compute-intensive operations to a broad range of…

Machine Learning · Computer Science 2024-03-12 Jianhui Li, Zhennan Qin, Yijie Mei, Jingze Cui, Yunfei Song, Ciyong Chen, Yifei Zhang, Longsheng Du, Xianhang Cheng, Baihui Jin, Yan Zhang, Jason Ye, Eric Lin, Dan Lavery

Pushing Tensor Accelerators Beyond MatMul in a User-Schedulable Language

Tensor accelerators now represent a growing share of compute resources in modern CPUs and GPUs. However, they are hard to program, leading developers to use vendor-provided kernel libraries that support tensor accelerators. As a result, the…

Programming Languages · Computer Science 2026-02-12 Yihong Zhang, Derek Gerstmann, Andrew Adams, Maaz Bin Safeer Ahmad

Compiler Support for Sparse Tensor Computations in MLIR

Sparse tensors arise in problems in science, engineering, machine learning, and data analytics. Programs that operate on such tensors can exploit sparsity to reduce storage requirements and computational time. Developing and maintaining…

Programming Languages · Computer Science 2022-09-20 Aart J. C. Bik, Penporn Koanantakool, Tatiana Shpeisman, Nicolas Vasilache, Bixia Zheng, Fredrik Kjolstad

tenSVD algorithm for compression

Tensors provide a robust framework for managing high-dimensional data. Consequently, tensor analysis has emerged as an active research area in various domains, including machine learning, signal processing, computer vision, graph analysis,…

Computation · Statistics 2025-10-01 Michele Gallo

Harnessing Deep Learning and HPC Kernels via High-Level Loop and Tensor Abstractions on CPU Architectures

During the past decade, Deep Learning (DL) algorithms, programming systems and hardware have converged with the High Performance Computing (HPC) counterparts. Nevertheless, the programming methodology of DL and HPC systems is stagnant,…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-03-19 Evangelos Georganas, Dhiraj Kalamkar, Kirill Voronin, Abhisek Kundu, Antonio Noack, Hans Pabst, Alexander Breuer, Alexander Heinecke

Tensor Methods for Generating Compact Uncertainty Quantification and Deep Learning Models

Tensor methods have become a promising tool to solve high-dimensional problems in the big data era. By exploiting possible low-rank tensor factorization, many high-dimensional model-based or data-driven problems can be solved to facilitate…

Optimization and Control · Mathematics 2019-08-22 Chunfeng Cui, Cole Hawkins, Zheng Zhang

UNIT: Unifying Tensorized Instruction Compilation

Because of the increasing demand for computation in DNN, researchers develop both hardware and software mechanisms to reduce the compute and memory burden. A widely adopted approach is to use mixed precision data types. However, it is hard…

Programming Languages · Computer Science 2021-03-30 Jian Weng, Animesh Jain, Jie Wang, Leyuan Wang, Yida Wang, Tony Nowatzki

PolyScientist: Automatic Loop Transformations Combined with Microkernels for Optimization of Deep Learning Primitives

At the heart of deep learning training and inferencing are computationally intensive primitives such as convolutions which form the building blocks of deep neural networks. Researchers have taken two distinct approaches to creating high…

Programming Languages · Computer Science 2020-02-07 Sanket Tavarageri, Alexander Heinecke, Sasikanth Avancha, Gagandeep Goyal, Ramakrishna Upadrasta, Bharat Kaul