Related papers: Intel nGraph: An Intermediate Representation, Comp…

Deep Graph Library Optimizations for Intel(R) x86 Architecture

The Deep Graph Library (DGL) was designed as a tool to enable structure learning from graphs, by supporting a core abstraction for graphs, including the popular Graph Neural Networks (GNN). DGL contains implementations of all core graph…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-07-14 Sasikanth Avancha, Vasimuddin Md, Sanchit Misra, Ramanarayan Mohanty

Deep Graph Library: A Graph-Centric, Highly-Performant Package for Graph Neural Networks

Advancing research in the emerging field of deep graph learning requires new tools to support tensor computation over graphs. In this paper, we present the design principles and implementation of Deep Graph Library (DGL). DGL distills the…

Machine Learning · Computer Science 2020-08-26 Minjie Wang, Da Zheng, Zihao Ye, Quan Gan, Mufei Li, Xiang Song, Jinjing Zhou, Chao Ma, Lingfan Yu, Yu Gai, Tianjun Xiao, Tong He, George Karypis, Jinyang Li, Zheng Zhang

oneDNN Graph Compiler: A Hybrid Approach for High-Performance Deep Learning Compilation

With the rapid development of deep learning models and hardware support for dense computing, the deep learning workload characteristics changed significantly from a few hot spots on compute-intensive operations to a broad range of…

Machine Learning · Computer Science 2024-03-12 Jianhui Li, Zhennan Qin, Yijie Mei, Jingze Cui, Yunfei Song, Ciyong Chen, Yifei Zhang, Longsheng Du, Xianhang Cheng, Baihui Jin, Yan Zhang, Jason Ye, Eric Lin, Dan Lavery

CNNLab: a Novel Parallel Framework for Neural Networks using GPU and FPGA-a Practical Study with Trade-off Analysis

Designing and implementing efficient, provably correct parallel neural network processing is challenging. Existing high-level parallel abstractions like MapReduce are insufficiently expressive while low-level tools like MPI and Pthreads…

Machine Learning · Computer Science 2016-06-21 Maohua Zhu, Liu Liu, Chao Wang, Yuan Xie

nGraph-HE: A Graph Compiler for Deep Learning on Homomorphically Encrypted Data

Homomorphic encryption (HE), the ability to perform computation on encrypted data, is an attractive remedy to increasing concerns about data privacy in deep learning (DL). However, building DL models that operate on ciphertext is…

Cryptography and Security · Computer Science 2019-04-03 Fabian Boemer, Yixing Lao, Rosario Cammarota, Casimir Wierzynski

DLA: Compiler and FPGA Overlay for Neural Network Inference Acceleration

Overlays have shown significant promise for field-programmable gate-arrays (FPGAs) as they allow for fast development cycles and remove many of the challenges of the traditional FPGA hardware design flow. However, this often comes with a…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-07-18 Mohamed S. Abdelfattah, David Han, Andrew Bitar, Roberto DiCecco, Shane OConnell, Nitika Shanker, Joseph Chu, Ian Prins, Joshua Fender, Andrew C. Ling, Gordon R. Chiu

INR-Arch: A Dataflow Architecture and Compiler for Arbitrary-Order Gradient Computations in Implicit Neural Representation Processing

An increasing number of researchers are finding use for nth-order gradient computations for a wide variety of applications, including graphics, meta-learning (MAML), scientific computing, and most recently, implicit neural representations…

Hardware Architecture · Computer Science 2025-10-27 Stefan Abi-Karam, Rishov Sarkar, Dejia Xu, Zhiwen Fan, Zhangyang Wang, Cong Hao

PowerFusion: A Tensor Compiler with Explicit Data Movement Description and Instruction-level Graph IR

Deep neural networks (DNNs) are of critical use in different domains. To accelerate DNN computation, tensor compilers are proposed to generate efficient code on different domain-specific accelerators. Existing tensor compilers mainly focus…

Machine Learning · Computer Science 2023-07-12 Zixuan Ma, Haojie Wang, Jingze Xing, Liyan Zheng, Chen Zhang, Huanqi Cao, Kezhao Huang, Shizhi Tang, Penghan Wang, Jidong Zhai

Learning to Optimize Tensor Programs

We introduce a learning-based framework to optimize tensor programs for deep learning workloads. Efficient implementations of tensor operators, such as matrix multiplication and high dimensional convolution, are key enablers of effective…

Machine Learning · Computer Science 2019-01-10 Tianqi Chen, Lianmin Zheng, Eddie Yan, Ziheng Jiang, Thierry Moreau, Luis Ceze, Carlos Guestrin, Arvind Krishnamurthy

Dragon: A Computation Graph Virtual Machine Based Deep Learning Framework

Deep learning has made great progress in recent years. However, it is still difficult to master the implementation of various models because different researchers may release their code based on different frameworks or interfaces. In this…

Software Engineering · Computer Science 2017-07-28 Ting Pan

DNNVM : End-to-End Compiler Leveraging Heterogeneous Optimizations on FPGA-based CNN Accelerators

The convolutional neural network (CNN) has become a state-of-the-art method for several artificial intelligence domains in recent years. The increasingly complex CNN models are both computation-bound and I/O-bound. FPGA-based accelerators…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-07-26 Yu Xing, Shuang Liang, Lingzhi Sui, Xijie Jia, Jiantao Qiu, Xin Liu, Yushun Wang, Yu Wang, Yi Shan

Bring Your Own Codegen to Deep Learning Compiler

Deep neural networks (DNNs) have been ubiquitously applied in many applications, and accelerators have emerged as an enabler to support the fast and efficient inference tasks of these applications. However, to achieve high model coverage…

Machine Learning · Computer Science 2021-05-10 Zhi Chen, Cody Hao Yu, Trevor Morris, Jorn Tuyls, Yi-Hsiang Lai, Jared Roesch, Elliott Delaye, Vin Sharma, Yida Wang

Challenges in Migrating Imperative Deep Learning Programs to Graph Execution: An Empirical Study

Efficiency is essential to support responsiveness w.r.t. ever-growing datasets, especially for Deep Learning (DL) systems. DL frameworks have traditionally embraced deferred execution-style DL code that supports symbolic, graph-based Deep…

Software Engineering · Computer Science 2022-07-20 Tatiana Castro Vélez, Raffi Khatchadourian, Mehdi Bagherzadeh, Anita Raja

A Metaprogramming and Autotuning Framework for Deploying Deep Learning Applications

In recent years, deep neural networks (DNNs) have yielded strong results on a wide range of applications. Graphics Processing Units (GPUs) have been one key enabling factor leading to the current popularity of DNNs. However, despite…

Neural and Evolutionary Computing · Computer Science 2016-11-22 Matthew W. Moskewicz, Ali Jannesari, Kurt Keutzer

Scheduling Computation Graphs of Deep Learning Models on Manycore CPUs

For a deep learning model, efficient execution of its computation graph is key to achieving high performance. Previous work has focused on improving the performance for individual nodes of the computation graph, while ignoring the…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-07-26 Linpeng Tang, Yida Wang, Theodore L. Willke, Kai Li

Leveraging Neural Graph Compilers in Machine Learning Research for Edge-Cloud Systems

This work presents a comprehensive evaluation of neural network graph compilers across heterogeneous hardware platforms, addressing the critical gap between theoretical optimization techniques and practical deployment scenarios. We…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-04-30 Alireza Furutanpey, Carmen Walser, Philipp Raith, Pantelis A. Frangoudis, Schahram Dustdar

Nimble: Efficiently Compiling Dynamic Neural Networks for Model Inference

Modern deep neural networks increasingly make use of features such as dynamic control flow, data structures and dynamic tensor shapes. Existing deep learning systems focus on optimizing and executing static neural networks which assume a…

Programming Languages · Computer Science 2021-03-15 Haichen Shen, Jared Roesch, Zhi Chen, Wei Chen, Yong Wu, Mu Li, Vin Sharma, Zachary Tatlock, Yida Wang

Evaluating Deep Graph Neural Networks

Graph Neural Networks (GNNs) have already been widely applied in various graph mining tasks. However, they suffer from the shallow architecture issue, which is the key impediment that hinders the model performance improvement. Although…

Machine Learning · Computer Science 2021-08-03 Wentao Zhang, Zeang Sheng, Yuezihan Jiang, Yikuan Xia, Jun Gao, Zhi Yang, Bin Cui

Using Graph Neural Networks to model the performance of Deep Neural Networks

With the unprecedented proliferation of machine learning software, there is an ever-increasing need to generate efficient code for such applications. State-of-the-art deep-learning compilers like TVM and Halide incorporate a learning-based…

Machine Learning · Computer Science 2021-08-31 Shikhar Singh, Benoit Steiner, James Hegarty, Hugh Leather

ALT: Boosting Deep Learning Performance by Breaking the Wall between Graph and Operator Level Optimizations

Deep learning models rely on highly optimized tensor libraries for efficient inference on heterogeneous hardware. Current deep compilers typically predetermine layouts of tensors and then optimize loops of operators. However, such…

Machine Learning · Computer Science 2022-11-01 Zhiying Xu, Jiafan Xu, Hongding Peng, Wei Wang, Xiaoliang Wang, Haoran Wan, Haipeng Dai, Yixu Xu, Hao Cheng, Kun Wang, Guihai Chen