Related papers: TensorLib: A Spatial Accelerator Generation Framew…

200 papers

Modern tensor applications, especially foundation models and generative AI applications, require multiple input modalities (both vision and language), which increases the demand for flexible accelerator architecture. Existing frameworks…

Hardware Architecture · Computer Science 2025-09-16 Yujun Lin, Zhekai Zhang, Song Han

High-order tensor decomposition has been widely adopted to obtain compact deep neural networks for edge deployment. However, existing studies focus primarily on its algorithmic advantages such as accuracy and compression ratio, while…

Hardware Architecture · Computer Science 2025-11-26 Jinsong Zhang, Minghe Li, Jiayi Tian, Jinming Lu, Zheng Zhang

In recent years, many accelerators have been proposed to efficiently process sparse tensor algebra applications (e.g., sparse neural networks). However, these proposals are single points in a large and diverse design space. The lack of…

Hardware Architecture · Computer Science 2023-01-11 Yannan Nellie Wu, Po-An Tsai, Angshuman Parashar, Vivienne Sze, Joel S. Emer

Tensor algebra lies at the core of computational science and machine learning. Due to its high usage, entire libraries exist dedicated to improving its performance. Conventional tensor algebra performance boosts focus on algorithmic…

Programming Languages · Computer Science 2022-08-16 Sathvik Redrouthu, Rishi Athavale

Numerical tensor calculus comprises basic tensor operations such as the entrywise addition and contraction of higher-order tensors. We present TLib, a flexible tensor framework with generic tensor functions and tensor classes that assists…

Mathematical Software · Computer Science 2017-11-30 Cem Bassoy

Recently, numerous sparse hardware accelerators for Deep Neural Networks (DNNs), Graph Neural Networks (GNNs), and scientific computing applications have been proposed. A common characteristic among all of these accelerators is that they…

Recently, tensor algebra has witnessed significant applications across various domains. Each operator in tensor algebra features a different computational workload and precision. However, current general accelerators, such as VPU, GPGPU, and…

Hardware Architecture · Computer Science 2024-05-06 Chenyang Ai, Lechuan Zhao, Zhijie Huang, Cangyuan Li, Xinan Wang, Ying Wang

Tensor accelerators now represent a growing share of compute resources in modern CPUs and GPUs. However, they are hard to program, leading developers to use vendor-provided kernel libraries that support tensor accelerators. As a result, the…

Programming Languages · Computer Science 2026-02-12 Yihong Zhang, Derek Gerstmann, Andrew Adams, Maaz Bin Safeer Ahmad

High-performance deep learning depends on efficient tensor programs. In recent years, automatic tensor program optimization, also known as tensor compilation, has emerged as the primary approach to generating efficient tensor programs.…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-02-18 Hangda Liu, Boyu Diao, Yu Yang, Wenxin Chen, Xiaohui Peng, Yongjun Xu

Sparse tensor algebra computations have become important in many real-world applications like machine learning, scientific simulations, and data mining. Hence, automated code generation and performance optimizations for tensor algebra…

Programming Languages · Computer Science 2022-05-25 Adhitha Dias, Kirshanthan Sundararajah, Charitha Saumya, Milind Kulkarni

Recent years have seen considerable work on compiling sparse tensor algebra expressions. This paper addresses a shortcoming in that work, namely how to generate efficient code (in time and space) that scatters values into a sparse result…

Programming Languages · Computer Science 2024-04-09 Genghan Zhang, Olivia Hsu, Fredrik Kjolstad

Efficient execution of deep learning workloads on dataflow architectures is crucial for overcoming memory bottlenecks and maximizing performance. While streaming intermediate results between computation kernels can significantly improve…

Hardware Architecture · Computer Science 2025-09-24 Hanchen Ye, Deming Chen

Tensor processing infrastructures such as deep learning frameworks and specialized hardware accelerators have revolutionized how computationally intensive code from domains such as deep learning and image processing is executed and…

Programming Languages · Computer Science 2024-12-17 Jie Qiu, Colin Cai, Sahil Bhatia, Niranjan Hasabnis, Sanjit A. Seshia, Alvin Cheung

High-performance tensor programs are crucial to guarantee efficient execution of deep neural networks. However, obtaining performant tensor programs for different operators on various hardware platforms is notoriously challenging.…

We introduce a learning-based framework to optimize tensor programs for deep learning workloads. Efficient implementations of tensor operators, such as matrix multiplication and high dimensional convolution, are key enablers of effective…

Machine Learning · Computer Science 2019-01-10 Tianqi Chen, Lianmin Zheng, Eddie Yan, Ziheng Jiang, Thierry Moreau, Luis Ceze, Carlos Guestrin, Arvind Krishnamurthy

Tensor algebra is a crucial component for data-intensive workloads such as machine learning and scientific computing. As the complexity of data grows, scientists often encounter a dilemma between the highly specialized dense tensor algebra…

Programming Languages · Computer Science 2024-07-19 Mahdi Ghorbani, Emilien Bauer, Tobias Grosser, Amir Shaikhha

Dense and sparse tensors allow the representation of most bulk data structures in computational science applications. We show that sparse tensor algebra can also be used to express many of the transformations on these datasets, especially…

Mathematical Software · Computer Science 2015-12-02 Edgar Solomonik, Torsten Hoefler

Over the past few years, the explosion in sparse tensor algebra workloads has led to a corresponding rise in domain-specific accelerators to service them. Due to the irregularity present in sparse tensors, these accelerators employ a wide…

Hardware Architecture · Computer Science 2024-06-13 Nandeeka Nayak, Toluwanimi O. Odemuyiwa, Shubham Ugare, Christopher W. Fletcher, Michael Pellauer, Joel S. Emer

Spatial dataflow accelerators are a promising direction for next-generation computer systems because they can reduce the memory bottlenecks of traditional von Neumann machines such as CPUs and GPUs. They organize computation around…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-05-13 Wei Li, Zhenyu Bai, Heru Wang, Pranav Dangi, Zhiqiang Zhang, Cheng Tan, Huiying Lan, Weng-Fai Wong, Tulika Mitra

This paper shows how to generate efficient tensor algebra code that computes on dynamic sparse tensors, which have sparsity structures that evolve over time. We propose a language for precisely specifying recursive, pointer-based data…

Mathematical Software · Computer Science 2021-12-03 Stephen Chou, Saman Amarasinghe