English
Related papers

Related papers: Ansor: Generating High-Performance Tensor Programs…

200 papers

We introduce a learning-based framework to optimize tensor programs for deep learning workloads. Efficient implementations of tensor operators, such as matrix multiplication and high dimensional convolution, are key enablers of effective…

Machine Learning · Computer Science 2019-01-10 Tianqi Chen , Lianmin Zheng , Eddie Yan , Ziheng Jiang , Thierry Moreau , Luis Ceze , Carlos Guestrin , Arvind Krishnamurthy

High-performance deep learning depends on efficient tensor programs. In recent years, automatic tensor program optimization, also known as tensor compilation, has emerged as the primary approach to generating efficient tensor programs.…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-02-18 Hangda Liu , Boyu Diao , Yu Yang , Wenxin Chen , Xiaohui Peng , Yongjun Xu

Auto-scheduling for tensor programs is a process where a search algorithm automatically explores candidate schedules (program transformations) for a given program on a target hardware platform to improve its performance. However this can be…

Machine Learning · Computer Science 2022-09-08 Perry Gibson , José Cano

Tensor networks provide a powerful framework for compressing multi-dimensional data. The optimal tensor network structure for a given data tensor depends on both data characteristics and specific optimality criteria, making tensor network…

Computational Engineering, Finance, and Science · Computer Science 2026-03-23 Zheng Guo , Aditya Deshpande , Brian Kiedrowski , Xinyu Wang , Alex Gorodetsky

Machine-learning models consist of kernels, which are algorithms applying operations on tensors -- data indexed by a linear combination of natural numbers. Examples of kernels include convolutions, transpositions, and vectorial products.…

Machine Learning · Computer Science 2024-07-16 Michael Canesche , Gaurav Verma , Fernando Magno Quintao Pereira

Accurate hardware performance models are critical to efficient code generation. They can be used by compilers to make heuristic decisions, by superoptimizers as a minimization objective, or by autotuners to find an optimal configuration for…

Tensor methods have become a promising tool to solve high-dimensional problems in the big data era. By exploiting possible low-rank tensor factorization, many high-dimensional model-based or data-driven problems can be solved to facilitate…

Optimization and Control · Mathematics 2019-08-22 Chunfeng Cui , Cole Hawkins , Zheng Zhang

Tensors, or multidimensional arrays, are data structures that can naturally represent visual data of multiple dimensions. Inherently able to efficiently capture structured, latent semantic spaces and high-order interactions, tensors have a…

Computer Vision and Pattern Recognition · Computer Science 2021-07-09 Yannis Panagakis , Jean Kossaifi , Grigorios G. Chrysos , James Oldfield , Mihalis A. Nicolaou , Anima Anandkumar , Stefanos Zafeiriou

Tensor program tuning is a non-convex objective optimization problem, to which search-based approaches have proven to be effective. At the core of the search-based approaches lies the design of the cost model. Though deep learning-based…

Machine Learning · Computer Science 2022-11-23 Yi Zhai , Yu Zhang , Shuo Liu , Xiaomeng Chu , Jie Peng , Jianmin Ji , Yanyong Zhang

Deploying deep learning models on various devices has become an important topic. The wave of hardware specialization brings a diverse set of acceleration primitives for multi-dimensional tensor computations. These new acceleration…

Machine Learning · Computer Science 2022-10-31 Siyuan Feng , Bohan Hou , Hongyi Jin , Wuwei Lin , Junru Shao , Ruihang Lai , Zihao Ye , Lianmin Zheng , Cody Hao Yu , Yong Yu , Tianqi Chen

Tuning tensor program generation involves searching for various possible program transformation combinations for a given program on target hardware to optimize the tensor program execution. It is already a complex process because of the…

Programming Languages · Computer Science 2023-12-29 Gaurav Verma , Siddhisanket Raskar , Zhen Xie , Abid M Malik , Murali Emani , Barbara Chapman

Deep Neural Networks (DNNs) have shown excellent performance in a wide range of machine learning applications. Knowing the latency of running a DNN model or tensor program on a specific device is useful in various tasks, such as DNN graph-…

Machine Learning · Computer Science 2023-11-20 Hanpeng Hu , Junwei Su , Juntao Zhao , Yanghua Peng , Yibo Zhu , Haibin Lin , Chuan Wu

In this paper, we present a novel technique to search for hardware architectures of accelerators optimized for end-to-end training of deep neural networks (DNNs). Our approach addresses both single-device and distributed pipeline and tensor…

Hardware Architecture · Computer Science 2024-04-24 Muhammad Adnan , Amar Phanishayee , Janardhan Kulkarni , Prashant J. Nair , Divya Mahajan

High-order tensor decomposition has been widely adopted to obtain compact deep neural networks for edge deployment. However, existing studies focus primarily on its algorithmic advantages such as accuracy and compression ratio-while…

Hardware Architecture · Computer Science 2025-11-26 Jinsong Zhang , Minghe Li , Jiayi Tian , Jinming Lu , Zheng Zhang

Deep learning models with convolutional and recurrent networks are now ubiquitous and analyze massive amounts of audio, image, video, text and graph data, with applications in automatic translation, speech-to-text, scene understanding,…

Deep learning has enabled major advances in the fields of computer vision, natural language processing, and multimedia among many others. Developing a deep learning system is arduous and complex, as it involves constructing neural network…

Machine Learning · Computer Science 2017-08-04 Hao Dong , Akara Supratak , Luo Mai , Fangde Liu , Axel Oehmichen , Simiao Yu , Yike Guo

Differentiable programming is a fresh programming paradigm which composes parameterized algorithmic components and trains them using automatic differentiation (AD). The concept emerges from deep learning but is not only limited to training…

Strongly Correlated Electrons · Physics 2019-09-11 Hai-Jun Liao , Jin-Guo Liu , Lei Wang , Tao Xiang

With the popularity of deep learning, the hardware implementation platform of deep learning has received increasing interest. Unlike the general purpose devices, e.g., CPU, or GPU, where the deep learning algorithms are executed at the…

Machine Learning · Computer Science 2022-11-22 Lang Feng , Wenjian Liu , Chuliang Guo , Ke Tang , Cheng Zhuo , Zhongfeng Wang

Deep neural networks (DNNs) are of critical use in different domains. To accelerate DNN computation, tensor compilers are proposed to generate efficient code on different domain-specific accelerators. Existing tensor compilers mainly focus…

Machine Learning · Computer Science 2023-07-12 Zixuan Ma , Haojie Wang , Jingze Xing , Liyan Zheng , Chen Zhang , Huanqi Cao , Kezhao Huang , Shizhi Tang , Penghan Wang , Jidong Zhai

Tensor program tuning is essential for the efficient deployment of deep neural networks. Search-based approaches have demonstrated scalability and effectiveness in automatically finding high-performance programs for specific hardware.…

Machine Learning · Computer Science 2025-04-10 Liang Qiao , Jun Shi , Xiaoyu Hao , Xi Fang , Sen Zhang , Minfan Zhao , Ziqi Zhu , Junshi Chen , Hong An , Xulong Tang , Bing Li , Honghui Yuan , Xinyang Wang
‹ Prev 1 2 3 10 Next ›