Related papers: Optimizing Tensor Programs on Flexible Storage
Sparse tensors are prevalent in many data-intensive applications, yet existing differentiable programming frameworks are tailored towards dense tensors. This presents a significant challenge for efficiently computing gradients through…
This paper shows how to generate code that efficiently converts sparse tensors between disparate storage formats (data layouts) such as CSR, DIA, ELL, and many others. We decompose sparse tensor conversion into three logical phases:…
Tensors are a natural way to express correlations among many physical variables, but storing tensors in a computer naively requires memory which scales exponentially in the rank of the tensor. This is not optimal, as the required memory is…
Tensor networks provide a powerful framework for compressing multi-dimensional data. The optimal tensor network structure for a given data tensor depends on both data characteristics and specific optimality criteria, making tensor network…
Tensor decompositions such as the canonical format and the tensor train format have been widely utilized to reduce storage costs and operational complexities for high-dimensional data, achieving linear scaling with the input dimension…
Researchers are increasingly incorporating numeric high-order data, i.e., numeric tensors, within their practice. Just like the matrix/vector (MV) paradigm, the development of multi-purpose, but high-performance, sparse data structures and…
State-of-the-art models are now trained with billions of parameters, reaching hardware limits in terms of memory consumption. This has created a recent demand for memory-efficient optimizers. To this end, we investigate the limits and…
This paper shows how to optimize sparse tensor algebraic expressions by introducing temporary tensors, called workspaces, into the resulting loop nests. We develop a new intermediate language for tensor operations called concrete index…
From FORTRAN to NumPy, tensors have revolutionized how we express computation. However, tensors in these, and almost all prominent systems, can only handle dense rectilinear integer grids. Real-world tensors often contain underlying…
Symmetric tensor operations arise in a wide variety of computations. However, the benefits of exploiting symmetry in order to reduce storage and computation are in conflict with a desire to simplify memory access patterns. In this paper, we…
Automatic optimization for tensor programs becomes increasingly important as we deploy deep learning in various environments, and efficient optimization relies on a rich search space and effective search. Most existing efforts adopt a…
Sparsity, which occurs in both scientific applications and Deep Learning (DL) models, has been a key target of optimization within recent ASIC accelerators due to the potential memory and compute savings. These applications use data stored…
The ongoing trend of hardware specialization has led to a growing use of custom data formats when processing sparse workloads, which are typically memory-bound. These formats facilitate optimized software/hardware implementations by…
Tensor algebra lies at the core of computational science and machine learning. Due to its high usage, entire libraries exist dedicated to improving its performance. Conventional tensor algebra performance boosts focus on algorithmic…
This paper shows how to generate efficient tensor algebra code that computes on dynamic sparse tensors, which have sparsity structures that evolve over time. We propose a language for precisely specifying recursive, pointer-based data…
Tensors provide a robust framework for managing high-dimensional data. Consequently, tensor analysis has emerged as an active research area in various domains, including machine learning, signal processing, computer vision, graph analysis,…
We introduce a learning-based framework to optimize tensor programs for deep learning workloads. Efficient implementations of tensor operators, such as matrix multiplication and high dimensional convolution, are key enablers of effective…
High-performance tensor programs are crucial to guarantee efficient execution of deep neural networks. However, obtaining performant tensor programs for different operators on various hardware platforms is notoriously challenging.…
We present an incremental, scalable and efficient dimension reduction technique for tensors that is based on sparse random linear coding. Data is stored in a compactified representation with fixed size, which makes memory requirements low…
Many real-world datasets are represented as tensors, i.e., multi-dimensional arrays of numerical values. Storing them without compression often requires substantial space, which grows exponentially with the order. While many tensor…