Related papers: PolyScientist: Automatic Loop Transformations Comb…

200 papers

Deep Neural Networks (DNNs) have revolutionized many aspects of our lives. The use of DNNs is becoming ubiquitous, including in software for image recognition, speech recognition, speech synthesis, and language translation, to name a few. he…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-11-18 Sanket Tavarageri, Alexander Heinecke, Sasikanth Avancha, Gagandeep Goyal, Ramakrishna Upadrasta, Bharat Kaul

During the past decade, Deep Learning (DL) algorithms, programming systems and hardware have converged with the High Performance Computing (HPC) counterparts. Nevertheless, the programming methodology of DL and HPC systems is stagnant,…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-03-19 Evangelos Georganas, Dhiraj Kalamkar, Kirill Voronin, Abhisek Kundu, Antonio Noack, Hans Pabst, Alexander Breuer, Alexander Heinecke

Creating high-performance implementations of deep learning primitives on CPUs is a challenging task. Multiple considerations, including the multi-level cache hierarchy and the wide SIMD units of CPU platforms, influence the choice of program…

Programming Languages · Computer Science 2021-04-13 Sanket Tavarageri, Gagandeep Goyal, Sasikanth Avancha, Bharat Kaul, Ramakrishna Upadrasta

A commonly occurring computation idiom in neural networks is to perform some pointwise operations on the result of a matrix multiplication. Such a sequence of operations is typically represented as a computation graph in deep learning…

Programming Languages · Computer Science 2020-08-04 Somashekaracharya G. Bhaskaracharya, Julien Demouth, Vinod Grover

We present a library of efficient implementations of deep learning primitives. Deep learning workloads are computationally intensive, and optimizing their kernels is difficult and time-consuming. As parallel architectures evolve, kernels…

Neural and Evolutionary Computing · Computer Science 2014-12-19 Sharan Chetlur, Cliff Woolley, Philippe Vandermersch, Jonathan Cohen, John Tran, Bryan Catanzaro, Evan Shelhamer

Deep neural networks have recently achieved state-of-the-art performance thanks to new training algorithms for rapid parameter estimation and new regularization methods to reduce overfitting. However, in practice the network architecture…

Machine Learning · Computer Science 2016-03-04 Minyoung Kim, Luca Rigazio

Deep learning methods have predominantly been applied to large artificial neural networks. Despite their state-of-the-art performance, these large networks typically do not generalize well to datasets with limited sample sizes. In this…

Machine Learning · Statistics 2016-11-17 Eric Strobl, Shyam Visweswaran

The rapidly evolving landscape of AI and machine learning workloads has widened the gap between high-level domain operations and efficient hardware utilization. Achieving near-peak performance still demands deep hardware expertise-experts…

Machine Learning · Computer Science 2025-11-19 Arun Thangamani, Md Asghar Ahmad Shahid, Adam Siemieniuk, Rolf Morel, Renato Golin, Alexander Heinecke

Deep learning (DL) is one of the most prominent branches of machine learning. Due to the immense computational cost of DL workloads, industry and academia have developed DL libraries with highly-specialized kernels for each…

Deep kernel learning aims at designing nonlinear combinations of multiple standard elementary kernels by training deep networks. This scheme has proven to be effective but intractable when handling large-scale datasets, especially when the…

Computer Vision and Pattern Recognition · Computer Science 2018-05-01 Mingyuan Jiu, Hichem Sahbi

Deploying deep learning models on various devices has become an important topic. The wave of hardware specialization brings a diverse set of acceleration primitives for multi-dimensional tensor computations. These new acceleration…

Machine Learning · Computer Science 2022-10-31 Siyuan Feng, Bohan Hou, Hongyi Jin, Wuwei Lin, Junru Shao, Ruihang Lai, Zihao Ye, Lianmin Zheng, Cody Hao Yu, Yong Yu, Tianqi Chen

In this paper, we demonstrate a compiler that can optimize sparse and recurrent neural networks, both of which are currently outside the scope of existing neural network compilers (sparse neural networks here stand for networks that can…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-05-11 Riyadh Baghdadi, Abdelkader Nadir Debbagh, Kamel Abdous, Fatima Zohra Benhamida, Alex Renda, Jonathan Elliott Frankle, Michael Carbin, Saman Amarasinghe

Automated tuning of compute kernels is a popular area of research, mainly focused on finding optimal kernel parameters for a problem with fixed input sizes. This approach is good for deploying machine learning models, where the network…

Machine Learning · Computer Science 2020-03-17 John Lawson

State-of-the-art deep learning models have made steady progress in the fields of computer vision and natural language processing, at the expense of growing model sizes and computational complexity. Deploying these models on low power and…

Machine Learning · Computer Science 2018-10-29 Meghan Cowan, Thierry Moreau, Tianqi Chen, Luis Ceze

Optimizing deep learning models is generally performed in two steps: (i) high-level graph optimizations such as kernel fusion and (ii) low-level kernel optimizations such as those found in vendor libraries. This approach often leaves…

Machine Learning · Computer Science 2021-03-08 Pratik Fegade, Tianqi Chen, Phillip B. Gibbons, Todd C. Mowry

In this paper, we present work in progress on a deep-learning-based approach for automatic code optimization in polyhedral compilers. The proposed technique explores combinations of affine and non-affine loop transformations to find…

McKernel introduces a framework to use kernel approximates in the mini-batch setting with Stochastic Gradient Descent (SGD) as an alternative to Deep Learning. Based on Random Kitchen Sinks [Rahimi and Recht 2007], we provide a C++ library…

Machine Learning · Computer Science 2020-04-20 J. D. Curtó, I. C. Zarza, Feng Yang, Alex Smola, Fernando de la Torre, Chong Wah Ngo, Luc van Gool

Kernel fusion is a popular and effective approach for combining multiple features that characterize different aspects of data. Traditional approaches for Multiple Kernel Learning (MKL) attempt to learn the parameters for combining the…

Advances in high-throughput technologies have originated an ever-increasing availability of omics datasets. The integration of multiple heterogeneous data sources is currently an issue for biology and bioinformatics. Multiple kernel…

Machine Learning · Statistics 2024-12-04 Mitja Briscik, Gabriele Tazza, Marie-Agnes Dillies, László Vidács, Sébastien Dejean

Multiple Kernel Learning, or MKL, extends (kernelized) SVM by attempting to learn not only a classifier/regressor but also the best kernel for the training task, usually from a combination of existing kernel functions. Most MKL methods seek…

Machine Learning · Computer Science 2016-03-07 John Moeller, Sarathkrishna Swaminathan, Suresh Venkatasubramanian