Related papers: Locally-Oriented Programming: A Simple Programming…

200 papers

In this era of diverse and heterogeneous computer architectures, the programmability issues, such as productivity and portable efficiency, are crucial to software development and algorithm design. One way to approach the problem is to step…

Mathematical Software · Computer Science 2012-07-10 Mauro Bianco, Ugo Varetto

Stencils represent a class of computational patterns where an output grid point depends on a fixed shape of neighboring points in an input grid. Stencil computations are prevalent in scientific applications engaging a significant portion of…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-03-24 Jesmin Jahan Tithi, Fabrizio Petrini, Hongbo Rong, Andrei Valentin, Carl Ebeling

Accelerated computing is widely used in high-performance computing. Therefore, it is crucial to experiment and discover how to better utilize the latest generations of GPUs on relevant applications. In this paper, we present results and share…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-08-13 Baodi Shan, Mauricio Araya-Polo

Spatial computing devices have been shown to significantly accelerate stencil computations, but have so far relied on unrolling the iterative dimension of a single stencil operation to increase temporal locality. This work considers the…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-01-12 Johannes de Fine Licht, Andreas Kuster, Tiziano De Matteis, Tal Ben-Nun, Dominic Hofer, Torsten Hoefler

The pervasive adoption of Deep Learning (DL) and Graph Processing (GP) makes it a de facto requirement to build large-scale clusters of heterogeneous accelerators including GPUs and FPGAs. The OpenCL programming framework can be used on the…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-05-19 Yao Chen, Xin Long, Jiong He, Yuhang Chen, Hongshi Tan, Zhenxiang Zhang, Marianne Winslett, Deming Chen

The challenges associated with effectively programming FPGAs have been a major blocker in popularising reconfigurable architectures for HPC workloads. However new compiler technologies, such as MLIR, are providing new capabilities which…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-10-04 Gabriel Rodriguez-Canal, Nick Brown, Maurice Jamieson, Emilien Bauer, Anton Lydike, Tobias Grosser

Traditional deep network training methods optimize a monolithic objective function jointly for all the components. This can lead to various inefficiencies in terms of potential parallelization. Local learning is an approach to…

Machine Learning · Computer Science 2023-01-19 Adeetya Patel, Michael Eickenberg, Eugene Belilovsky

Deep neural networks are a promising solution for applications that solve problems based on learning data sets. DNN accelerators solve the processing bottleneck as a domain-specific processor. Like other hardware solutions, there must be…

Hardware Architecture · Computer Science 2022-11-08 Midia Reshadi, David Gregg

New algorithms and optimization techniques are needed to balance the accelerating trend towards bandwidth-starved multicore chips. It is well known that the performance of stencil codes can be improved by temporal blocking, lessening the…

Performance · Computer Science 2012-03-01 Markus Wittmann, Georg Hager, Gerhard Wellein

We advocate the Loop-of-stencil-reduce pattern as a means of simplifying the implementation of data-parallel programs on heterogeneous multi-core platforms. Loop-of-stencil-reduce is general enough to subsume map, reduce, map-reduce,…

Distributed, Parallel, and Cluster Computing · Computer Science 2016-09-16 M. Aldinucci, M. Danelutto, M. Drocco, P. Kilpatrick, C. Misale, G. Peretti Pezzi, M. Torquati

Finite-difference methods based on high-order stencils are widely used in seismic simulations, weather forecasting, computational fluid dynamics, and other scientific applications. Achieving HPC-level stencil computations on one…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-07-09 Ryuichi Sai, John Mellor-Crummey, Jinfan Xu, Mauricio Araya-Polo

OpenCL is a standard for parallel programming of heterogeneous systems. The benefits of a common programming standard are clear; multiple vendors can provide support for application descriptions written according to the standard, thus…

Distributed, Parallel, and Cluster Computing · Computer Science 2016-11-23 Pekka Jääskeläinen, Carlos Sánchez de La Lama, Erik Schnetter, Kalle Raiskila, Jarmo Takala, Heikki Berg

Recently, several claims have been made that certain fundamental problems of distributed computing, including Leader Election and Distributed Consensus, begin to admit feasible and efficient solutions when the model of distributed…

Quantum Physics · Physics 2009-03-09 Cyril Gavoille, Adrian Kosowski, Marcin Markiewicz

Multiple Kernel Learning, or MKL, extends (kernelized) SVM by attempting to learn not only a classifier/regressor but also the best kernel for the training task, usually from a combination of existing kernel functions. Most MKL methods seek…

Machine Learning · Computer Science 2016-03-07 John Moeller, Sarathkrishna Swaminathan, Suresh Venkatasubramanian

Current architectures are now equipped with matrix computation units designed to enhance AI and high-performance computing applications. Within these architectures, two fundamental instruction types are matrix multiplication and vector…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-03-04 Wenxuan Zhao, Liang Yuan, Baicheng Yan, Penghao Ma, Yunquan Zhang, Long Wang, Zhe Wang

Stencil computations are a key class of applications, widely used in the scientific computing community, and a class that has particularly benefited from performance improvements on architectures with high memory bandwidth. Unfortunately,…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-10-27 Istvan Z Reguly, Gihan R Mudalige, Michael B Giles

We propose a novel structured discriminative block-diagonal dictionary learning method, referred to as scalable Locality-Constrained Projective Dictionary Learning (LC-PDL), for efficient representation and classification. To improve the…

Computer Vision and Pattern Recognition · Computer Science 2019-05-28 Zhao Zhang, Weiming Jiang, Zheng Zhang, Sheng Li, Guangcan Liu, Jie Qin

We propose a localized approach to multiple kernel learning that can be formulated as a convex optimization problem over a given cluster structure, for which we obtain generalization error guarantees and derive an optimization algorithm…

Machine Learning · Computer Science 2016-10-14 Yunwen Lei, Alexander Binder, Ürün Dogan, Marius Kloft

Modern distributed computation infrastructures are often plagued by unavailabilities such as failing or slow servers. These unavailabilities adversely affect the tail latency of computation in distributed infrastructures. The simple…

Information Theory · Computer Science 2020-02-07 Michael Rudow, K. V. Rashmi, Venkatesan Guruswami

The LOCAL model is among the main models for studying locality in the framework of distributed network computing. This model is however subject to pertinent criticisms, including the facts that all nodes wake up simultaneously, perform in…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-12-09 Carole Delporte-Gallet, Hugues Fauconnier, Pierre Fraigniaud, Mikaël Rabie