Related papers: Performance portability through machine learning g…

200 papers

Automated tuning of compute kernels is a popular area of research, mainly focused on finding optimal kernel parameters for a problem with fixed input sizes. This approach is good for deploying machine learning models, where the network…

Machine Learning · Computer Science 2020-03-17 John Lawson

Over recent years heterogeneous systems have become more prevalent across HPC systems, with over 100 supercomputers in the TOP500 incorporating GPUs or other accelerators. These hardware platforms have different performance characteristics…

Performance · Computer Science 2019-04-11 John Lawson , Mehdi Goli , Duncan McBain , Daniel Soutar , Louis Sugy

The performance portability of OpenCL kernel implementations for common memory bandwidth limited linear algebra operations across different hardware generations of the same vendor as well as across vendors is studied. Certain combinations…

Mathematical Software · Computer Science 2022-11-03 Karl Rupp , Philippe Tillet , Florian Rudolf , Josef Weinbub , Tibor Grasser , Ansgar Jüngel

Heterogeneous computing, which combines devices with different architectures, is rising in popularity, and promises increased performance combined with reduced energy consumption. OpenCL has been proposed as a standard for programming such…

Distributed, Parallel, and Cluster Computing · Computer Science 2016-11-15 Thomas L. Falch , Anne C. Elster

The Linux kernel is a huge code base with an enormous number of subsystems and possible configuration options, which results in unmanageable complexity when elaborating an efficient configuration. Machine Learning (ML) is an approach/area of learning…

Machine Learning · Computer Science 2026-03-03 Viacheslav Dubeyko

High-performance computing (HPC) applications are increasingly executed in heterogeneous environments, introducing new challenges for programming and software portability. SYCL has emerged as a leading model designed to simplify…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-04-20 Ami Marowka

Hand-optimizing linear algebra kernels for different GPU devices and applications is complex and labor-intensive. Instead, many developers use automatic performance tuning (autotuning) to achieve high performance on a variety of devices.…

Programming Languages · Computer Science 2025-07-22 Robert Hochgraf , Sreepathi Pai

The high-performance computing (HPC) landscape is undergoing rapid transformation, with an increasing emphasis on energy-efficient and heterogeneous computing environments. This comprehensive study extends our previous research on SYCL's…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-04-15 Manuel Costanzo , Enzo Rucci , Carlos García-Sánchez , Marcelo Naiouf , Manuel Prieto-Matías

Autotuning of performance-relevant source-code parameters allows applications to be tuned automatically without hard-coding optimizations, and thus helps keep performance portable. In this paper, we introduce a benchmark set of ten…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-03-02 Filip Petrovič , David Střelák , Jana Hozzová , Jaroslav Oľha , Richard Trembecký , Siegfried Benkner , Jiří Filipovič

In order to fully utilize "big data", it is often required to use "big models". Such models tend to grow with the complexity and size of the training data, and do not make strong parametric assumptions upfront on the nature of the…

Machine Learning · Statistics 2015-04-17 Vikas Sindhwani , Haim Avron

Kernel methods have proven to be powerful techniques for pattern analysis and machine learning (ML) in a variety of domains. However, many of their original or advanced implementations remain in Matlab. With the incredible rise and adoption…

Machine Learning · Computer Science 2020-05-28 Pradeep Reddy Raamana

Applying machine learning to biological sequences - DNA, RNA and protein - has enormous potential to advance human health, environmental sustainability, and fundamental biological understanding. However, many existing machine learning…

Machine Learning · Statistics 2023-04-11 Alan Nawzad Amin , Eli Nathan Weinstein , Debora Susan Marks

Transfer learning refers to the process of adapting a model trained on a source task to a target task. While kernel methods are conceptually and computationally simple machine learning models that are competitive on a variety of tasks, it…

Machine Learning · Computer Science 2022-11-02 Adityanarayanan Radhakrishnan , Max Ruiz Luyten , Neha Prasad , Caroline Uhler

As the interest in FPGA-based accelerators for HPC applications increases, new challenges also arise, especially concerning different programming and portability issues. This paper aims to provide a snapshot of the current state of the FPGA…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-09-06 Manuel de Castro , Francisco J. Andújar , Roberto R. Osorio , Rocío Carratalá-Sáez , Diego R. Llanos

Multiple Kernel Learning, or MKL, extends (kernelized) SVM by attempting to learn not only a classifier/regressor but also the best kernel for the training task, usually from a combination of existing kernel functions. Most MKL methods seek…

Machine Learning · Computer Science 2016-03-07 John Moeller , Sarathkrishna Swaminathan , Suresh Venkatasubramanian

Optimizing GPU kernels presents a significantly greater challenge for large language models (LLMs) than standard code generation tasks, as it requires understanding hardware architecture, parallel optimization strategies, and performance…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-03-16 Nina Wiedemann , Quentin Leboutet , Michael Paulitsch , Diana Wofk , Benjamin Ummenhofer

The success of kernel-based learning methods depends on the choice of kernel. Recently, kernel learning methods have been proposed that use data to select the most appropriate kernel, usually by combining a set of base kernels. We introduce…

Machine Learning · Computer Science 2011-12-21 Arash Afkanpour , Csaba Szepesvari , Michael Bowling

OpenCL is a standard for parallel programming of heterogeneous systems. The benefits of a common programming standard are clear; multiple vendors can provide support for application descriptions written according to the standard, thus…

Distributed, Parallel, and Cluster Computing · Computer Science 2016-11-23 Pekka Jääskeläinen , Carlos Sánchez de La Lama , Erik Schnetter , Kalle Raiskila , Jarmo Takala , Heikki Berg

High-performance computing (HPC) is a major driver accelerating scientific research and discovery, from quantum simulations to medical therapeutics. While the increasing availability of HPC resources is in many cases pivotal to successful…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-06-28 Vincent R. Pascuzzi , Mehdi Goli

The multiple kernel learning (MKL) method is generally believed to perform better than single-kernel methods. However, some empirical studies show that this is not always true: the combination of multiple kernels may yield an even worse…

Machine Learning · Statistics 2018-06-21 Zhao Kang , Xiao Lu , Jinfeng Yi , Zenglin Xu