Related papers: Automated Parallel Kernel Extraction from Dynamic …

Automatic Parallelization: Executing Sequential Programs on a Task-Based Parallel Runtime

There are billions of lines of sequential code inside today's software that do not benefit from the parallelism available in modern multicore architectures. Automatically parallelizing sequential code, to promote an efficient use of the…

Programming Languages · Computer Science 2016-04-13 Alcides Fonseca, Bruno Cabral, João Rafael, Ivo Correia

Scalable Kernelization for Maximum Independent Sets

The most efficient algorithms for finding maximum independent sets in both theory and practice use reduction rules to obtain a much smaller problem instance called a kernel. The kernel can then be solved quickly using exact or heuristic…

Data Structures and Algorithms · Computer Science 2019-09-11 Demian Hespe, Christian Schulz, Darren Strash

Kernelization of Discrete Optimization Problems on Parallel Architectures

There are existing standard solvers for tackling discrete optimization problems. However, in practice, it is uncommon to apply them directly to the large input space typical of this class of problems. Rather, the input is preprocessed to…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-12-02 Bolarinwa Olayemi Saheed

Parallel Tree Kernel Computation

Tree kernels are fundamental tools that have been leveraged in many applications, particularly those based on machine learning for Natural Language Processing tasks. In this paper, we devise a parallel implementation of the sequential…

Computation and Language · Computer Science 2023-05-16 Souad Taouti, Hadda Cherroun, Djelloul Ziadi

Performance portability through machine learning guided kernel selection in SYCL libraries

Automatically tuning parallel compute kernels allows libraries and frameworks to achieve performance on a wide range of hardware; however, these techniques are typically focused on finding optimal kernel parameters for particular input sizes…

Performance · Computer Science 2020-09-01 John Lawson

Discovering Software Parallelization Points Using Deep Neural Networks

This study proposes a deep learning-based approach for discovering loops in programming code according to their potential for parallelization. Two genetic algorithm-based code generators were developed to produce two distinct types of code:…

Machine Learning · Computer Science 2025-10-03 Izavan dos S. Correia, Henrique C. T. Santos, Tiago A. E. Ferreira

A Benchmark Set of Highly-efficient CUDA and OpenCL Kernels and its Dynamic Autotuning with Kernel Tuning Toolkit

Autotuning of performance-relevant source-code parameters makes it possible to tune applications automatically without hard-coding optimizations, and thus helps keep performance portable. In this paper, we introduce a benchmark set of ten…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-03-02 Filip Petrovič, David Střelák, Jana Hozzová, Jaroslav Oľha, Richard Trembecký, Siegfried Benkner, Jiří Filipovič

Designing and developing tools to automatically identify parallelism

In this work we present a dynamic analysis tool for analyzing regions of code and how those regions depend on each other via data dependencies encountered during the execution of the program. We also present an abstract method to…

Software Engineering · Computer Science 2022-08-08 Fabian Mora Cordero

New efficient algorithms for multiple change-point detection with kernels

Several statistical approaches based on reproducing kernels have been proposed to detect abrupt changes arising in the full distribution of the observations and not only in the mean or variance. Some of these approaches enjoy good…

Statistics Theory · Mathematics 2017-10-13 Alain Celisse, Guillemette Marot, Morgane Pierre-Jean, Guillem Rigaill

Towards automated kernel selection in machine learning systems: A SYCL case study

Automated tuning of compute kernels is a popular area of research, mainly focused on finding optimal kernel parameters for a problem with fixed input sizes. This approach is good for deploying machine learning models, where the network…

Machine Learning · Computer Science 2020-03-17 John Lawson

Performance Analysis of Traditional and Data-Parallel Primitive Implementations of Visualization and Analysis Kernels

Measurements of absolute runtime are useful as a summary of performance when studying parallel visualization and analysis methods on computational platforms of increasing concurrency and complexity. We can obtain even more insights by…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-10-07 E. Wes Bethel, David Camp, Talita Perciano, Colleen Heinemann

Parallelism detection using graph labelling

Usage of multiprocessor and multicore computers implies parallel programming. Tools for preparing parallel programs include parallel languages and libraries as well as parallelizing compilers and convertors that can perform automatic…

Mathematical Software · Computer Science 2022-12-12 Pavel Telegin, Anton Baranov, Boris Shabanov, Artem Tikhomirov

Accelerating Local Search for the Maximum Independent Set Problem

Computing high-quality independent sets quickly is an important problem in combinatorial optimization. Several recent algorithms have shown that kernelization techniques can be used to find exact maximum independent sets in medium-sized…

Data Structures and Algorithms · Computer Science 2016-02-05 Jakob Dahlum, Sebastian Lamm, Peter Sanders, Christian Schulz, Darren Strash, Renato F. Werneck

A fast PC algorithm for high dimensional causal discovery with multi-core PCs

Discovering causal relationships from observational data is a crucial problem and it has applications in many research areas. The PC algorithm is the state-of-the-art constraint based method for causal discovery. However, runtime of the PC…

Artificial Intelligence · Computer Science 2016-11-11 Thuc Duy Le, Tao Hoang, Jiuyong Li, Lin Liu, Huawen Liu

Computing Kernels in Parallel: Lower and Upper Bounds

Parallel fixed-parameter tractability studies how parameterized problems can be solved in parallel. A surprisingly large number of parameterized problems admit a high level of parallelization, but this does not mean that we can also…

Computational Complexity · Computer Science 2018-07-11 Max Bannach, Till Tantau

Temporal Parallelisation of Dynamic Programming and Linear Quadratic Control

This paper proposes a general formulation for temporal parallelisation of dynamic programming for optimal control problems. We derive the elements and associative operators to be able to use parallel scans to solve these problems with…

Optimization and Control · Mathematics 2022-01-25 Simo Särkkä, Ángel F. García-Fernández

Automatic Tracing in Task-Based Runtime Systems

Implicitly parallel task-based runtime systems often perform dynamic analysis to discover dependencies in and extract parallelism from sequential programs. Dependence analysis becomes expensive as task granularity drops below a threshold.…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-12-17 Rohan Yadav, Michael Bauer, David Broman, Michael Garland, Alex Aiken, Fredrik Kjolstad

Parallel Local Graph Clustering

Graph clustering has many important applications in computing, but due to growing sizes of graphs, even traditionally fast clustering methods such as spectral partitioning can be computationally expensive for real-world graphs of interest.…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-06-11 Julian Shun, Farbod Roosta-Khorasani, Kimon Fountoulakis, Michael W. Mahoney

On Improving Deep Learning Trace Analysis with System Call Arguments

Kernel traces are sequences of low-level events comprising a name and multiple arguments, including a timestamp, a process id, and a return value, depending on the event. Their analysis helps uncover intrusions, identify bugs, and find…

Machine Learning · Computer Science 2021-03-15 Quentin Fournier, Daniel Aloise, Seyed Vahid Azhari, François Tetreault

Alignment Based Kernel Learning with a Continuous Set of Base Kernels

The success of kernel-based learning methods depends on the choice of kernel. Recently, kernel learning methods have been proposed that use data to select the most appropriate kernel, usually by combining a set of base kernels. We introduce…

Machine Learning · Computer Science 2011-12-21 Arash Afkanpour, Csaba Szepesvari, Michael Bowling