Related papers: Multicore Dynamic Kernel Modules Attachment Techni…
Kernel fusion is a popular and effective approach for combining multiple features that characterize different aspects of data. Traditional approaches for Multiple Kernel Learning (MKL) attempt to learn the parameters for combining the…
Multiple kernel methods based on k-means aims to integrate a group of kernels to improve the performance of kernel k-means clustering. However, we observe that most existing multiple kernel k-means methods exploit the nonlinear relationship…
Current computational systems are heterogeneous by nature, featuring a combination of CPUs and GPUs. As the latter are becoming an established platform for high-performance computing, the focus is shifting towards the seamless programming…
With the advent of hundreds of cores on a chip to accelerate applications, the operating system (OS) needs to exploit the existing parallelism provided by the underlying hardware resources to determine the right amount of processes to be…
Monolithic operating systems, where all kernel functionality resides in a single, shared address space, are the foundation of most mainstream computer systems. However, a single flaw, even in a non-essential part of the kernel (e.g., device…
Modern high-performance computing architectures (Multicore, GPU, Manycore) are based on tightly-coupled clusters of processing elements, physically implemented as rectangular tiles. Their size and aspect ratio strongly impact the achievable…
There are increasing number of works addressing the design challenges of fast, scalable solutions for the growing number of new type of applications. Recently, many of the solutions aimed at improving processing element capabilities to…
Any architecture for practical quantum computing must be scalable. An attractive approach is to create multiple cores, computing regions of fixed size that are well-spaced but interlinked with communication channels. This exploded…
Dynamic convolution enhances model capacity by adaptively combining multiple kernels, yet faces critical trade-offs: prior works either (1) incur significant parameter overhead by scaling kernel numbers linearly, (2) compromise inference…
Emerging processor architectures such as GPUs and Intel MICs provide a huge performance potential for high performance computing. However developing software using these hardware accelerators introduces additional challenges for the…
We discuss an implementation of adaptive fast multipole methods targeting hybrid multicore CPU- and GPU-systems. From previous experiences with the computational profile of our version of the fast multipole algorithm, suitable parts are…
Multiple kernel learning (MKL) method is generally believed to perform better than single kernel method. However, some empirical studies show that this is not always true: the combination of multiple kernels may even yield an even worse…
Microprocessor roadmaps clearly show a trend towards multiple core CPUs. Modern operating systems already make use of these CPU architectures by distributing tasks between processing cores thereby increasing system performance. This review…
Smart devices see a large number of ephemeral tasks driven by background activities. In order to execute such a task, the OS kernel wakes up the platform beforehand and puts it back to sleep afterwards. In doing so, the kernel operates…
The latest trends in high-performance computing systems show an increasing demand on the use of a large scale multicore systems in a efficient way, so that high compute-intensive applications can be executed reasonably well. However, the…
Software-controlled heterogeneous memory systems have the potential to improve performance, efficiency, and cost tradeoffs in emerging systems. Delivering on this promise requires an efficient operating system (OS) mechanisms and policies…
It is becoming increasingly difficult to improve the performance of a a single process (thread) on a computer due to physical limitations. Modern systems use multi-core processors in which multiple processes (threads) may run concurrently.…
Light-weight convolutional neural networks (CNNs) suffer performance degradation as their low computational budgets constrain both the depth (number of convolution layers) and the width (number of channels) of CNNs, resulting in limited…
In this paper is proposed a technique to integrate and simulate a dynamic memory in a multiprocessor framework based on C/C++/SystemC. Using host machine's memory management capabilities, dynamic data processing is supported without…
Advances in high-throughput technologies have originated an ever-increasing availability of omics datasets. The integration of multiple heterogeneous data sources is currently an issue for biology and bioinformatics. Multiple kernel…