English
Related papers

Related papers: A Method for Efficient Heterogeneous Parallel Comp…

200 papers

Heterogeneous many-cores are now an integral part of modern computing systems ranging from embedding systems to supercomputers. While heterogeneous many-core design offers the potential for energy-efficient high-performance, such potential…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-05-11 Jianbin Fang , Chun Huang , Tao Tang , Zheng Wang

In recent times, the emergence of Large Language Models (LLMs) has resulted in increasingly larger model size, posing challenges for inference on low-resource devices. Prior approaches have explored offloading to facilitate low-memory…

Performance · Computer Science 2024-03-05 Xuanlei Zhao , Bin Jia , Haotian Zhou , Ziming Liu , Shenggan Cheng , Yang You

Recent advancements in large language models (LLMs) necessitate extensive computational resources, prompting the use of diverse hardware accelerators from multiple vendors. However, traditional distributed training frameworks struggle to…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-05-26 Ding Tang , Jiecheng Zhou , Jiakai Hu , Shengwei Li , Huihuang Zheng , Zhilin Pei , Hui Wang , Xingcheng Zhang

Modern computing systems increasingly rely on composing heterogeneous devices to improve performance and efficiency. Programming these systems is often unproductive: algorithm implementations must be coupled to system-specific logic,…

Programming Languages · Computer Science 2025-03-17 Russel Arbore , Aaron Councilman , Xavier Routh , Ryan Ziegler , Praneet Rathi , Vikram Adve

Heterogeneous multi core processors can offer diverse computing capabilities. The efficiency of Market Basket Analysis Algorithm can be improved with heterogeneous multi core processors. Market basket analysis algorithm utilises apriori…

Distributed, Parallel, and Cluster Computing · Computer Science 2014-09-24 Aashiha Priyadarshni. L

The emergence of machine learning, image and audio processing on edge devices has motivated research towards power efficient custom hardware accelerators. Though FPGAs are an ideal target for energy efficient custom accelerators, the…

Hardware Architecture · Computer Science 2021-03-02 Kingshuk Majumder , Uday Bondhugula

This paper addresses the problem of parallelizing computations to study non-linear dynamics in large networks of non-locally coupled oscillators using heterogeneous computing resources. The proposed approach can be applied to a variety of…

Chaotic Dynamics · Physics 2025-07-04 Oleksandr Sudakov , Volodymyr Maistrenko

A new generation of manycore processors is on the rise that offers dozens and more cores on a chip and, in a sense, fuses host processor and accelerator. In this paper we target the efficient training of generalized linear models on these…

Performance · Computer Science 2021-10-29 Eliza Wszola , Celestine Mendler-Dünner , Martin Jaggi , Markus Püschel

To increase performance and efficiency, systems use FPGAs as reconfigurable accelerators. A key challenge in designing these systems is partitioning computation between processors and an FPGA. An appropriate division of labor may be…

Hardware Architecture · Computer Science 2021-07-21 Endri Bezati , Mahyar Emami , Jörn Janneck , James Larus

Homomorphic encryption (HE) enables arithmetic operations to be performed directly on encrypted data. It is essential for privacy-preserving applications such as machine learning, medical diagnosis, and financial data analysis. In popular…

Cryptography and Security · Computer Science 2026-05-19 Sajjad Akherati , Xinmiao Zhang

Heterogeneity is omnipresent in today's commodity computational systems, which comprise at least one multi-core Central Processing Unit (CPU) and one Graphics Processing Unit (GPU). Nonetheless, all this computing power is not being…

Distributed, Parallel, and Cluster Computing · Computer Science 2014-02-18 Hervé Paulino , Eduardo Marques

In recent years, heterogeneous computing has emerged as the vital way to increase computers? performance and energy efficiency by combining diverse hardware devices, such as Graphics Processing Units (GPUs) and Field Programmable Gate…

Programming Languages · Computer Science 2020-11-02 Michail Papadimitriou , Juan Fumero , Athanasios Stratikopoulos , Foivos S. Zakkak , Christos Kotselidis

Hyperdimensional Computing (HDC) is a bio-inspired computing framework that has gained increasing attention, especially as a more efficient approach to machine learning (ML). This work introduces the \name{} compiler, the first open-source…

Machine Learning · Computer Science 2023-04-26 Pere Vergés , Mike Heddes , Igor Nunes , Tony Givargis , Alexandru Nicolau

In large-scale distributed computing clusters, such as Amazon EC2, there are several types of "system noise" that can result in major degradation of performance: bottlenecks due to limited communication bandwidth, latency due to straggler…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-06-21 Amirhossein Reisizadeh , Saurav Prakash , Ramtin Pedarsani , Amir Salman Avestimehr

To take full advantage of a specific hardware target, performance engineers need to gain control on compilers in order to leverage their domain knowledge about the program and hardware. Yet, modern compilers are poorly controlled, usually…

Programming Languages · Computer Science 2024-09-10 Martin Paul Lücke , Oleksandr Zinenko , William S. Moses , Michel Steuwer , Albert Cohen

As quantum computers continue to improve and support larger, more complex computations, smart control hardware and compilers are needed to efficiently leverage the capabilities of these systems. This paper introduces a novel approach to…

Quantum Physics · Physics 2025-11-19 Folkert de Ronde , Alexander Knapen , Stephan Wong , Sebastian Feld

This thesis (extended abstract) presents the software development efforts toward efficient exploitation of heterogeneity through intricate mapping of computational kernels, collaborative execution of multiple processing elements and…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-07-26 Siqi Wang

AI kernel compilation for edge devices depends on the compiler's ability to exploit parallelism and hide memory latency in the presence of hierarchical memory and explicit data movement. This paper reports a benchmark methodology and…

Programming Languages · Computer Science 2026-02-25 Javed Absar , Samarth Narang , Muthu Baskaran

Modern research in code generators for dense linear algebra computations has shown the ability to produce optimized code with a performance which compares and often exceeds the one of state-of-the-art implementations by domain experts.…

Programming Languages · Computer Science 2022-08-23 Lorenzo Chelini , Henrik Barthels , Paolo Bientinesi , Marcin Copik , Tobias Grosser , Daniele G. Spampinato

This work presents MLIR, a novel approach to building reusable and extensible compiler infrastructure. MLIR aims to address software fragmentation, improve compilation for heterogeneous hardware, significantly reduce the cost of building…

‹ Prev 1 2 3 10 Next ›