English
Related papers

Related papers: Thread Progress Equalization: Dynamically Adaptive…

200 papers

In modern data centers, energy usage represents one of the major factors affecting operational costs. Power capping is a technique that limits the power consumption of individual systems, which allows reducing the overall power demand at…

Performance · Computer Science 2017-09-05 Stefano Conoci , Pierangelo Di Sanzo , Bruno Ciciani , Francesco Quaglia

According to the increasing complexity of network application and internet traffic, network processor as a subset of embedded processors have to process more computation intensive tasks. By scaling down the feature size and emersion of chip…

Hardware Architecture · Computer Science 2012-04-13 Mehdi Alipour , Hojjat Taghdisi

Energy proportionality is the key design goal followed by architects of modern multicore CPUs. One of its implications is that optimization of an application for performance will also optimize it for energy. In this work, we show that…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-10-16 Semyon Khokhriakov , Ravi Reddy Manumachu , Alexey Lastovetsky

The aim of parallel computing is to increase an application performance by executing the application on multiple processors. OpenMP is an API that supports multi platform shared memory programming model and shared-memory programs are…

Distributed, Parallel, and Cluster Computing · Computer Science 2013-11-12 Vibha Rajput , Alok Katiyar

To support growing massive parallelism, functional components and also the capabilities of current processors are changing and continue to do so. Todays computers are built upon multiple processing cores and run applications consisting of a…

Programming Languages · Computer Science 2016-04-07 Somnath Mazumdar , Roberto Giorgi

Massive multi-threading in GPU imposes tremendous pressure on memory subsystems. Due to rapid growth in thread-level parallelism of GPU and slowly improved peak memory bandwidth, the memory becomes a bottleneck of GPU's performance and…

Hardware Architecture · Computer Science 2019-06-17 Bing Li , Mengjie Mao , Xiaoxiao Liu , Tao Liu , Zihao Liu , Wujie Wen , Yiran Chen , Hai , Li

Optimal multiple sequence alignment by dynamic programming, like many highly dimensional scientific computing problems, has failed to benefit from the improvements in computing performance brought about by multi-processor systems, due to…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-11-30 Manal Helal , Hossam El-Gindy , Lenore Mullin , Bruno Gaeta

Heterogeneous processors with architecturally different cores (CPU and GPU) integrated on the same die lead to new challenges and opportunities for thermal and power management techniques because of shared thermal/power budgets between…

Hardware Architecture · Computer Science 2018-08-30 Kapil Dev , Indrani Paul , Wei Huang , Yasuko Eckert , Wayne Burleson , Sherief Reda

This article presents an automatic approach to quickly derive a good solution for hardware resource partition and task granularity for task-based parallel applications on heterogeneous many-core architectures. Our approach employs a…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-03-10 Peng Zhang , Jianbin Fang , Canqun Yang , Chun Huang , Tao Tang , Zheng Wang

To keep pace with Moore's law, chip designers have focused on increasing the number of cores per chip rather than single core performance. In turn, modern jobs are often designed to run on any number of cores. However, to effectively…

Performance · Computer Science 2017-11-02 Benjamin Berg , Jan-Pieter Dorsman , Mor Harchol-Balter

This work proposes a methodology to find performance and energy trade-offs for parallel applications running on Heterogeneous Multi-Processing systems with a single instruction-set architecture. These offer flexibility in the form of…

Heterogeneous many-cores are now an integral part of modern computing systems ranging from embedding systems to supercomputers. While heterogeneous many-core design offers the potential for energy-efficient high-performance, such potential…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-05-11 Jianbin Fang , Chun Huang , Tao Tang , Zheng Wang

Most modern operating systems have adopted the one-to-one thread model to support fast execution of threads in both multi-core and single-core systems. This thread model, which maps the kernel-space and user-space threads in a one-to-one…

Operating Systems · Computer Science 2021-01-21 Geunsik Lim , Donghyun Kang , Young Ik Eom

The arrival of heterogeneous (or hybrid) multicore architectures has brought new performance trade-offs for applications, and efficiency opportunities to systems. They have also increased the challenges related to thread scheduling, as…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-09-11 Yacine Idouar , Adrien Cassagne , Laércio Lima Pilla , Julien Sopena , Manuel Bouyer , Diane Orhan , Lionel Lacassagne , Dimitri Galayko , Denis Barthou , Christophe Jego

The dynamic adaptation of resource levels enables the system to enhance energy efficiency while maintaining the necessary computational resources, particularly in scenarios where workloads fluctuate significantly over time. The proposed…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-10-14 Said Muhammad , Lahlou Laaziz , Nadjia Kara , Phat Tan Nguyen , Timothy Murphy

In present study, in order to improve the performance and reduce the amount of power which is dissipated in heterogeneous multicore processors, the ability of detecting the program execution phases is investigated. The programs execution…

Distributed, Parallel, and Cluster Computing · Computer Science 2010-01-14 A. Z. Jooya , M. Analoui

This paper presents refinements to the execution-cache-memory performance model and a previously published power model for multicore processors. The combination of both enables a very accurate prediction of performance and energy…

Performance · Computer Science 2018-07-09 Johannes Hofmann , Georg Hager , Dietmar Fey

Transactional Stream Processing Engines (TSPEs) form the backbone of modern stream applications handling shared mutable states. Yet, the full potential of these systems, specifically in exploiting parallelism and implementing dynamic…

Databases · Computer Science 2023-07-25 Yancan Mao , Jianjun Zhao , Zhonghao Yang , Shuhao Zhang , Haikun Liu , Volker Markl

Recent advances in multi and many-core processors have led to significant improvements in the performance of scientific computing applications. However, the addition of a large number of complex cores have also increased the overall power…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-02-23 Akash Dutta , Jee Choi , Ali Jannesari

In this paper we analyze, evaluate, and improve the performance of training generalized linear models on modern CPUs. We start with a state-of-the-art asynchronous parallel training algorithm, identify system-level performance bottlenecks,…

Machine Learning · Computer Science 2018-12-20 Nikolas Ioannou , Celestine Dünner , Kornilios Kourtis , Thomas Parnell
‹ Prev 1 2 3 10 Next ›