English
Related papers

Related papers: Cache-aware Parallel Programming for Manycore Proc…

200 papers

The emergence of multicore and manycore processors is set to change the parallel computing world. Applications are shifting towards increased parallelism in order to utilise these architectures efficiently. This leads to a situation where…

Distributed, Parallel, and Cluster Computing · Computer Science 2014-04-01 Ashkan Tousimojarad , Wim Vanderbauwhede

Multi-core architectures feature an intricate hierarchy of cache memories, with multiple levels and sizes. To adequately decompose an application according to the traits of a particular memory hierarchy is a cumbersome task that may be…

Distributed, Parallel, and Cluster Computing · Computer Science 2015-11-20 Hervé Paulino , Nuno Delgado

The current computer architecture has moved towards the multi/many-core structure. However, the algorithms in the current sequential dense numerical linear algebra libraries (e.g. LAPACK) do not parallelize well on multi/many-core…

Numerical Analysis · Computer Science 2013-03-14 Henricus Bouwmeester

Many computer systems for calculating the proper organization of memory are among the most critical issues. Using a tier cache memory (along with branching prediction) is an effective means of increasing modern multi-core processors'…

Networking and Internet Architecture · Computer Science 2021-05-21 Mohamed A. Hamada , Abdelrahman Abdallah

Processors with large numbers of cores are becoming commonplace. In order to take advantage of the available resources in these systems, the programming paradigm has to move towards increased parallelism. However, increasing the level of…

Distributed, Parallel, and Cluster Computing · Computer Science 2014-10-07 Ashkan Tousimojarad , Wim Vanderbauwhede

Multicore architectures dominate today's processor market. Even though the number of cores and threads are pretty high and continues to grow, inherently serial algorithms do not benefit from the abundance of cores and threads. In this…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-05-21 Mohammad Bakhshalipour , Hamid Sarbazi-Azad

The whole computer hardware industry embraced multicores. For these machines, the extreme optimisation of sequential algorithms is no longer sufficient to squeeze the real machine power, which can be only exploited via thread-level…

Distributed, Parallel, and Cluster Computing · Computer Science 2010-09-21 Marco aldinucci , Salvatore Ruggieri , Massimo Torquati

Bandwidth-starved multicore chips have become ubiquitous. It is well known that the performance of stencil codes can be improved by temporal blocking, lessening the pressure on the memory interface. We introduce a new pipelined approach…

Distributed, Parallel, and Cluster Computing · Computer Science 2010-06-17 Markus Wittmann , Georg Hager , Jan Treibig , Gerhard Wellein

The Simplex tableau has been broadly used and investigated in the industry and academia. With the advent of the big data era, ever larger problems are posed to be solved in ever larger machines whose architecture type did not exist in the…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-05-29 Demetrios Coutinho , Felipe O. Lins e Silva , Daniel Aloise , Samuel , Xavier-de-Souza

With the advent of hundreds of cores on a chip to accelerate applications, the operating system (OS) needs to exploit the existing parallelism provided by the underlying hardware resources to determine the right amount of processes to be…

Operating Systems · Computer Science 2025-01-07 Yao Xiao , Nikos Kanakaris , Anzhe Cheng , Chenzhong Yin , Nesreen K. Ahmed , Shahin Nazarian , Andrei Irimia , Paul Bogdan

Parallel computing is a standard approach to achieving high-performance computing (HPC). Three commonly used methods to implement parallel computing include: 1) applying multithreading technology on single-core or multi-core CPUs; 2)…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-09-18 Xinyao Yi

This paper presents the research work on multicore microcontrollers using parallel, and time critical programming for the embedded systems. Due to the high complexity and limitations, it is very hard to work on the application development…

Distributed, Parallel, and Cluster Computing · Computer Science 2015-04-13 Prerna Saini , Ankit Bansal , Abhishek Sharma

In this paper we deal with the impact of multi and many-core processor architectures on simulation. Despite the fact that modern CPUs have an increasingly large number of cores, most softwares are still unable to take advantage of them. In…

Distributed, Parallel, and Cluster Computing · Computer Science 2014-07-30 Gabriele D'Angelo , Stefano Ferretti , Moreno Marzolla

Heterogeneous many-cores are now an integral part of modern computing systems ranging from embedding systems to supercomputers. While heterogeneous many-core design offers the potential for energy-efficient high-performance, such potential…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-05-11 Jianbin Fang , Chun Huang , Tao Tang , Zheng Wang

With the advent of era of Big Data and Internet of Things, there has been an exponential increase in the availability of large data sets. These data sets require in-depth analysis that provides intelligence for improvements in methods for…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-07-11 Alvaro Tzul

We present a class of massively parallel processor architectures called invasive tightly coupled processor arrays (TCPAs). The presented processor class is a highly parameterizable template, which can be tailored before runtime to fulfill…

Hardware Architecture · Computer Science 2014-05-14 Vahid Lari , Alexandru Tanase , Frank Hannig , Jürgen Teich

With the continuously increasing integration level, manycore processor systems are likely to be the coming system structure not only in HPC but also for desktop or mobile systems. Nowadays manycore processors like Tilera TILE, KALRAY MPPA…

Distributed, Parallel, and Cluster Computing · Computer Science 2014-05-14 Oliver Mattes , Wolfgang Karl

New algorithms and optimization techniques are needed to balance the accelerating trend towards bandwidth-starved multicore chips. It is well known that the performance of stencil codes can be improved by temporal blocking, lessening the…

Performance · Computer Science 2012-03-01 Markus Wittmann , Georg Hager , Gerhard Wellein

The processor accelerators are effective because they are working not (completely) on principles of stored program computers. They use some kind of parallelism, and it is rather hard to program them effectively: a parallel architecture by…

Distributed, Parallel, and Cluster Computing · Computer Science 2016-08-26 János Végh

Next-generation wireless technologies (for immersive-massive communication, joint communication and sensing) demand highly parallel architectures for massive data processing. A common architectural template scales up by grouping tens to…

Hardware Architecture · Computer Science 2025-07-08 Samuel Riedel , Yichao Zhang , Marco Bertuletti , Luca Benini
‹ Prev 1 2 3 10 Next ›