Related papers: Enabling OpenMP Task Parallelism on Multi-FPGAs
While FPGA accelerator boards and their respective high-level design tools are maturing, there is still a lack of multi-FPGA applications, libraries, and not least, benchmarks and reference implementations towards sustained HPC usage of…
In this work we evaluate the potential of FPGAs for accelerating HPC workloads as a more power-efficient alternative to GPUs. Using High-Level Synthesis and a large set of optimization techniques, we show that FPGAs can achieve better…
Modern multicore systems are migrating from homogeneous systems to heterogeneous systems with accelerator-based computing in order to overcome the barriers of performance and power walls. In this trend, FPGA-based accelerators are becoming…
In this paper, we introduce a software-defined framework that enables the parallel utilization of all the programmable processing resources available in heterogeneous system-on-chip (SoC) including FPGA-based hardware accelerators and…
FPGA overlays are commonly implemented as coarse-grained reconfigurable architectures with a goal to improve designers' productivity through balancing flexibility and ease of configuration of the underlying fabric. To truly facilitate full…
FPGA is appropriate for fix-point neural networks computing due to high power efficiency and configurability. However, its design must be intensively refined to achieve high performance using limited hardware resources. We present an…
When designing modern embedded computing systems, most software programmers choose to use multicore processors, possibly in combination with general-purpose graphics processing units (GPGPUs) and/or hardware accelerators. They also often…
FPGAs are an attractive type of accelerator for all-purpose HPC computing systems due to the possibility of deploying tailored hardware on demand. However, the common tools for programming and operating FPGAs are still complex to use,…
FPGAs are increasingly prevalent in cloud deployments, serving as Smart NICs or network-attached accelerators. Despite their potential, developing distributed FPGA-accelerated applications remains cumbersome due to the lack of appropriate…
FPGAs have found increasing adoption in data center applications since a new generation of high-level tools have become available which noticeably reduce development time for FPGA accelerators and still provide high quality of results.…
Hardware-based acceleration is an extensive attempt to facilitate many computationally-intensive mathematics operations. This paper proposes an FPGA-based architecture to accelerate the convolution operation - a complex and expensive…
The rapid growth of Internet-of-things (IoT) and artificial intelligence applications have called forth a new computing paradigm--edge computing. In this paper, we study the suitability of deploying FPGAs for edge computing from the…
One of the barriers to the adoption of parallel computing is the inherent complexity of its programming. The Open Multi-Processing (OpenMP) Application Programming Interface (API) facilitates such implementations, providing high abstraction…
A variety of computing platform like Field Programmable Gate Array (FPGA), Graphics Processing Unit (GPU) and multicore Central Processing Unit (CPU) in data centers are suitable for acceleration of data-intensive workloads. Especially,…
The continuous growth of big data applications with high computational and scalability demands has resulted in increasing popularity of cloud computing. Optimizing the performance and power consumption of cloud resources is therefore…
Cloud deployments now increasingly provision FPGA accelerators as part of virtual instances. While FPGAs are still essentially single-tenant, the growing demand for hardware acceleration will inevitably lead to the need for methods and…
The aim of parallel computing is to increase an application performance by executing the application on multiple processors. OpenMP is an API that supports multi platform shared memory programming model and shared-memory programs are…
High Performance Computing (HPC) platforms allow scientists to model computationally intensive algorithms. HPC clusters increasingly use General-Purpose Graphics Processing Units (GPGPUs) as accelerators; FPGAs provide an attractive…
In the field of High Performance Computing, communications among processes represent a typical bottleneck for massively parallel scientific applications. Object of this research is the development of a network interface card with specific…
FPGAs (Field Programmable Gate arrays) have gained massive popularity today as accelerators for a variety of workloads, including big data analytics, and parallel and distributed computing. This has fueled the study of mechanisms to…