Related papers: Designing Hardware/Software Systems for Embedded H…
In this work, we propose a configurable many-core overlay for high-performance embedded computing. The size of internal memory, supported operations and number of ports can be configured independently for each core of the overlay. The…
Today, there is a trend to incorporate more intelligence (e.g., vision capabilities) into a wide range of devices, which makes high performance a necessity for computing systems. Furthermore, for embedded systems, low power consumption…
The edge computing paradigm has emerged to handle cloud computing issues such as scalability, security and low response time among others. This new computing trend heavily relies on ubiquitous embedded systems on the edge. Performance and…
Due to the emergence of embedded applications in image and video processing, communication and cryptography, improvement of pictorial information for better human perception like deblurring, denoising in several fields such as satellite…
Field Programmable Gate Arrays (FPGAs) have recently been increasingly used for highly-parallel processing of compute intensive tasks. This paper introduces an FPGA hardware platform architecture that is PC-based, allows for fast…
When designing modern embedded computing systems, most software programmers choose to use multicore processors, possibly in combination with general-purpose graphics processing units (GPGPUs) and/or hardware accelerators. They also often…
In order for FPGAs to be successful outside traditional markets, tools which enable software programmers to achieve high levels of system performance while abstracting away the FPGA-specific details are needed. DSPB Builder Advanced (DSPBA)…
Embedded Systems combine one or more processor cores with dedicated logic running on an ASIC or FPGA to meet design goals at reasonable cost. It is achieved by profiling the application with variety of aspects like performance, memory…
We present a customizable soft architecture which allows for the execution of GPGPU code on an FPGA without the need to recompile the design. Issues related to scaling the overlay architecture to multiple GPGPU multiprocessors are…
FPGA is appropriate for fix-point neural networks computing due to high power efficiency and configurability. However, its design must be intensively refined to achieve high performance using limited hardware resources. We present an…
Edge computing devices inherently face tight resource constraints, which is especially apparent when deploying Deep Neural Networks (DNN) with high memory and compute demands. FPGAs are commonly available in edge devices. Since these…
We present a tool flow and results for a model-based hardware design for FPGAs from Simulink descriptions which nicely integrates into existing environments. While current commercial tools do not exploit some high-level optimizations, we…
Heterogeneous computing can potentially offer significant performance and performance per watt improvements over homogeneous computing, but the question "what is the ideal mapping of algorithms to architectures?" remains an open one. In the…
Data movement is the dominating factor affecting performance and energy in modern computing systems. Consequently, many algorithms have been developed to minimize the number of I/O operations for common computing patterns. Matrix…
Reconfigurable architectures like Field Programmable Gate Arrays (FPGAs) have been used for accelerating computations in several domains because of their unique combination of flexibility, performance, and power efficiency. However, FPGAs…
This paper introduces a computer architecture, where part of the instruction set architecture (ISA) is implemented on small highly-integrated field-programmable gate arrays (FPGAs). Small FPGAs inside a general-purpose processor (CPU) can…
When trained as generative models, Deep Learning algorithms have shown exceptional performance on tasks involving high dimensional data such as image denoising and super-resolution. In an increasingly connected world dominated by mobile and…
High Performance Computing (HPC) platforms allow scientists to model computationally intensive algorithms. HPC clusters increasingly use General-Purpose Graphics Processing Units (GPGPUs) as accelerators; FPGAs provide an attractive…
Heterogeneous computing is emerging as a mandatory requirement for power-efficient system design. With this aim, modern heterogeneous platforms like Zynq All-Programmable SoC, that integrates ARM-based SMP and programmable logic, have been…
The use of high-level languages for designing hardware is gaining popularity since they increase design productivity by providing higher abstractions. However, one drawback of such abstraction level has been the difficulty of relating the…