Related papers: A High-Level Reconfigurable Computing Platform Sof…
Reconfigurable devices, such as Field Programmable Gate Arrays (FPGAs), have been witnessing a considerable increase in density. State-of-the-art FPGAs are complex hybrid devices that contain up to several millions of gates. Recently,…
Field Programmable Gate Arrays (FPGAs) plays an increasingly important role in data sampling and processing industries due to its highly parallel architecture, low power consumption, and flexibility in custom algorithms. Especially, in the…
This paper presents an open-source, lightweight, yet comprehensive software framework, named RPC, which integrates physics-based simulators, planning and control libraries, debugging tools, and a user-friendly operator interface. RPC…
Data movement is the dominating factor affecting performance and energy in modern computing systems. Consequently, many algorithms have been developed to minimize the number of I/O operations for common computing patterns. Matrix…
Optimization of product performance repetitively introduces the need to make products adaptive in a more general sense. This more general idea is often captured under the term 'self-configuration'. Despite the importance of such capability,…
This paper presents our approach for making FPGA accelerators accessible to software (SW) programmers. It is intended as a starting point for collaborations with other groups pursuing similar objectives. We report on our current SAccO…
Deep learning (DL) is becoming the cornerstone of numerous applications both in datacenters and at the edge. Specialized hardware is often necessary to meet the performance requirements of state-of-the-art DL models, but the rapid pace of…
This contribution presents a new approach for allocating suitable function-implementation variants depending on given quality-of-service function-requirements for run-time reconfigurable multi-device systems. Our approach adapts…
In this article, a new generic higher-order finite-element framework for massively parallel simulations is presented. The modular software architecture is carefully designed to exploit the resources of modern and future supercomputers.…
Parallel data processing has become indispensable for processing applications involving huge data sets. This brings into focus the Graphics Processing Units (GPUs) which emphasize on many-core computing. With the advent of General Purpose…
We introduce a logical framework for the specification and verification of component-based systems, in which finitely many component instances are active, but the bound on their number is not known. Besides specifying and verifying…
The almost unlimited possibilities to customize the logic in an FPGA are one of the main reasons for the versatility of these devices. Partial reconfiguration exploits this capability even further by allowing to replace logic in predefined…
Scientific computing is at the core of many High-Performance Computing applications, including computational flow dynamics. Because of the uttermost importance to simulate increasingly larger computational models, hardware acceleration is…
The IBM Neural Computer (INC) is a highly flexible, re-configurable parallel processing system that is intended as a research and development platform for emerging machine intelligence algorithms and computational neuroscience. It consists…
Massively parallel architectures offer the potential to significantly accelerate an application relative to their serial counterparts. However, not all applications exhibit an adequate level of data and/or task parallelism to exploit such…
Heterogeneity has become a mainstream architecture design choice for building High Performance Computing systems. However, heterogeneity poses significant challenges for achieving performance portability of execution. Adapting a program to…
Coarse-grained reconfigurable architectures aim to achieve both goals of high performance and flexibility. However, existing reconfigurable array architectures require many resources without considering the specific application domain.…
At the intersection between traditional CPU architectures and more specialized options such as FPGAs or ASICs lies the family of reconfigurable hardware architectures, termed Coarse-Grained Reconfigurable Arrays (CGRAs). CGRAs are composed…
Requirements and code, in conventional software engineering wisdom, belong to entirely different worlds. Is it possible to unify these two worlds? A unified framework could help make software easier to change and reuse. To explore the…
Heterogeneous many-cores are now an integral part of modern computing systems ranging from embedding systems to supercomputers. While heterogeneous many-core design offers the potential for energy-efficient high-performance, such potential…