Related papers: FPGA-based Multi-Chip Module for High-Performance …

200 papers

Exascale systems are predicted to have approximately one billion cores, assuming Gigahertz cores. Limitations on affordable network topologies for distributed memory systems of such massive scale bring new challenges to the current parallel…

Distributed, Parallel, and Cluster Computing · Computer Science 2014-05-27 Huda Ibeid, Rio Yokota, David Keyes

We present and evaluate the ExaNeSt Prototype, a liquid-cooled rack prototype consisting of 256 Xilinx ZU9EG MPSoCs, 4 TBytes of DRAM, 16 TBytes of SSD, and configurable interconnection 10-Gbps hardware. We developed this testbed in…

Among the algorithms that are likely to play a major role in future exascale computing, the fast multipole method (FMM) appears as a rising star. Our previous recent work showed scaling of an FMM on GPU clusters, with problem sizes in the…

Numerical Analysis · Computer Science 2012-10-30 Rio Yokota, Lorena Barba

Deploying the next-generation computing platform at ExaFlops scale requires solving new technological challenges, mainly related to the impressive number (up to 10^6) of compute elements required. This impacts system power…

In this work, we propose a configurable many-core overlay for high-performance embedded computing. The size of internal memory, supported operations and number of ports can be configured independently for each core of the overlay. The…

Hardware Architecture · Computer Science 2014-08-25 Mário Véstias, Horácio Neto

Hybrid memory systems, comprised of emerging non-volatile memory (NVM) and DRAM, have been proposed to address the growing memory demand of applications. Emerging NVM technologies, such as phase-change memories (PCM), memristor, and 3D…

Hardware Architecture · Computer Science 2024-03-19 Fei Wen, Mian Qin, Paul V. Gratz, A. L. Narasimha Reddy

FFT, FMM, and multigrid methods are widely used fast and highly scalable solvers for elliptic PDEs. However, emerging large-scale computing systems are introducing challenges in comparison to current petascale computers. Recent efforts…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-03-31 Huda Ibeid, Luke Olson, William Gropp

In this work, we propose an architecture and methodology to design hardware/software systems for high-performance embedded computing on FPGA. The hardware side is based on a many-core architecture whose design is generated automatically…

Hardware Architecture · Computer Science 2015-08-28 Mário P. Véstias, Rui Policarpo Duarte, Horácio C. Neto

Emulating chip functionality before silicon production is crucial, especially with the increasing prevalence of RISC-V-based designs. FPGAs are promising candidates for such purposes due to their high-speed and reconfigurable architecture.…

Hardware Architecture · Computer Science 2024-02-06 Elias Perdomo, Alexander Kropotov, Francelly Cano, Syed Zafar, Teresa Cervero, Xavier Martorell, Behzad Salami

This paper presents SynapticCore-X, a modular and resource-efficient neural processing architecture optimized for deployment on low-cost FPGA platforms. The design integrates a lightweight RV32IMC RISC-V control core with a configurable…

Hardware Architecture · Computer Science 2025-11-18 Arya Parameshwara

Particle-in-Cell (PIC) Monte Carlo (MC) simulations are central to plasma physics but face increasing challenges on heterogeneous HPC systems due to excessive data movement, synchronization overheads, and inefficient utilization of multiple…

This work presents the design and preliminary performance of a highly-multiplexed superconducting detector readout. The readout system is implemented on the Xilinx ZCU111 RFSoC Evaluation Board. The current design uses 12% of the DSPs, 60%…

Instrumentation and Methods for Astrophysics · Physics 2022-03-31 Jennifer Pearl Smith, John I. Bailey III, Benjamin A. Mazin

In recent years the computational capacity of single Field Programmable Gate Arrays (FPGA) devices as well as their versatility has increased significantly. Adding to that the High Level Synthesis frameworks allowing to program such…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-07-22 G. Korcyl, P. Korcyl

The Xilinx ZCU111 Radio Frequency System on Chip (RFSoC) is a promising solution for reading out large arrays of microwave kinetic inductance detectors (MKIDs). The board boasts eight on-chip 12-bit / 4.096 GSPS analogue-to-digital…

Instrumentation and Methods for Astrophysics · Physics 2023-01-24 Eoin Baldwin, Mario De Lucia, Colm Bracken, Gerhard Ulbricht, Oisin Creaner, Jack Piercy, Tom Ray

This work arises in the context of the ExaNeSt project, which aims to design and develop an exascale-ready supercomputer with a low energy consumption profile that can still support the most demanding scientific and technical applications.…

Instrumentation and Methods for Astrophysics · Physics 2019-11-01 David Goz, Sara Bertocco, Luca Tornatore, Giuliano Taffoni

FPGA-based hardware accelerators have received increasing attention mainly due to their ability to accelerate deep pipelined applications, thus resulting in higher computational performance and energy efficiency. Nevertheless, the amount of…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-03-23 R. Nepomuceno, R. Sterle, G. Valarini, M. Pereira, H. Yviquel, G. Araujo

Low bit-width Quantized Neural Networks (QNNs) enable deployment of complex machine learning models on constrained devices such as microcontrollers (MCUs) by reducing their memory footprint. Fine-grained asymmetric quantization (i.e.,…

Hardware Architecture · Computer Science 2020-10-09 Gianmarco Ottavi, Angelo Garofalo, Giuseppe Tagliavini, Francesco Conti, Luca Benini, Davide Rossi

While FPGA accelerator boards and their respective high-level design tools are maturing, there is still a lack of multi-FPGA applications, libraries, and not least, benchmarks and reference implementations towards sustained HPC usage of…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-03-01 Marius Meyer, Tobias Kenter, Christian Plessl

We propose advancing photonic in-memory computing through three-dimensional photonic-electronic integrated circuits using phase-change materials (PCM) and AlGaAs-CMOS technology. These circuits offer high precision (greater than 12 bits),…

Autonomous robots require efficient on-device learning to adapt to new environments without cloud dependency. For this edge training, Microscaling (MX) data types offer a promising solution by combining integer and floating-point…

Hardware Architecture · Computer Science 2025-12-16 Stef Cuyckens, Xiaoling Yi, Nitish Satya Murthy, Chao Fang, Marian Verhelst