Related papers: AutoDSE: Enabling Software Programmers to Design E…
In the domain of image processing, often real-time constraints are required. In particular, in safety-critical applications, such as X-ray computed tomography in medical imaging or advanced driver assistance systems in the automotive…
FPGA-based heterogeneous architectures provide programmers with the ability to customize their hardware accelerators for flexible acceleration of many workloads. Nonetheless, such advantages come at the cost of sacrificing programmability.…
In recent years, domain-specific accelerators (DSAs) have gained popularity for applications such as deep learning and autonomous driving. To facilitate DSA designs, programmers use high-level synthesis (HLS) to compile a high-level…
Designing field-programmable gate array (FPGA)-based accelerators for modern artificial intelligence workloads requires navigating a large and complex hardware design space encompassing architectural parameters, dataflow strategies, and…
CPU-FPGA heterogeneous architectures are attracting ever-increasing attention in an attempt to advance computational capabilities and energy efficiency in today's datacenters. These architectures provide programmers with the ability to…
Field-programmable gate arrays (FPGAs) provide an opportunity to co-design applications with hardware accelerators, yet they remain difficult to program. High-level synthesis (HLS) tools promise to raise the level of abstraction by…
The design of efficient hardware accelerators for high-throughput data-processing applications, e.g., deep neural networks, is a challenging task in computer architecture design. In this regard, High-Level Synthesis (HLS) emerges as a…
Recent years have witnessed the growing popularity of domain-specific accelerators (DSAs), such as Google's TPUs, for accelerating various applications such as deep learning, search, autonomous driving, etc. To facilitate DSA designs,…
Customized accelerators have revolutionized modern computing by delivering substantial gains in energy efficiency and performance through hardware specialization. Field-Programmable Gate Arrays (FPGAs) play a crucial role in this paradigm,…
Recently, the field of deep learning has received great attention by the scientific community and it is used to provide improved solutions to many computer vision problems. Convolutional neural networks (CNNs) have been successfully used to…
Although high-level synthesis (HLS) tools have significantly improved programmer productivity over hardware description languages, developing for FPGAs remains tedious and error prone. Programmers must learn and implement a large set of…
High Level Synthesis (HLS) tools, like the Intel FPGA SDK for OpenCL, improve design productivity and enable efficient design space exploration guided by simple program directives (pragmas), but may sometimes miss important optimizations…
In recent years, hardware accelerators based on field-programmable gate arrays (FPGAs) have been widely adopted, thanks to FPGAs' extraordinary flexibility. However, with the high flexibility comes the difficulty in design and optimization.…
Molecular dynamics (MD) simulation is one of the past decade's most important tools for enabling biology scientists and researchers to explore human health and diseases. However, due to the computation complexity of the MD algorithm, it…
High-Level Synthesis enables the rapid prototyping of hardware accelerators, by combining a high-level description of the functional behavior of a kernel with a set of micro-architecture optimizations as inputs. Such optimizations can be…
Design space exploration (DSE) is critical for developing optimized hardware architectures, especially for AI workloads such as deep neural networks (DNNs) and large language models (LLMs), which require specialized acceleration. As model…
Even though high-level synthesis (HLS) tools mitigate the challenges of programming domain-specific accelerators (DSAs) by raising the abstraction level, optimizing hardware directive parameters remains a significant hurdle. Existing…
We develop and commercialize autonomous machines, such as logistic robots and self-driving cars, around the globe. A critical challenge to our -- and any -- autonomous machine is accurate and efficient localization under resource…
Hardware accelerators, especially those designed for tensor processing, have become ubiquitous in today's computing landscape. However, even with significant efforts in building compilers, programming these tensor accelerators remains…
High-level synthesis (HLS) is a design flow that leverages modern language features and flexibility, such as complex data structures, inheritance, templates, etc., to prototype hardware designs rapidly. However, exploring various design…