English
Related papers

Related papers: A Data-Driven Dynamic Execution Orchestration Arch…

200 papers

The research interest in specialized hardware accelerators for deep neural networks (DNN) spikes recently owing to their superior performance and efficiency. However, today's DNN accelerators primarily focus on accelerating specific…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-06-11 Cong Guo , Yangjie Zhou , Jingwen Leng , Yuhao Zhu , Zidong Du , Quan Chen , Chao Li , Bin Yao , Minyi Guo

Efficient and real time segmentation of color images has a variety of importance in many fields of computer vision such as image compression, medical imaging, mapping and autonomous navigation. Being one of the most computationally…

Computer Vision and Pattern Recognition · Computer Science 2017-10-09 Roopal Nahar , Akanksha Baranwal , K. Madhava Krishna

Applications with irregular data structures, data-dependent control flows and fine-grained data transfers (e.g., real-world graph computations) perform poorly on cache-based systems. We propose the UpDown accelerator that supports…

Deep neural network (DNN) inference relies increasingly on specialized hardware for high computational efficiency. This work introduces a field-programmable gate array (FPGA)-based dynamically configurable accelerator featuring systolic…

Hardware Architecture · Computer Science 2025-10-10 Anastasios Petropoulos , Theodore Antonakopoulos

As quantum computers continue to improve and support larger, more complex computations, smart control hardware and compilers are needed to efficiently leverage the capabilities of these systems. This paper introduces a novel approach to…

Quantum Physics · Physics 2025-11-19 Folkert de Ronde , Alexander Knapen , Stephan Wong , Sebastian Feld

This study presents a novel computer architecture where a last level cache and a SIMD accelerator are replaced by an Associative Processor. Associative Processor combines data storage and data processing and provides parallel computational…

Hardware Architecture · Computer Science 2013-11-11 Leonid Yavits , Amir Morad , Ran Ginosar

Control parallelism and data parallelism is mostly reasoned and optimized as separate functions. Because of this, workloads that are irregular, fine-grain and dynamic such as dynamic graph processing become very hard to scale. An…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-03-08 Bibrak Qamar Chandio , Thomas Sterling , Prateek Srivastava

Traditional heterogeneous parallel algorithms, designed for heterogeneous clusters of workstations, are based on the assumption that the absolute speed of the processors does not depend on the size of the computational task. This assumption…

Distributed, Parallel, and Cluster Computing · Computer Science 2011-09-15 Alexey Lastovetsky , Ravi Reddy , Vladimir Rychkov , David Clarke

FPGAs are well-suited for dataflow architectures that process data in a streaming or pipelined manner, thus satisfying the high computational and communication demands of emerging applications. However, manually implementing an efficient…

Hardware Architecture · Computer Science 2026-04-15 Weichuang Zhang , Yiquan Wang , Xinzhou Zhang , Chi Zhang , Yu Feng , Xiaofeng Hou , Chao Li , Jieru Zhao , Minyi Guo

With the growing complexity and capability of contemporary robotic systems, the necessity of sophisticated computing solutions to efficiently handle tasks such as real-time processing, sensor integration, decision-making, and control…

Robotics · Computer Science 2025-09-09 Md Rafid Islam

The processor accelerators are effective because they are working not (completely) on principles of stored program computers. They use some kind of parallelism, and it is rather hard to program them effectively: a parallel architecture by…

Distributed, Parallel, and Cluster Computing · Computer Science 2016-08-26 János Végh

Numerical simulations can help solve complex problems. Most of these algorithms are massively parallel and thus good candidates for FPGA acceleration thanks to spatial parallelism. Modern FPGA devices can leverage high-bandwidth memory…

Hardware Architecture · Computer Science 2022-11-09 Stephanie Soldavini , Karl F. A. Friebel , Mattia Tibaldi , Gerald Hempel , Jeronimo Castrillon , Christian Pilato

Convolutional Neural Networks (CNNs) are widely used in deep learning applications, e.g. visual systems, robotics etc. However, existing software solutions are not efficient. Therefore, many hardware accelerators have been proposed…

Machine Learning · Computer Science 2021-09-08 Sasindu Wijeratne , Sandaruwan Jayaweera , Mahesh Dananjaya , Ajith Pasqual

In this paper, we introduce a software-defined framework that enables the parallel utilization of all the programmable processing resources available in heterogeneous system-on-chip (SoC) including FPGA-based hardware accelerators and…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-02-12 Jose Nunez-Yanez , Mohammad Hosseinabady , Moslem Amiri , Andrés Rodríguez , Rafael Asenjo , Angeles Navarro , Rubén Gran-Tejero , Darío Suárez-Gracia

The enhanced efficiency of hardware accelerators, including Single Instruction Multiple Data (SIMD) architectures and Coarse-Grained Reconfigurable Architectures (CGRAs), is driving significant advancements in Artificial Intelligence and…

Hardware Architecture · Computer Science 2025-04-29 Yu Yang , Jordi Altayó González , Paul Delestrac , Ahmed Hemani

Increasing investment in computing technologies and the advancements in silicon technology has fueled rapid growth in advanced driver assistance systems (ADAS) and corresponding SoC developments. An ADAS SoC represents a heterogeneous…

Hardware Architecture · Computer Science 2022-09-14 Hao Luan , Yu Yao , Chang Huang

Computer systems that have been successfully deployed for dense regular workloads fall short of achieving scalability and efficiency when applied to irregular and dynamic graph applications. Conventional computing systems rely heavily on…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-05-24 Bibrak Qamar Chandio , Maciej Brodowicz , Thomas Sterling

The data science community today has embraced the concept of Dataframes as the de facto standard for data representation and manipulation. Ease of use, massive operator coverage, and popularization of R and Python languages have heavily…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-07-06 Niranda Perera , Supun Kamburugamuve , Chathura Widanage , Vibhatha Abeykoon , Ahmet Uyar , Kaiying Shan , Hasara Maithree , Damitha Lenadora , Thejaka Amila Kanewala , Geoffrey Fox

Deep neural networks (DNNs) have been shown to outperform conventional machine learning algorithms across a wide range of applications, e.g., image recognition, object detection, robotics, and natural language processing. However, the high…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-04-23 Ye Yu , Yingmin Li , Shuai Che , Niraj K. Jha , Weifeng Zhang

The current state of the art of Simultaneous Localisation and Mapping, or SLAM, on low power embedded systems is about sparse localisation and mapping with low resolution results in the name of efficiency. Meanwhile, research in this field…

Robotics · Computer Science 2019-02-14 Konstantinos Boikos , Christos-Savvas Bouganis
‹ Prev 1 2 3 10 Next ›