English
Related papers

Related papers: ARENA: Asynchronous Reconfigurable Accelerator Rin…

200 papers

Domain-specific accelerators are used in various computing systems ranging from edge devices to data centers. Coarse-grained reconfigurable arrays (CGRAs) represent an architectural midpoint between the flexibility of an FPGA and the…

Hardware Architecture · Computer Science 2023-01-04 Taeyoung Kong , Kalhan Koul , Priyanka Raina , Mark Horowitz , Christopher Torng

Reconfigurable computing offers a good balance between flexibility and energy efficiency. When combined with software-programmable devices such as CPUs, it is possible to obtain higher performance by spatially distributing the…

Hardware Architecture · Computer Science 2024-04-22 Daniel Vazquez , Jose Miranda , Alfonso Rodriguez , Andres Otero , Pascuale Davide Schiavone , David Atienza

With the emerging big data applications of Machine Learning, Speech Recognition, Artificial Intelligence, and DNA Sequencing in recent years, computer architecture research communities are facing the explosive scale of various data…

Hardware Architecture · Computer Science 2017-12-14 Chao Wang , Wenqi Lou , Lei Gong , Lihui Jin , Luchao Tan , Yahui Hu , Xi Li , Xuehai Zhou

Efficiently training large-scale models (LMs) in GPU clusters involves two separate avenues: inter-job dynamic scheduling and intra-job adaptive parallelism (AP). However, existing dynamic schedulers struggle with large-model scheduling due…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-03-25 Chunyu Xue , Weihao Cui , Quan Chen , Chen Chen , Han Zhao , Shulai Zhang , Linmei Wang , Yan Li , Limin Xiao , Weifeng Zhang , Jing Yang , Bingsheng He , Minyi Guo

Coarse-Grained Reconfigurable Arrays (CGRA) are promising edge accelerators due to the outstanding balance in flexibility, performance, and energy efficiency. Classic CGRAs statically map compute operations onto the processing elements (PE)…

Hardware Architecture · Computer Science 2023-09-20 Dan Wu , Peng Chen , Thilini Kaushalya Bandara , Zhaoying Li , Tulika Mitra

Coarse-Grained Reconfigurable Arrays (CGRAs) are specialized accelerators commonly employed to boost performance in workloads with iterative structures. Existing research typically focuses on compiler or architecture optimizations aimed at…

Hardware Architecture · Computer Science 2025-08-28 Xiangfeng Liu , Zhe Jiang , Anzhen Zhu , Xiaomeng Han , Mingsong Lyu , Qingxu Deng , Nan Guan

Coarse-Grained Reconfigurable Architectures (CGRAs) are a promising and versatile accelerator platform, offering a balance between the performance and efficiency of specialized accelerators and the software programmability. However, their…

Programming Languages · Computer Science 2026-04-07 Shangkun Li , Jinming Ge , Diyuan Tao , Zeyu Li , Jiawei Liang , Linfeng Du , Jiang Xu , Wei Zhang , Cheng Tan

Deep neural networks (DNNs) offer plenty of challenges in executing efficient computation at edge nodes, primarily due to the huge hardware resource demands. The article proposes HYDRA, hybrid data multiplexing, and runtime layer…

Hardware Architecture · Computer Science 2026-03-31 Sonu Kumar , Komal Gupta , Gopal Raut , Mukul Lokhande , Santosh Kumar Vishvakarma

Convolutional Neural Networks (CNNs) are widely used in deep learning applications, e.g. visual systems, robotics etc. However, existing software solutions are not efficient. Therefore, many hardware accelerators have been proposed…

Machine Learning · Computer Science 2021-09-08 Sasindu Wijeratne , Sandaruwan Jayaweera , Mahesh Dananjaya , Ajith Pasqual

At the intersection between traditional CPU architectures and more specialized options such as FPGAs or ASICs lies the family of reconfigurable hardware architectures, termed Coarse-Grained Reconfigurable Arrays (CGRAs). CGRAs are composed…

Hardware Architecture · Computer Science 2025-09-05 Maxime Henri Aspros , Juan Sapriza , Giovanni Ansaloni , David Atienza

AI acceleration has been dominated by GPUs, but the growing need for lower latency, energy efficiency, and fine-grained hardware control exposes the limits of fixed architectures. In this context, Field-Programmable Gate Arrays (FPGAs)…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-11-18 Arturo Urías Jiménez

Coarse-grain reconfigurable architectures (CGRAs) are gaining traction thanks to their performance and power efficiency. Utilizing CGRAs to accelerate the execution of tight loops holds great potential for achieving significant overall…

Hardware Architecture · Computer Science 2024-05-28 Elad Hadar , Yoav Etsion

Modern computing workloads, particularly in AI and edge applications, demand hardware-software co-design to meet aggressive performance and energy targets. Such co-design benefits from open and agile platforms that replace closed,…

Hardware Architecture · Computer Science 2025-08-27 Rohan Juneja , Pranav Dangi , Thilini Kaushalya Bandara , Zhaoying Li , Dhananjaya Wijerathne , Li-Shiuan Peh , Tulika Mitra

Convolutional Neural Networks (CNNs) have proven to be extremely accurate for image recognition, even outperforming human recognition capability. When deployed on battery-powered mobile devices, efficient computer architectures are required…

Hardware Architecture · Computer Science 2020-10-05 Mehdi Ahmadi , Shervin Vakili , J. M. Pierre Langlois

With the end of both Dennard's scaling and Moore's law, computer users and researchers are aggressively exploring alternative forms of computing in order to continue the performance scaling that we have come to enjoy. Among the more salient…

Hardware Architecture · Computer Science 2020-09-16 Artur Podobas , Kentaro Sano , Satoshi Matsuoka

Convolutional Neural Networks (CNNs) are fundamental to deep learning, driving applications across various domains. However, their growing complexity has significantly increased computational demands, necessitating efficient hardware…

Machine Learning · Computer Science 2025-05-21 Junye Jiang , Yaan Zhou , Yuanhao Gong , Haoxuan Yuan , Shuanglong Liu

With increasing diversity in Deep Neural Network(DNN) models in terms of layer shapes and sizes, the research community has been investigating flexible/reconfigurable accelerator substrates. This line of research has opened up two…

Hardware Architecture · Computer Science 2022-04-26 Ananda Samajdar , Michael Pellauer , Tushar Krishna

Federated learning (FL) enables collaborative model training among distributed devices without data sharing, but existing FL suffers from poor scalability because of global model synchronization. To address this issue, hierarchical…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-08-22 Tianyu Qi , Yufeng Zhan , Peng Li , Jingcai Guo , Yuanqing Xia

Transformers have revolutionized deep learning with applications in natural language processing, computer vision, and beyond. However, their computational demands make it challenging to deploy them on low-power edge devices. This paper…

Hardware Architecture · Computer Science 2025-07-18 Rohit Prasad

While there have been many studies on hardware acceleration for deep learning on images, there has been a rather limited focus on accelerating deep learning applications involving graphs. The unique characteristics of graphs, such as the…

Machine Learning · Computer Science 2021-11-12 Atefeh Sohrabizadeh , Yuze Chi , Jason Cong
‹ Prev 1 2 3 10 Next ›