Related papers: Joint Hardware-Workload Co-Optimization for In-Mem…

200 papers

Designing generalized in-memory computing (IMC) hardware that efficiently supports a variety of workloads requires extensive design space exploration, which is infeasible to perform manually. Optimizing hardware individually for each…

Hardware Architecture · Computer Science 2025-02-04 Olga Krestinskaya, Mohammed E. Fouda, Ahmed Eltawil, Khaled N. Salama

Neural architectures and hardware accelerators have been two driving forces for the progress in deep learning. Previous works typically attempt to optimize hardware given a fixed model architecture or model architecture given fixed…

In-memory computing hardware accelerators allow more than 10x improvements in peak efficiency and performance for matrix-vector multiplications (MVM) compared to conventional digital designs. For this, they have gained great interest for…

Hardware Architecture · Computer Science 2024-09-19 Pouya Houshmand, Marian Verhelst

The need to efficiently execute different Deep Neural Networks (DNNs) on the same computing platform, coupled with the requirement for easy scalability, makes Multi-Chip Module (MCM)-based accelerators a preferred design choice. Such an…

Hardware Architecture · Computer Science 2024-08-26 Abhijit Das, Enrico Russo, Maurizio Palesi

An increasing number of applications are exploiting sampling-based algorithms for planning, optimization, and inference. The Markov Chain Monte Carlo (MCMC) algorithms form the computational backbone of this emerging branch of machine…

Machine Learning · Computer Science 2025-07-18 Shirui Zhao, Jun Yin, Lingyun Yao, Martin Andraud, Wannes Meert, Marian Verhelst

In recent years, various computing-in-memory (CIM) processors have been presented, showing superior performance over traditional architectures. To unleash the potential of various CIM architectures, such as device precision, crossbar size,…

Hardware Architecture · Computer Science 2024-05-09 Songyun Qu, Shixin Zhao, Bing Li, Yintao He, Xuyi Cai, Lei Zhang, Ying Wang

The use of deep learning has grown at an exponential rate, giving rise to numerous specialized hardware and software systems for deep learning. Because the design space of deep learning software stacks and hardware accelerators is diverse…

Machine Learning · Computer Science 2020-10-06 Zhan Shi, Chirag Sakhuja, Milad Hashemi, Kevin Swersky, Calvin Lin

In view of the performance limitations of fully-decoupled designs for neural architectures and accelerators, hardware-software co-design has been emerging to fully reap the benefits of flexible design spaces and optimize neural network…

Hardware Architecture · Computer Science 2022-03-29 Bingqian Lu, Zheyu Yan, Yiyu Shi, Shaolei Ren

Deployment of dynamic neural networks on edge accelerators requires careful consideration of hardware constraints beyond conventional complexity metrics such as Multiply-Accumulate operations. In Early-Exiting Neural Networks (EENN), exit…

Computational Complexity · Computer Science 2026-04-01 Alaa Zniber, Arne Symons, Ouassim Karrakchou, Marian Verhelst, Mounir Ghogho

Modular end-to-end (ME2E) autonomous driving paradigms combine modular interpretability with global optimization capability and have demonstrated strong performance. However, existing studies mainly focus on accuracy improvement, while…

Artificial Intelligence · Computer Science 2026-01-13 Chengzhi Ji, Xingfeng Li, Zhaodong Lv, Hao Sun, Pan Liu, Hao Frank Yang, Ziyuan Pu

Optimizing the quality of result (QoR) and the quality of service (QoS) of AI-empowered autonomous systems simultaneously is very challenging. First, there are multiple input sources, e.g., multi-modal data from different sensors, requiring…

Artificial Intelligence · Computer Science 2021-04-12 Cong Hao, Deming Chen

Solid-state storage architectures based on NAND or emerging memory devices (SSD) are fundamentally architected and optimized for both reliability and performance. Achieving these simultaneous goals requires co-design of memory components…

Hardware Architecture · Computer Science 2026-03-20 Jay Sarkar, Vamsi Pavan Rayaprolu, Abhijeet Bhalerao

Spiking Neural Networks (SNNs) are bio-plausible models that hold great potential for realizing energy-efficient implementations of sequential tasks on resource-constrained edge devices. However, commercial edge platforms based on standard…

Neural and Evolutionary Computing · Computer Science 2023-09-26 Marco Paul E. Apolinario, Adarsh Kumar Kosta, Utkarsh Saxena, Kaushik Roy

Recent research efforts focus on reducing the computational and memory overheads of Large Language Models (LLMs) to make them feasible on resource-constrained devices. Despite advancements in compression techniques, non-linear operators…

Hardware Architecture · Computer Science 2024-11-28 Mariam Rakka, Jinhao Li, Guohao Dai, Ahmed Eltawil, Mohammed E. Fouda, Fadi Kurdahi

Hyperdimensional computing (HDC), utilizing a parallel computing paradigm and efficient learning algorithm, is well-suited for resource-constrained artificial intelligence (AI) applications, such as in edge devices. In-memory computing…

Emerging Technologies · Computer Science 2025-12-25 Yi Huang, Alireza Jaberi Rad, Qiangfei Xia

To maximize hardware efficiency and performance accuracy in Compute-In-Memory (CIM)-based neural network accelerators for Artificial Intelligence (AI) applications, co-optimizing both software and hardware design parameters is essential.…

Artificial Intelligence · Computer Science 2025-10-01 Olga Krestinskaya, Mohammed E. Fouda, Ahmed Eltawil, Khaled N. Salama

Emerging multi-model workloads with heavy models like recent large language models significantly increased the compute and memory demands on hardware. To address such increasing demands, designing a scalable hardware architecture became a…

Hardware Architecture · Computer Science 2024-09-17 Mohanad Odema, Luke Chen, Hyoukjun Kwon, Mohammad Abdullah Al Faruque

The rapid deployment of machine learning across platforms from milliwatt-class TinyML devices to large language models has made energy efficiency a primary constraint for sustainable AI. Across these scales, performance and energy are…

Hardware Architecture · Computer Science 2026-03-26 Mohammad Saleh Vahdatpour, Yanqing Zhang

In recent years, processing in memory (PIM) based mixed-signal designs have been proposed as energy- and area-efficient solutions with ultra high throughput to accelerate DNN computations. However, PIM designs are sensitive to imperfections…

Hardware Architecture · Computer Science 2022-08-31 Payman Behnam, Uday Kamal, Saibal Mukhopadhyay

Increasing AI computing demands and slowing transistor scaling have led to the advent of Multi-Chip-Module (MCMs) based accelerators. MCMs enable cost-effective scalability, higher yield, and modular reuse by partitioning large chips into…

Hardware Architecture · Computer Science 2025-05-06 Ritik Raj, Shengjie Lin, William Won, Tushar Krishna