English
Related papers

Related papers: Software-Hardware Codesign for Efficient In-Memory…

200 papers

We present a hardware architecture that uses the Neural Engineering Framework (NEF) to implement large-scale neural networks on Field Programmable Gate Arrays (FPGAs) for performing pattern recognition in real time. NEF is a framework that…

Neural and Evolutionary Computing · Computer Science 2015-07-22 Runchun Wang , Chetan Singh Thakur , Tara Julia Hamilton , Jonathan Tapson , Andre van Schaik

Neural networks are increasingly used in real-time systems, such as automated driving applications. This requires high-performance hardware with predictable timing behavior. State-of-the-art real-time hardware is limited in memory and…

Hardware Architecture · Computer Science 2024-10-15 Maximilian Kirschner , Konstantin Dudzik , Jürgen Becker

Traditional Von Neumann computing is falling apart in the era of exploding data volumes as the overhead of data transfer becomes forbidding. Instead, it is more energy-efficient to fuse compute capability with memory where the data reside.…

Deep neural networks generate and process large volumes of data, posing challenges for low-resource embedded systems. In-memory computing has been demonstrated as an efficient computing infrastructure and shows promise for embedded AI…

Emerging Technologies · Computer Science 2025-07-03 Benjamin Chen Ming Choong , Tao Luo , Cheng Liu , Bingsheng He , Wei Zhang , Joey Tianyi Zhou

This paper presents a programmable in-memory-computing processor, demonstrated in a 65nm CMOS technology. For data-centric workloads, such as deep neural networks, data movement often dominates when implemented with today's computing…

Hardware Architecture · Computer Science 2020-09-17 Hongyang Jia , Yinqi Tang , Hossein Valavi , Jintao Zhang , Naveen Verma

This paper highlights new opportunities for designing large-scale machine learning systems as a consequence of blurring traditional boundaries that have allowed algorithm designers and application-level practitioners to stay -- for the most…

Machine Learning · Computer Science 2014-09-10 Suyog Gupta , Vikas Sindhwani , Kailash Gopalakrishnan

Kernel approximation is widely used to scale up kernel SVM training and prediction. However, the memory and computation costs of kernel approximation models are still too high if we want to deploy them on memory-limited devices such as…

Machine Learning · Computer Science 2020-10-07 Zijian Lei , Liang Lan

Regular expression matching is essential for many applications, such as finding patterns in text, exploring substrings in large DNA sequences, or lexical analysis. However, sequential regular expression matching may be time-prohibitive for…

Formal Languages and Automata Theory · Computer Science 2015-06-30 Suejb Memeti , Sabri Pllana

Neural architectures and hardware accelerators have been two driving forces for the progress in deep learning. Previous works typically attempt to optimize hardware given a fixed model architecture or model architecture given fixed…

As state of the art neural networks (NNs) continue to grow in size, their resource-efficient implementation becomes ever more important. In this paper, we introduce a compression scheme that reduces the number of computations required for…

Machine Learning · Computer Science 2025-04-25 Hans Rosenberger , Rodrigo Fischer , Johanna S. Fröhlich , Ali Bereyhi , Ralf R. Müller

In the recent past, the success of Neural Architecture Search (NAS) has enabled researchers to broadly explore the design space using learning-based methods. Apart from finding better neural network architectures, the idea of automation has…

Machine Learning · Computer Science 2019-11-04 Qing Lu , Weiwen Jiang , Xiaowei Xu , Yiyu Shi , Jingtong Hu

Weak-memory models are standard formal specifications of concurrency across hardware, programming languages, and distributed systems. A fundamental computational problem is consistency testing: is the observed execution of a concurrent…

Programming Languages · Computer Science 2023-11-16 Soham Chakraborty , Shankaranarayanan Krishna , Umang Mathur , Andreas Pavlogiannis

Computational complexity poses a significant challenge in wireless communication. Most existing attempts aim to reduce it through algorithm-specific approaches. However, the precision of computing, which directly relates to both computing…

Signal Processing · Electrical Eng. & Systems 2025-09-01 Kaixuan Bao , Wei Xu , Xiaohu You , Derrick Wing Kwan Ng

We present a novel architecture for sparse pattern processing, using flash storage with embedded accelerators. Sparse pattern processing on large data sets is the essence of applications such as document search, natural language processing,…

Hardware Architecture · Computer Science 2017-01-25 Sang-Woo Jun , Huy T. Nguyen , Vijay N. Gadepally , Arvind

Parallel programmers face the often irreconcilable goals of programmability and performance. HPC systems use distributed memory for scalability, thereby sacrificing the programmability advantages of shared memory programming models.…

Distributed, Parallel, and Cluster Computing · Computer Science 2013-01-21 Bharath Ramesh , Calvin J. Ribbens , Srinidhi Varadarajan

Hardware specialization is becoming a key enabler of energyefficient performance. Future systems will be increasingly heterogeneous, integrating multiple specialized and programmable accelerators, each with different memory demands.…

Hardware Architecture · Computer Science 2021-04-26 Johnathan Alsop , Weon Taek Na , Matthew D. Sinclair , Samuel Grayson , Sarita V. Adve

While general-purpose computing follows Von Neumann's architecture, the data movement between memory and processor elements dictates the processor's performance. The evolving compute-in-memory (CiM) paradigm tackles this issue by…

Hardware Architecture · Computer Science 2024-11-15 Dhandeep Challagundla , Ignatius Bezzam , Riadul Islam

Recent hardware acceleration advances have enabled powerful specialized accelerators for finite element computations, spiking neural network inference, and sparse tensor operations. However, existing approaches face fundamental limitations:…

Hardware Architecture · Computer Science 2026-01-09 Chuanzhen Wang , Leo Zhang , Eric Liu

High-order tensor decomposition has been widely adopted to obtain compact deep neural networks for edge deployment. However, existing studies focus primarily on its algorithmic advantages such as accuracy and compression ratio-while…

Hardware Architecture · Computer Science 2025-11-26 Jinsong Zhang , Minghe Li , Jiayi Tian , Jinming Lu , Zheng Zhang

Vector space models for symbolic processing that encode symbols by random vectors have been proposed in cognitive science and connectionist communities under the names Vector Symbolic Architecture (VSA), and, synonymously, Hyperdimensional…

Machine Learning · Computer Science 2021-09-09 E. Paxon Frady , Denis Kleyko , Christopher J. Kymn , Bruno A. Olshausen , Friedrich T. Sommer
‹ Prev 1 2 3 10 Next ›