Related papers: Accelerating Viterbi Algorithm using Custom Instru…

200 papers

This paper presents a novel, non-standard set of vector instruction types for exploring custom SIMD instructions in a softcore. The new types allow simultaneous access to a relatively high number of operands, reducing the instruction count…

Hardware Architecture · Computer Science 2021-06-15 Philippos Papaphilippou, Paul H. J. Kelly, Wayne Luk

Specialized Deep Learning (DL) acceleration stacks, designed for a specific set of frameworks, model architectures, operators, and data types, offer the allure of high performance while sacrificing flexibility. Changes in algorithms,…

The Viterbi algorithm is a key operator for structured sequence inference in modern data systems, with applications in trajectory analysis, online recommendation, and speech recognition. As these workloads increasingly migrate to…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-10-24 Ziheng Deng, Xue Liu, Jiantong Jiang, Yankai Li, Qingxu Deng, Xiaochun Yang

Recent advancements in quantization and mixed-precision approaches offer substantial opportunities to improve the speed and energy efficiency of Neural Networks (NN). Research has shown that individual parameters with varying low…

Hardware Architecture · Computer Science 2024-08-14 Giorgos Armeniakos, Alexis Maras, Sotirios Xydis, Dimitrios Soudris

The enhanced efficiency of hardware accelerators, including Single Instruction Multiple Data (SIMD) architectures and Coarse-Grained Reconfigurable Architectures (CGRAs), is driving significant advancements in Artificial Intelligence and…

Hardware Architecture · Computer Science 2025-04-29 Yu Yang, Jordi Altayó González, Paul Delestrac, Ahmed Hemani

For years, the open-source RISC-V instruction set has been driving innovation in processor design, spanning from high-end cores to low-cost or low-power cores. After a decade of evolution, RISC architectures are now as mature as the CISC…

Hardware Architecture · Computer Science 2024-06-24 Juliette Pottier, Thomas Nieddu, Bertrand Le Gal, Sébastien Pillement, Maria Méndez Real

The most famous error-decoding algorithm for convolutional codes is the Viterbi algorithm. In this paper, we present a new reduced complexity version of this algorithm which can be applied to a class of binary convolutional codes with…

Information Theory · Computer Science 2024-07-26 Zita Abreu, Julia Lieb, Michael Schaller

A novel adaptive binary decoding algorithm for LDPC codes is proposed, which reduces the decoding complexity while having a comparable or even better performance than corresponding non-adaptive alternatives. In each iteration the variable…

Information Theory · Computer Science 2009-04-24 Ingmar Land, Gottfried Lechner, Lars K. Rasmussen

This paper describes a parallel implementation of the Viterbi decoding algorithm. The Viterbi decoder is widely used in many state-of-the-art wireless systems. The proposed solution optimizes both throughput and memory usage by applying…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-11-19 Alireza Mohammadidoost, Matin Hashemi

The rise of hardware accelerators with custom instructions necessitates custom compiler backends supporting these accelerators. This study provides detailed analyses of LLVM and its RISC-V backend, supplemented with case studies providing…

Hardware Architecture · Computer Science 2023-10-31 Eymen Ünay, Bora İnan, Emrecan Yiğit

Processors with extensible instruction sets are often used today as programmable hardware accelerators for various domains. When extending RISC-V and other similar extensible processor architectures, the task of designing specialized…

Hardware Architecture · Computer Science 2024-01-02 Peter Sovietov

Using deep neural networks to decode error control codes runs into two problems: the high precision that error control codes require, and the complexity of the neural network caused by long codes. In this paper, a…

Signal Processing · Electrical Eng. & Systems 2019-01-01 Jiang Xiaobo, Zhang Fang, Zeng Zhen

While Transformers are dominated by Floating-Point (FP) Matrix-Multiplications, their aggressive acceleration through dedicated hardware or many-core programmable systems has shifted the performance bottleneck to non-linear functions like…

Hardware Architecture · Computer Science 2025-04-16 Run Wang, Gamze Islamoglu, Andrea Belano, Viviane Potocnik, Francesco Conti, Angelo Garofalo, Luca Benini

This paper presents an automated approach for designing processors that support a subset of the RISC-V instruction set architecture (ISA) for a new class of applications at Extreme Edge. The electronics used in extreme edge applications…

Hardware Architecture · Computer Science 2025-10-29 Alireza Raisiardali, Konstantinos Iordanou, Jedrzej Kufel, Kowshik Gudimetla, Kris Myny, Emre Ozer

In order to meet the requirement of high data rates for the next generation wireless systems, the efficient implementation of receiver algorithms is essential. On the other hand, the rapid development of technology motivates the…

Hardware Architecture · Computer Science 2015-01-20 Shahriar Shahabuddin, Janne Janhunen, Markku Juntti

Procedural planning aims to predict a sequence of actions that transforms an initial visual state into a desired goal, a fundamental ability for intelligent agents operating in complex environments. Existing approaches typically rely on…

Computer Vision and Pattern Recognition · Computer Science 2026-03-05 Luigi Seminara, Davide Moltisanti, Antonino Furnari

Integrating cryptographic accelerators into modern CPU architectures presents unique microarchitectural challenges, particularly when extending instruction sets with complex and multistage operations. Hardware-assisted cryptographic…

Hardware Architecture · Computer Science 2025-08-29 Alperen Bolat, Sakir Sezer, Kieran McLaughlin, Henry Hui

We present a quantum Viterbi algorithm (QVA) with better than classical performance under certain conditions. In this paper the proposed algorithm is applied to decoding classical convolutional codes, for instance; large constraint length…

Quantum Physics · Physics 2015-06-23 Jon R. Grice, David A. Meyer

This report makes the case that a well-designed Reduced Instruction Set Computer (RISC) can match, and even exceed, the performance and code density of existing commercial Complex Instruction Set Computers (CISC) while maintaining the…

Hardware Architecture · Computer Science 2016-07-11 Christopher Celio, Palmer Dabbelt, David A. Patterson, Krste Asanović

The development of personalized recommendation has significantly improved the accuracy of information matching and the revenue of e-commerce platforms. Recently, two trends have emerged: 1) recommender systems must be trained in a timely manner to cope with…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-04-19 Yuanxing Zhang, Langshi Chen, Siran Yang, Man Yuan, Huimin Yi, Jie Zhang, Jiamang Wang, Jianbo Dong, Yunlong Xu, Yue Song, Yong Li, Di Zhang, Wei Lin, Lin Qu, Bo Zheng