Related papers: Fast LDPC GPU Decoder for Cloud RAN
This paper presents a partially parallel low-density parity-check (LDPC) decoder designed for the 5G New Radio (NR) standard. The design is using a multi-block parallel architecture with a flooding schedule. The decoder can support any code…
Emerging virtualized radio access networks (vRANs) demand flexible and efficient baseband processing across heterogeneous compute substrates. In this paper, we present DecodeX, a unified benchmarking framework for evaluating low-density…
Ultra-reliable low-latency vehicular communications (URLLC) require sufficient physical-layer (PHY) compute headroom at the network edge, where roadside units (RSUs) and compact next-generation base stations (gNBs) must meet strict timing…
This paper presents a GPU-accelerated decoder for quantum low-density parity-check (QLDPC) codes that achieves sub-$63$ $\mu$s latency, below the surface code decoder's real-time threshold demonstrated on Google's Willow quantum processor.…
Low-density parity check (LDPC) codes have been extensively applied in mobile communication systems due to their excellent error correcting capabilities. However, their broad adoption has been hindered by the high complexity of the LDPC…
The decoding throughput in the postprocessing is one of the bottlenecks for a continuous-variable quantum key distribution (CV-QKD) system. In this paper, we propose a layered decoder to decode quasi-cyclic multi-edge type LDPC (QC-METLDPC)…
In the next decade, the demands for computing in large scientific experiments are expected to grow tremendously. During the same time period, CPU performance increases will be limited. At the CERN Large Hadron Collider (LHC), these two…
With the use of belief propagation (BP) decoding algorithm, low-density parity-check (LDPC) codes can achieve near-Shannon limit performance. In order to evaluate the error performance of LDPC codes, simulators running on CPUs are commonly…
This paper presents a computationally efficient implementation of a Hamming code decoder on a graphics processing unit (GPU) to support real-time software-defined radio (SDR), which is a software alternative for realizing wireless…
Many research works have been performed on implementation of Vitrerbi decoding algorithm on GPU instead of FPGA because this platform provides considerable flexibility in addition to great performance. Recently, the recently-introduced…
FPGA-based hardware accelerators for convolutional neural networks (CNNs) have obtained great attentions due to their higher energy efficiency than GPUs. However, it is challenging for FPGA-based solutions to achieve a higher throughput…
Deep learning-based point cloud processing plays an important role in various vision tasks, such as autonomous driving, virtual reality (VR), and augmented reality (AR). The submanifold sparse convolutional network (SSCN) has been widely…
Elegant is an accelerator physics and particle-beam dynamics code widely used for modeling and design of a variety of high-energy particle accelerators and accelerator-based systems. In this paper we discuss a recently developed version of…
We present a versatile GPU-based parallel version of Logistic Regression (LR), aiming to address the increasing demand for faster algorithms in binary classification due to large data sets. Our implementation is a direct translation of the…
We consider generalized low-density parity-check (GLDPC) codes with component codes that are duals of Cordaro-Wagner codes. Two efficient decoding algorithms are proposed: one based on Hartmann-Rudolph processing, analogous to Sum-Product…
Parallel computing can offer an enormous advantage regarding the performance for very large applications in almost any field: scientific computing, computer vision, databases, data mining, and economics. GPUs are high performance many-core…
5G New Radio (NR) has stringent demands on both performance and complexity for the design of low-density parity-check (LDPC) decoding algorithms and corresponding VLSI implementations. Furthermore, decoders must fully support the wide range…
This work introduces a novel, fully differentiable linear-time complexity transformer decoder and a transformer decoder to correct 5G New Radio (NR) LDPC. We propose a scalable approach to decode linear block codes with $O(n)$ complexity…
Graph Neural Networks (GNNs) use a fully-connected layer to extract features from the nodes of a graph and aggregate these features using message passing between nodes, combining two distinct computational patterns: dense, regular…
The identification, and subsequent discovery, of fast radio transients through blind-search surveys requires a large amount of processing power, in worst cases scaling as $\mathcal{O}(N^3)$. For this reason, survey data are generally…