Semantic Ordered Statistics Decoding

Information Theory 2026-05-05 v1 math.IT

Abstract

We propose a Semantic Ordered Statistics Decoder (sem-OSD), a soft decoder for short linear block codes carrying byte-streamed sources such as natural-language text. Sem-OSD injects a byte-level language-model (LM) prior into ordered statistics decoding (OSD) through a fused bit-level score that combines channel reliability with the LM prior, and uses this score for both most-reliable basis (MRB) selection and codeword candidate scoring. Sem-OSD enumerates two complementary test-error-pattern (TEP) families: a bit-flip family that flips up to $m$ bits, and an LM-driven family of up to $\omega$ byte substitutions that reaches error patterns the bit-flip family cannot. The LM prior is computed by a byte-level Transformer fine-tuned for byte-level denoising. Simulation results show that, on the AWGN channel, sem-OSD achieves block error rates (BLERs) below the finite-blocklength normal-approximation bound for uniform sources on both the binary BCH$(127,64)$ code and the shortened RS$(16,8)$ code over GF(256), outperforming Fossorier's OSD by 1.5 dB in coding gain. On a Gilbert--Elliott burst-error channel, sem-OSD provides 4 dB and 1 dB more coding gain than Berlekamp--Massey decoding and OSD, respectively.
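
As a rough illustration of the two ingredients named above, the following Python sketch fuses per-bit channel reliabilities with a prior and enumerates the bit-flip TEP family. It is not the authors' implementation: the abstract does not state the fusion rule, so the sketch assumes the standard additive combination of a channel log-likelihood ratio (LLR) and an a priori LLR, with positive values favoring bit 0. The function names `fused_llrs`, `reliability_order`, and `bit_flip_teps` are hypothetical, and the LM-driven byte-substitution family is not covered.

```python
from itertools import combinations


def fused_llrs(channel_llrs, prior_llrs):
    """Fuse per-bit channel LLRs with LM-prior LLRs.

    Assumption: additive fusion (posterior LLR = channel LLR + prior LLR),
    which holds when the prior is independent of the channel noise.
    """
    return [c + p for c, p in zip(channel_llrs, prior_llrs)]


def reliability_order(llrs):
    """Return bit indices sorted from most to least reliable (largest |LLR| first)."""
    return sorted(range(len(llrs)), key=lambda i: -abs(llrs[i]))


def bit_flip_teps(k, m):
    """Yield every pattern of at most m flipped positions among k basis bits.

    Patterns are tuples of flipped indices, in order of increasing weight,
    starting with the empty pattern (no flips).
    """
    for weight in range(m + 1):
        for positions in combinations(range(k), weight):
            yield positions


# Toy values: the prior changes which bits look most reliable.
channel = [2.0, -0.5, 0.25]
prior = [0.5, -1.0, -3.0]
fused = fused_llrs(channel, prior)
order = reliability_order(fused)
teps = list(bit_flip_teps(3, 1))
```

In this toy input the third bit is the least reliable from the channel alone, yet it becomes the most reliable once the prior is added, which is the kind of reordering that would influence MRB selection. The number of bit-flip patterns grows as the sum of binomial coefficients up to weight $m$, so $m$ is kept small in practice.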

Cite

@article{arxiv.2605.02296,
  title  = {Semantic Ordered Statistics Decoding},
  author = {Chentao Yue and Branka Vucetic and Yonghui Li},
  journal= {arXiv preprint arXiv:2605.02296},
  year   = {2026}
}

Comments

6 pages, submitted to IEEE Globecom for possible publication