Language Models Without a Trainable Input Embedding Table: Learning from Fixed Minimal Binary Token Codes

Computation and Language 2026-05-12 v1

Abstract

Trainable input embedding tables are a standard component of modern language models. We ask whether they are actually necessary at the input interface. For a vocabulary of size $V$, exact token identity requires only $K=\lceil \log_2 V\rceil$ bits. We replace the usual trainable $V\times d_{\text{model}}$ input embedding matrix with fixed minimal binary token codes and a zero-parameter lift to model width. In our main setting, $V=65{,}536$, so $K=16$, and tokens are represented by fixed 16-dimensional binary codes tiled to $d_{\text{model}}=1024$. We also evaluate a fully table-free variant in which codes are generated from token IDs on the fly and randomly recoded by an invertible affine transform over $\mathbb{F}_2^K$. Across matched 32-layer decoder-only models trained on approximately 17B tokens and evaluated over three independent training seeds, fixed minimal codes achieve comparable held-out validation perplexity to a standard learned-input baseline while removing 67.1M trainable input parameters. The fixed-code runs have a lower mean validation perplexity in our experiments, 2.36 versus 2.44, but the observed gap is within the measured seed-to-seed variation of 4.8%; we therefore interpret the result as evidence that the trainable input table is not necessary, rather than as a statistically resolved superiority claim. The table-free affine-recoded variant remains close at 2.39 despite a slightly shorter training run. These results show that, in this regime, a trainable input embedding table is not necessary for useful language modeling. The output projection remains standard and trainable.
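The encoding described in the abstract can be illustrated with a minimal pure-Python sketch. It is not the author's implementation. The function names, the least-significant-bit-first order, the use of raw {0, 1} values without scaling, tiling by plain repetition, and the rejection sampling of the invertible matrix are all assumptions made for illustration; the abstract does not specify these details.

```python
import random


def min_code_bits(vocab_size: int) -> int:
    """Return K = ceil(log2(V)), the fewest bits that distinguish V tokens."""
    return (vocab_size - 1).bit_length()


def token_bits(token_id: int, k: int) -> list[int]:
    """K-bit binary code of a token ID, least significant bit first (assumed order)."""
    return [(token_id >> i) & 1 for i in range(k)]


def tile_to_width(bits: list[int], d_model: int) -> list[int]:
    """Zero-parameter lift: repeat the K-bit code until it spans d_model entries."""
    if d_model % len(bits) != 0:
        raise ValueError("this sketch assumes K divides d_model")
    return bits * (d_model // len(bits))


def rank_f2(rows: list[int]) -> int:
    """Rank over F_2 of a matrix whose rows are stored as integer bitmasks."""
    basis: list[int] = []
    for r in rows:
        for b in basis:
            if r ^ b < r:  # r has b's leading bit set, so cancel it
                r ^= b
        if r:
            basis.append(r)
    return len(basis)


def random_invertible_matrix(k: int, rng: random.Random) -> list[int]:
    """Rejection-sample a K x K matrix over F_2 until it has full rank."""
    while True:
        rows = [rng.getrandbits(k) for _ in range(k)]
        if rank_f2(rows) == k:
            return rows


def affine_recode(token_id: int, rows: list[int], offset: int) -> int:
    """Map x to A x + b over F_2^K, with x, b, and the result packed into integers."""
    y = 0
    for i, row in enumerate(rows):
        parity = bin(row & token_id).count("1") & 1  # dot product mod 2
        y |= parity << i
    return y ^ offset
```

With $V=65{,}536$, `min_code_bits` gives $K=16$, and `tile_to_width(token_bits(t, 16), 1024)` repeats each code 64 times. Because the matrix is invertible, `affine_recode` permutes the token IDs, so every token keeps a distinct code. The table being replaced would hold $65{,}536 \times 1024 = 67{,}108{,}864$ entries, which matches the 67.1M figure quoted above.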

Cite

@article{arxiv.2605.09751,
  title  = {Language Models Without a Trainable Input Embedding Table: Learning from Fixed Minimal Binary Token Codes},
  author = {A. Bochkov},
  journal= {arXiv preprint arXiv:2605.09751},
  year   = {2026}
}