
On the Entropy of Written Spanish

Computation and Language 2013-01-15 v1 Information Theory math.IT

Abstract

This paper reports results on the entropy of the Spanish language. The analysis treats natural language as sequences of n-word symbols (n = 1 to 18), trigrams, digrams, and characters. It draws on twelve literary works in Spanish and a 279,917-word news file provided by the Spanish press agency EFE. Entropy values are calculated by a direct method, using computer processing and the law of large numbers. Three samples of artificial Spanish produced by a first-order model software source are also analyzed and compared with natural Spanish.
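The abstract does not spell out the direct method. As a minimal sketch, assuming it amounts to replacing symbol probabilities with relative frequencies (which the law of large numbers justifies for large samples), the entropy of n-symbol blocks could be estimated as below. The function name `ngram_entropy` and the toy inputs are illustrative assumptions, not taken from the paper.

```python
import math
from collections import Counter


def ngram_entropy(tokens, n=1):
    """Plug-in entropy estimate in bits per n-gram symbol.

    Assumption: probabilities are approximated by relative frequencies
    of overlapping n-grams; this is an illustration, not the paper's code.
    """
    # Slide a window of length n over the token sequence.
    grams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    if not grams:
        raise ValueError("need at least n tokens")
    counts = Counter(grams)
    total = len(grams)
    # H = -sum p * log2(p), with p estimated as count / total.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Toy inputs (hypothetical): words for n-word symbols, characters for digrams.
words = "el perro y el gato".split()
h_words = ngram_entropy(words, n=1)        # entropy per single word
h_chars = ngram_entropy(list("abab"), n=1)  # entropy per character
h_digrams = ngram_entropy(list("abab"), n=2)  # entropy per character digram
```

The same function covers words and characters because it only needs a sequence of hashable tokens. On short texts the estimate is biased low for large n, since most n-grams then occur only once.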

Cite

@article{arxiv.0901.4784,
  title  = {On the Entropy of Written Spanish},
  author = {Fabio G. Guerrero},
  journal= {arXiv preprint arXiv:0901.4784},
  year   = {2013}
}

Comments

Submitted to the IEEE Transactions on Information Theory
