Google distance between words

Computation and Language 2015-01-29 v2

Abstract

Cilibrasi and Vitányi have demonstrated that the meaning of words can be extracted from the World Wide Web. Their method uses the number of web pages that a Google search returns for a given word, and treats that page count as an estimate of the probability that the word appears on a web page. Conditional probabilities then allow them to relate one word to the meaning of another. They have also developed a similarity distance function that gauges how closely related a pair of words is. We present a specific counterexample to the triangle inequality for this similarity distance function.
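To make the triangle-inequality question concrete, the following is a minimal sketch in Python. It assumes the usual form of Cilibrasi and Vitányi's normalized Google distance, NGD(x, y) = (max{log f(x), log f(y)} - log f(x, y)) / (log N - min{log f(x), log f(y)}), where f(x) is the page count for x, f(x, y) is the count of pages containing both terms, and N is the total number of indexed pages. The page counts below are hypothetical numbers chosen for illustration; they are not real search results and are not the counterexample given in the paper.

```python
import math


def ngd(fx, fy, fxy, n):
    """Normalized Google distance from page counts.

    fx, fy: pages containing each term; fxy: pages containing both;
    n: assumed total number of indexed pages. All counts must be positive.
    """
    lx, ly, lxy = math.log(fx), math.log(fy), math.log(fxy)
    return (max(lx, ly) - lxy) / (math.log(n) - min(lx, ly))


# Hypothetical, set-consistent counts: every page containing x or z
# also contains y, and x and z co-occur on a single page.
N = 10_000_000
f_x, f_y, f_z = 1000, 2000, 1000
f_xy, f_yz, f_xz = 1000, 1000, 1

d_xy = ngd(f_x, f_y, f_xy, N)
d_yz = ngd(f_y, f_z, f_yz, N)
d_xz = ngd(f_x, f_z, f_xz, N)

# The triangle inequality would require d_xz <= d_xy + d_yz.
violates_triangle = d_xz > d_xy + d_yz
print(f"d(x,y)={d_xy:.4f}  d(y,z)={d_yz:.4f}  d(x,z)={d_xz:.4f}")
print("triangle inequality violated:", violates_triangle)
```

With these assumed counts, x and z are each close to y but far from each other, so the direct distance exceeds the sum of the two legs. The ratio is independent of the logarithm base, so natural logarithms are used for convenience.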

Cite

@article{arxiv.0901.4180,
  title  = {Google distance between words},
  author = {Bjørn Kjos-Hanssen and Alberto J. Evangelista},
  journal= {arXiv preprint arXiv:0901.4180},
  year   = {2015}
}

Comments

Presented at Frontiers in Undergraduate Research, University of Connecticut, 2006
