Google distance between words
Computation and Language
2015-01-29 v2
Abstract
Cilibrasi and Vitanyi have demonstrated that it is possible to extract the meaning of words from the world-wide web. To achieve this, they rely on the number of webpages containing a given word that a Google search returns, and they associate this page count with the probability that the word appears on a webpage. Conditional probabilities then allow them to correlate one word with another word's meaning. Furthermore, they have developed a similarity distance function that gauges how closely related a pair of words is. We present a specific counterexample to the triangle inequality for this similarity distance function.
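To make the construction concrete, here is a minimal sketch of the similarity distance, assuming it is the normalized Google distance in the form published by Cilibrasi and Vitanyi: NGD(x, y) = (max(log f(x), log f(y)) - log f(x, y)) / (log N - min(log f(x), log f(y))), where f counts the pages containing the term(s) and N is the total number of indexed pages. The function names, the value of N, and all page counts below are hypothetical and chosen only for illustration; they are not the counterexample given in the paper.

```python
import math


def ngd(fx, fy, fxy, n):
    """Normalized Google distance from page counts.

    fx, fy: pages containing x and y respectively
    fxy:    pages containing both x and y
    n:      total number of indexed pages (assumed larger than fx and fy)
    """
    if min(fx, fy, fxy) <= 0:
        # No (co-)occurrence at all: treat the distance as infinite.
        return math.inf
    lx, ly, lxy = math.log(fx), math.log(fy), math.log(fxy)
    return (max(lx, ly) - lxy) / (math.log(n) - min(lx, ly))


def violates_triangle(d_xy, d_yz, d_xz):
    """True if d(x, z) > d(x, y) + d(y, z)."""
    return d_xz > d_xy + d_yz


# Hypothetical counts: x and z each appear on 1000 pages and share only
# one page, while y appears on 2000 pages that include every page
# containing x or z.
N = 10**6
d_xy = ngd(1000, 2000, 1000, N)
d_yz = ngd(2000, 1000, 1000, N)
d_xz = ngd(1000, 1000, 1, N)

print(d_xy, d_yz, d_xz)
print("triangle inequality violated:", violates_triangle(d_xy, d_yz, d_xz))
```

With these made-up counts, x and z are each close to y but far from each other, which is the kind of configuration in which the triangle inequality can fail.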
Cite
@article{arxiv.0901.4180,
title = {Google distance between words},
author = {Bjørn Kjos-Hanssen and Alberto J. Evangelista},
journal = {arXiv preprint arXiv:0901.4180},
year = {2015}
}
Comments
Presented at Frontiers in Undergraduate Research, University of Connecticut, 2006