We describe a statistical model over linguistic areas and phylogeny. Our model recovers known areas and identifies a plausible hierarchy of areal features. The use of areas improves genetic reconstruction of languages both qualitatively and quantitatively according to a variety of metrics. We model linguistic areas by a Pitman-Yor process and linguistic phylogeny by Kingman's coalescent.
Cite
@article{arxiv.0906.5114,
title = {Non-Parametric Bayesian Areal Linguistics},
author = {Daumé III, Hal},
journal = {arXiv preprint arXiv:0906.5114},
year = {2009}
}