What's in a Name?
Abstract
This paper describes experiments on identifying the language of a single name in isolation or in a document written in a different language. A new corpus has been compiled and made available, matching names against languages. This corpus is used in a series of experiments measuring the performance of general language models and names-only language models on the language identification task. Conclusions are drawn from the comparison between using general language models and names-only language models and between identifying the language of isolated names and the language of very short document fragments. Future research directions are outlined.
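As a rough illustration of the task, the sketch below identifies the language of an isolated name with one character-bigram model per language. This is a minimal sketch under stated assumptions, not the method used in the paper: the abstract does not specify the model type, so the bigram model, the add-one smoothing, the `BigramModel` and `identify` names, and the toy name lists are all hypothetical and are not drawn from the paper's corpus.

```python
import math
from collections import Counter


def char_bigrams(name):
    """Return the character bigrams of a name, padded with ^ (start) and $ (end)."""
    padded = "^" + name.lower() + "$"
    return [padded[i:i + 2] for i in range(len(padded) - 1)]


class BigramModel:
    """Character-bigram model trained on a list of names (illustrative only)."""

    def __init__(self, names):
        self.bigrams = Counter()   # counts of each two-character sequence
        self.contexts = Counter()  # counts of each first character
        self.alphabet = set()      # every symbol seen in training
        for name in names:
            for bg in char_bigrams(name):
                self.bigrams[bg] += 1
                self.contexts[bg[0]] += 1
                self.alphabet.update(bg)

    def log_prob(self, name, vocab_size):
        """Log-probability of a name under the model, with add-one smoothing."""
        total = 0.0
        for bg in char_bigrams(name):
            numerator = self.bigrams[bg] + 1
            denominator = self.contexts[bg[0]] + vocab_size
            total += math.log(numerator / denominator)
        return total


def identify(name, models):
    """Return the label of the model that assigns the name the highest score."""
    # Shared vocabulary: all training symbols plus any symbols in the query.
    vocab = set("^" + name.lower() + "$")
    for model in models.values():
        vocab |= model.alphabet
    return max(models, key=lambda lang: models[lang].log_prob(name, len(vocab)))


# Toy training data, invented for this example (an assumption, not the paper's corpus).
TOY_NAMES = {
    "greek": ["papadopoulos", "konstantinidis", "nikolaidis", "georgiou"],
    "german": ["schmidt", "schneider", "fischer", "schwarz"],
}
models = {lang: BigramModel(names) for lang, names in TOY_NAMES.items()}

if __name__ == "__main__":
    for query in ["papadakis", "schumacher"]:
        print(query, identify(query, models))
```

In this framing, a names-only model corresponds to training on name lists, as above, while a general language model would be trained on running text in each language; the comparison between the two is what the experiments measure.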
Cite
@article{arxiv.0710.1481,
title = {What's in a Name?},
author = {Stasinos Konstantopoulos},
journal = {arXiv preprint arXiv:0710.1481},
year = {2007}
}
Comments
Presented at the Computational Phonology Workshop, 6th Intl. Conf. Recent Advances in NLP, Borovets, Bulgaria, September 2007