Related papers: Google distance between words

200 papers

Words and phrases acquire meaning from the way they are used in society, from their relative semantics to other words and phrases. For computers the equivalent of 'society' is 'database,' and the equivalent of 'use' is 'way to search the…

Computation and Language · Computer Science 2007-06-13 Rudi Cilibrasi , Paul M. B. Vitanyi

First we consider pair-wise distances for literal objects consisting of finite binary files. These files are taken to contain all of their meaning, like genomes or books. The distances are based on compression of the objects concerned,…

Information Theory · Computer Science 2011-10-21 Paul M. B. Vitanyi

A set of ontology matching algorithms (for finding correspondences between concepts) is based on a thesaurus that provides the source data for the semantic distance calculations. In this wiki era, new resources may spring up and improve…

Information Retrieval · Computer Science 2009-10-12 A. A. Krizhanovsky , Feiyu Lin

We introduce the notion of the 'meaning bound' of a word with respect to another word by making use of the World-Wide Web as a conceptual environment for meaning. The meaning of a word with respect to another word is established by…

Artificial Intelligence · Computer Science 2012-03-28 Diederik Aerts

There is a great deal of work in cognitive psychology, linguistics, and computer science, about using word (or phrase) frequencies in context in text corpora to develop measures for word similarity or word association, going back to at…

Computation and Language · Computer Science 2009-05-26 Rudi L. Cilibrasi , Paul M. B. Vitanyi

The majority of Semantic Web search engines retrieve information by focusing on the use of concepts and relations restricted to the query provided by the user. By trying to guess the implicit meaning between these concepts and relations,…

Information Retrieval · Computer Science 2012-11-28 Manuel Rojas

We survey the emerging area of compression-based, parameter-free, similarity distance measures useful in data-mining, pattern recognition, learning and automatic semantics extraction. Given a family of distances on a set of objects, a…

Computer Vision and Pattern Recognition · Computer Science 2007-05-23 Rudi Cilibrasi , Paul Vitanyi

Most research related to unithood was conducted as part of a larger effort for the determination of termhood. Consequently, novelties are rare in this small sub-field of term extraction. In addition, existing work was mostly empirically…

Artificial Intelligence · Computer Science 2008-10-02 Wilson Wong , Wei Liu , Mohammed Bennamoun

This paper proposes a method for measuring semantic similarity between words as a new tool for text analysis. The similarity is measured on a semantic network constructed systematically from a subset of the English dictionary, LDOCE…

Computation and Language (cmp-lg) · Computer Science 2008-02-03 Hideki Kozima , Teiji Furugori

We survey a new area of parameter-free similarity distance measures useful in data-mining, pattern recognition, learning and automatic semantics extraction. Given a family of distances on a set of objects, a distance is universal up to a…

Information Retrieval · Computer Science 2007-05-23 Paul Vitanyi

Normalized Google distance (NGD) is a relative semantic distance based on the World Wide Web (or any other large electronic database, for instance Wikipedia) and a search engine that returns aggregate page counts. The earlier NGD between…

Information Retrieval · Computer Science 2013-08-15 Andrew R. Cohen , P. M. B. Vitanyi

We propose a new similarity measure between texts which, contrary to the current state-of-the-art approaches, takes a global view of the texts to be compared. We have implemented a tool to compute our textual distance and conducted…

Computation and Language · Computer Science 2014-05-15 Uli Fahrenberg , Fabrizio Biondi , Kevin Corre , Cyrille Jegourel , Simon Kongshøj , Axel Legay

There have been several efforts to extend distributional semantics beyond individual words, to measure the similarity of word pairs, phrases, and sentences (briefly, tuples; ordered sets of words, contiguous or noncontiguous). One way to…

Machine Learning · Computer Science 2013-10-21 Peter D. Turney

Normalized web distance (NWD) is a similarity or normalized semantic distance based on the World Wide Web or another large electronic database, for instance Wikipedia, and a search engine that returns reliable aggregate page counts. For…

Information Retrieval · Computer Science 2020-07-24 Andrew R. Cohen , Paul M. B. Vitanyi

Measuring the distance between concepts is an important field of study of Natural Language Processing, as it can be used to improve tasks related to the interpretation of those same concepts. WordNet, which includes a wide variety of…

Computation and Language · Computer Science 2018-04-30 Raquel Pérez-Arnal , Armand Vilalta , Dario Garcia-Gasulla , Ulises Cortés , Eduard Ayguadé , Jesus Labarta

Compositionality in language refers to how much the meaning of some phrase can be decomposed into the meaning of its constituents and the way these constituents are combined. Based on the premise that substitution by synonyms is…

Computation and Language · Computer Science 2017-03-13 Christina Lioma , Niels Dalum Hansen

Increasingly, critical decisions in public policy, governance, and business strategy rely on a deeper understanding of the needs and opinions of constituent members (e.g. citizens, shareholders). While it has become easier to collect a…

Computation and Language · Computer Science 2020-01-28 Saket Gurukar , Deepak Ajwani , Sourav Dutta , Juho Lauri , Srinivasan Parthasarathy , Alessandra Sala

Most work related to unithood was conducted as part of a larger effort for the determination of termhood. Consequently, the number of independent studies that examine the notion of unithood and produce dedicated techniques for measuring…

Artificial Intelligence · Computer Science 2008-10-02 Wilson Wong , Wei Liu , Mohammed Bennamoun

In many applications of natural language processing (NLP) it is necessary to determine the likelihood of a given word combination. For example, a speech recognizer may need to determine which of the two word combinations "eat a peach" and…

Computation and Language · Computer Science 2007-05-23 Ido Dagan , Lillian Lee , Fernando C. N. Pereira

The rapid growth of the web has resulted in a vast volume of information. Making information available to the user quickly is vital. The English language (or any language, for that matter) has a lot of ambiguity in the usage of words. So there is no…

Information Retrieval · Computer Science 2011-08-30 Jeevan H E , Prashanth P P , Punith Kumar S N , Vinay Hegde