Related papers: Remarks on "Random Sequences"

200 papers

We define a notion of randomness for individual formal languages and for collections of formal languages, based on automatic martingales acting on sequences of words from some underlying domain. An automatic martingale bets on whether the incoming word belongs to the…

Formal Languages and Automata Theory · Computer Science 2018-02-20 Birzhan Moldagaliyev

In this paper, we describe an approach to sentence categorization which has the originality to be based on natural properties of languages with no training set dependency. The implementation is fast, small, robust and textual errors…

cmp-lg · Computer Science 2016-08-31 Emmanuel Giguet

Traditional linguistic theories have largely regarded language as a formal system composed of rigid rules. However, their failures in processing real language, the recent successes in statistical natural language processing, and the findings…

Computation and Language · Computer Science 2020-12-02 Shuiyuan Yu, Chunshan Xu, Haitao Liu

The established language for statistical testing --- significance levels, power, and p-values --- is overly complicated and deceptively conclusive. Even teachers of statistics and scientists who use statistics misinterpret the results of…

Statistics Theory · Mathematics 2019-10-23 Glenn Shafer

Based on data from a large-scale experiment with human subjects, we conclude that the logarithm of probability to guess a word in context (unpredictability) depends linearly on the word length. This result holds both for poetry and prose,…

Information Theory · Computer Science 2007-07-16 Dmitrii Manin

Stanford typed dependencies are a widely desired representation of natural language sentences, but parsing is one of the major computational bottlenecks in text analysis systems. In light of the evolving definition of the Stanford…

Computation and Language · Computer Science 2014-04-17 Lingpeng Kong, Noah A. Smith

The words of a language are randomly replaced in time by new ones, but it has long been known that words corresponding to some items (meanings) are less frequently replaced than others. Usually, the rate of replacement for a given item is…

Computation and Language · Computer Science 2018-10-24 Michele Pasquini, Maurizio Serva

Lexical resemblances among a group of languages indicate that the languages could be genetically related, i.e., they could have descended from a common ancestral language. However, such resemblances can arise by chance and, hence, need not…

Computation and Language · Computer Science 2024-04-02 V. S. D. S. Mahesh Akavarapu, Arnab Bhattacharya

Some asymptotic notions for random variables are discussed. In particular, different versions of O and o for sequences of random variables are studied. The results are elementary and more or less well-known, but collected here for future…

Probability · Mathematics 2011-08-22 Svante Janson

This paper presents a comparison of the quality of randomness of D sequences based on diehard tests. Since D sequences can model any random sequence, this comparison is of value beyond this specific class.

Numerical Analysis · Computer Science 2013-12-13 James Bellamy

In this paper we propose several variants to perform the independence test between two random elements based on recurrence rates. We will show how to calculate the test statistic in each one of these cases. From simulations we obtain that…

Methodology · Statistics 2020-09-21 Juan Kalemkerian, Diego Fernández

Are pairs of words that tend to occur together also likely to stand in a linguistic dependency? This empirical question is motivated by a long history of literature in cognitive science, psycholinguistics, and NLP. In this work we…

Computation and Language · Computer Science 2022-05-02 Jacob Louis Hoover, Alessandro Sordoni, Wenyu Du, Timothy J. O'Donnell

We analyze the frequency-rank relationship in sub-vocabularies corresponding to three different grammatical classes (nouns, verbs, and others) in a collection of literary works in English, whose words have been automatically tagged…

Computation and Language · Computer Science 2021-06-11 A. Chacoma, D. H. Zanette

The statistical properties of letter frequencies in European literary texts are investigated. The determination of the logarithmic dependence of letter sequences for one-language and two-language texts is examined. The pair of languages is…

Transformer models are now a cornerstone in natural language processing. Yet, explaining their decisions remains a challenge. It was shown recently that the same model trained on the same data with a different randomness can lead to very…

Computation and Language · Computer Science 2026-03-10 Romain Loncour, Jérémie Bogaert, François-Xavier Standaert

We present a theoretical and empirical investigation of the statistical behaviour of the words in a text produced by human language. To this aim, we analyse the word distribution of various texts of Italian language selected from a specific…

Neurons and Cognition · Quantitative Biology 2025-04-15 Diederik Aerts, Jonito Aerts Arguëlles, Lester Beltran, Massimiliano Sassoli de Bianchi, Sandro Sozzo

The paper presents a language model that develops syntactic structure and uses it to extract meaningful information from the word history, thus enabling the use of long distance dependencies. The model assigns probability to every joint…

Computation and Language · Computer Science 2007-05-23 Ciprian Chelba

We consider the conditional randomization test as a way to account for covariate imbalance in randomized experiments. The test accounts for covariate imbalance by comparing the observed test statistic to the null distribution of the test…

Language change is a cultural evolutionary process in which variants of linguistic variables change in frequency through processes analogous to mutation, selection and genetic drift. In this work, we apply a recently-introduced method to…

Computation and Language · Computer Science 2023-08-22 Juan Guerrero Montero, Andres Karjus, Kenny Smith, Richard A. Blythe

In this paper, new test statistics are introduced and studied for the important problem of testing hypotheses that involve inequality constraints on proportions when the sample comes from independent binomial random variables: Wald type and…

Methodology · Statistics 2014-02-28 Nirian Martín, Raquel Mata, Leando Pardo