Related papers: Non-Parametric Bayesian Areal Linguistics

200 papers

Kingman's coalescent is one of the most popular models in population genetics. It describes the genealogy of a population whose genetic composition evolves in time according to the Wright-Fisher model, or suitable approximations of it…

Methodology · Statistics 2018-04-20 Stefano Favaro, Shui Feng, Paul A. Jenkins

Is it possible to develop a 'physics of language' which can explain the spatial, temporal and social patterns we see, and which can predict future change like we forecast the weather? Such a theory is likely to involve ideas from…

Physics and Society · Physics 2025-12-22 James Burridge

Pretrained language models (PLMs) have become remarkably adept at task and language generalization. Nonetheless, they often fail when faced with unseen languages. In this work, we present LinguAlchemy, a regularization method that…

Computation and Language · Computer Science 2024-10-07 Muhammad Farid Adilazuarda, Samuel Cahyawijaya, Alham Fikri Aji, Genta Indra Winata, Ayu Purwarianti

In audio signal processing, probabilistic time-frequency models have many benefits over their non-probabilistic counterparts. They adapt to the incoming signal, quantify uncertainty, and measure correlation between the signal's amplitude…

Signal Processing · Electrical Eng. & Systems 2019-02-13 William J. Wilkinson, Michael Riis Andersen, Joshua D. Reiss, Dan Stowell, Arno Solin

In this article we propose a novel method to estimate the frequency distribution of linguistic variables while controlling for statistical non-independence due to shared ancestry. Unlike previous approaches, our technique uses all available…

Populations and Evolution · Quantitative Biology 2021-03-22 Gerhard Jäger, Johannes Wahle

In this thesis, we investigate three problems involving the probabilistic modeling of language: smoothing n-gram models, statistical grammar induction, and bilingual sentence alignment. These three problems employ models at three different…

cmp-lg · Computer Science 2008-02-03 Stanley F. Chen

This paper presents the beginnings of an automatic statistician, focusing on regression problems. Our system explores an open-ended space of statistical models to discover a good explanation of a data set, and then produces a detailed…

Machine Learning · Statistics 2014-04-25 James Robert Lloyd, David Duvenaud, Roger Grosse, Joshua B. Tenenbaum, Zoubin Ghahramani

A neural probabilistic language model (NPLM) provides an idea to achieve the better perplexity than n-gram language model and their smoothed language models. This paper investigates application area in bilingual NLP, specifically…

Computation and Language · Computer Science 2017-04-24 Tsuyoshi Okita

Quantifying the speed of linguistic change is challenging due to the fact that the historical evolution of languages is sparsely documented. Consequently, traditional methods rely on phylogenetic reconstruction. In this paper, we propose a…

Physics and Society · Physics 2025-01-29 Henri Kauhanen, Deepthi Gopal, Tobias Galla, Ricardo Bermúdez-Otero

We propose a simple, scalable, fully generative model for transition-based dependency parsing with high accuracy. The model, parameterized by Hierarchical Pitman-Yor Processes, overcomes the limitations of previous generative models by…

Computation and Language · Computer Science 2015-06-30 Jan Buys, Phil Blunsom

The statistical over-representation of phonological features in the basic vocabulary of languages is often interpreted as reflecting potentially universal sound symbolic patterns. However, most of those results have not been tested…

Computation and Language · Computer Science 2026-03-03 Frederic Blum

The distribution of human linguistic groups presents a number of interesting and non-trivial patterns. The distributions of the number of speakers per language and the area each group covers follow log-normal distributions, while population…

Populations and Evolution · Quantitative Biology 2015-12-16 Jose A. Capitan, Susanna Manrubia

In this paper we develop a functorial language of probabilistic morphisms and apply it to some basic problems in Bayesian nonparametrics. First we extend and unify the Kleisli category of probabilistic morphisms proposed by Lawvere and Giry…

Statistics Theory · Mathematics 2021-04-27 Jürgen Jost, Hông Vân Lê, Tat Dat Tran

A simple machine learning model of pluralisation as a linear regression problem minimising a p-adic metric substantially outperforms even the most robust of Euclidean-space regressors on languages in the Indo-European, Austronesian, Trans…

Computation and Language · Computer Science 2022-11-24 Gregory Baker, Diego Molla-Aliod

This paper proposes a nonparametric Bayesian method for exploratory data analysis and feature construction in continuous time series. Our method focuses on understanding shared features in a set of time series that exhibit significant…

Machine Learning · Statistics 2010-08-13 Suchi Saria, Daphne Koller, Anna Penn

Tree structured graphical models are powerful at expressing long range or hierarchical dependency among many variables, and have been widely applied in different areas of computer science and statistics. However, existing methods for…

Machine Learning · Statistics 2014-01-17 Le Song, Han Liu, Ankur Parikh, Eric Xing

Natural language processing often involves computations with semantic or syntactic graphs to facilitate sophisticated reasoning based on structural relationships. While convolution kernels provide a powerful tool for comparing graph…

Computation and Language · Computer Science 2018-02-13 Sahil Garg, Greg Ver Steeg, Aram Galstyan

Large pretrained language models are critical components of modern NLP pipelines. Yet, they suffer from spurious correlations, poor out-of-domain generalization, and biases. Inspired by recent progress in causal machine learning, in…

Computation and Language · Computer Science 2022-11-16 Maxime Peyrard, Sarvjeet Singh Ghotra, Martin Josifoski, Vidhan Agarwal, Barun Patra, Dean Carignan, Emre Kiciman, Robert West

This paper proposes methods of predicting dynamic time series (including non-stationary ones) based on a linguistic approach, namely, the study of occurrences and repetition of so-called N-grams. This approach is used in computational…

Numerical Analysis · Mathematics 2026-02-26 Dmytro Lande, Volodymyr Yuzefovych, Yevheniia Tsybulska

We introduce the Probabilistic Worldbuilding Model (PWM), a new fully-symbolic Bayesian model of semantic parsing and reasoning, as a first step in a research program toward more domain- and task-general NLU and AI. Humans create internal…

Computation and Language · Computer Science 2021-12-22 Abulhair Saparov, Tom M. Mitchell