Related papers: Unsupervised Language Acquisition

A procedure for unsupervised lexicon learning

We describe an incremental unsupervised procedure to learn words from transcribed continuous speech. The algorithm is based on a conservative and traditional statistical model, and results of empirical tests show that it is competitive with…

Computation and Language · Computer Science 2007-05-23 Anand Venkataraman

The probabilistic analysis of language acquisition: Theoretical, computational, and experimental analysis

There is much debate over the degree to which language learning is governed by innate language-specific biases, or acquired through cognition-general principles. Here we examine the probabilistic language acquisition hypothesis on three…

Computation and Language · Computer Science 2010-06-17 Anne S. Hsu, Nick Chater, Paul M. B. Vitanyi

Unsupervised Language Acquisition: Theory and Practice

In this thesis I present various algorithms for the unsupervised machine learning of aspects of natural languages using a variety of statistical models. The scientific object of the work is to examine the validity of the so-called Argument…

Computation and Language · Computer Science 2007-05-23 Alexander Clark

One Model for the Learning of Language

A major target of linguistics and cognitive science has been to understand what class of learning systems can acquire the key structures of natural language. Until recently, the computational requirements of language have been used to argue…

Artificial Intelligence · Computer Science 2022-01-27 Yuan Yang

The Unsupervised Acquisition of a Lexicon from Continuous Speech

We present an unsupervised learning algorithm that acquires a natural-language lexicon from raw speech. The algorithm is based on the optimal encoding of symbol sequences in an MDL framework, and uses a hierarchical representation of…

cmp-lg · Computer Science 2008-02-03 Carl de Marcken

Unsupervised Acquisition of Discrete Grammatical Categories

This article presents experiments performed using a computational laboratory environment for language acquisition experiments. It implements a multi-agent system consisting of two agents: an adult language model and a daughter language…

Computation and Language · Computer Science 2025-12-16 David Ph. Shakouri, Crit Cremers, Niels O. Schiller

Computational modeling of early language learning from acoustic speech and audiovisual input without linguistic priors

Learning to understand speech appears almost effortless for typically developing infants, yet from an information-processing perspective, acquiring a language from acoustic speech is an enormous challenge. This chapter reviews recent…

Computation and Language · Computer Science 2026-03-12 Okko Räsänen

Learning to Learn Words from Visual Scenes

Language acquisition is the process of learning words from the surrounding scene. We introduce a meta-learning framework that learns how to learn word representations from unconstrained scenes. We leverage the natural compositional…

Computation and Language · Computer Science 2020-07-14 Dídac Surís, Dave Epstein, Heng Ji, Shih-Fu Chang, Carl Vondrick

Human Inspired Progressive Alignment and Comparative Learning for Grounded Word Acquisition

Human language acquisition is an efficient, supervised, and continual process. In this work, we took inspiration from how human babies acquire their first language, and developed a computational process for word acquisition through…

Computation and Language · Computer Science 2024-09-20 Yuwei Bao, Barrett Martin Lattimer, Joyce Chai

On the difficulty of a distributional semantics of spoken language

In the domain of unsupervised learning most work on speech has focused on discovering low-level constructs such as phoneme inventories or word-like units. In contrast, for written language, where there is a large body of work on…

Computation and Language · Computer Science 2018-10-29 Grzegorz Chrupała, Lieke Gelderloos, Ákos Kádár, Afra Alishahi

Word learning and the acquisition of syntactic--semantic overhypotheses

Children learning their first language face multiple problems of induction: how to learn the meanings of words, and how to build meaningful phrases from those words according to syntactic rules. We consider how children might solve these…

Computation and Language · Computer Science 2018-05-15 Jon Gauthier, Roger Levy, Joshua B. Tenenbaum

Review of Unsupervised POS Tagging and Its Implications on Language Acquisition

An ability that underlies human syntactic knowledge is determining which words can appear in the similar structures (i.e. grouping words by their syntactic categories). These groupings enable humans to combine structures in order to…

Computation and Language · Computer Science 2023-12-19 Niels Dickson

Unsupervised Learning of Word-Category Guessing Rules

Words unknown to the lexicon present a substantial problem to part-of-speech tagging. In this paper we present a technique for fully unsupervised statistical acquisition of rules which guess possible parts-of-speech for unknown words. Three…

cmp-lg · Computer Science 2008-02-03 Andrei Mikheev

Acquiring a Lexicon from Unsegmented Speech

We present work-in-progress on the machine acquisition of a lexicon from sentences that are each an unsegmented phone sequence paired with a primitive representation of meaning. A simple exploratory algorithm is described, along with the…

cmp-lg · Computer Science 2008-02-03 Carl de Marcken

Simulated Language Acquisition in a Biologically Realistic Model of the Brain

Despite tremendous progress in neuroscience, we do not have a compelling narrative for the precise way whereby the spiking of neurons in our brain results in high-level cognitive phenomena such as planning and language. We introduce a…

Neural and Evolutionary Computing · Computer Science 2025-07-17 Daniel Mitropolsky, Christos Papadimitriou

Learning Part-of-Speech Guessing Rules from Lexicon: Extension to Non-Concatenative Operations

One of the problems in part-of-speech tagging of real-word texts is that of unknown to the lexicon words. In Mikheev (ACL-96 cmp-lg/9604022), a technique for fully unsupervised statistical acquisition of rules which guess possible…

cmp-lg · Computer Science 2008-02-03 Andrei Mikheev

Universal linguistic inductive biases via meta-learning

How do learners acquire languages from the limited data available to them? This process must involve some inductive biases - factors that affect how a learner generalizes - but it is unclear which inductive biases can explain observed…

Computation and Language · Computer Science 2020-07-02 R. Thomas McCoy, Erin Grant, Paul Smolensky, Thomas L. Griffiths, Tal Linzen

A Theory of Language Learning

A theory of language learning is described, which uses Bayesian induction of feature structures (scripts) and script functions. Each word sense in a language is mentally represented by an m-script, a script function which embodies all the…

Computation and Language · Computer Science 2021-06-29 Robert Worden

Acoustic data-driven lexicon learning based on a greedy pronunciation selection framework

Speech recognition systems for irregularly-spelled languages like English normally require hand-written pronunciations. In this paper, we describe a system for automatically obtaining pronunciations of words for which pronunciations are not…

Computation and Language · Computer Science 2017-06-13 Xiaohui Zhang, Vimal Manohar, Daniel Povey, Sanjeev Khudanpur

A Probabilistic Approach to Lexical Semantic Knowledge Acquisition and Structural Disambiguation

In this thesis, I address the problem of automatically acquiring lexical semantic knowledge, especially that of case frame patterns, from large corpus data and using the acquired knowledge in structural disambiguation. The approach I adopt…

Computation and Language · Computer Science 2007-05-23 Hang LI