Related papers: Utilisation des grammaires probabilistes dans les …

Integrating Prosodic and Lexical Cues for Automatic Topic Segmentation

We present a probabilistic model that uses both prosodic and lexical cues for the automatic segmentation of speech into topically coherent units. We propose two methods for combining lexical and prosodic information using hidden Markov…

Computation and Language · Computer Science 2022-02-28 G. Tur, D. Hakkani-Tur, A. Stolcke, E. Shriberg

Parsing with Principles and Probabilities

This paper is an attempt to bring together two approaches to language analysis. The possible use of probabilistic information in principle-based grammars and parsers is considered, including discussion on some theoretical and computational…

cmp-lg · Computer Science 2008-02-03 Andrew Fordham, Matthew Crocker

Prosody-Based Automatic Segmentation of Speech into Sentences and Topics

A crucial step in processing speech audio data for information extraction, topic detection, or browsing/playback is to segment the input into sentence and topic units. Speech segmentation is challenging, since the cues typically present for…

Computation and Language · Computer Science 2022-02-28 E. Shriberg, A. Stolcke, D. Hakkani-Tur, G. Tur

An Efficient, Probabilistically Sound Algorithm for Segmentation and Word Discovery

This paper presents a model-based, unsupervised algorithm for recovering word boundaries in a natural-language text from which they have been deleted. The algorithm is derived from a probability model of the source that generated the text.…

Computation and Language · Computer Science 2007-05-23 Michael R. Brent

Sequence Modeling via Segmentations

Segmental structure is a common pattern in many types of sequences such as phrases in human languages. In this paper, we present a probabilistic model for sequences via their segmentations. The probability of a segmented sequence is…

Machine Learning · Statistics 2018-07-20 Chong Wang, Yining Wang, Po-Sen Huang, Abdelrahman Mohamed, Dengyong Zhou, Li Deng

Joint Semantic Synthesis and Morphological Analysis of the Derived Word

Much like sentences are composed of words, words themselves are composed of smaller units. For example, the English word questionably can be analyzed as question+able+ly. However, this structural decomposition of the word does not directly…

Computation and Language · Computer Science 2018-11-13 Ryan Cotterell, Hinrich Schütze

Stochastic phonological grammars and acceptability

In foundational works of generative phonology it is claimed that subjects can reliably discriminate between possible but non-occurring words and words that could not be English. In this paper we examine the use of a probabilistic…

cmp-lg · Computer Science 2008-02-03 John Coleman, Janet Pierrehumbert

On Structured Sparsity of Phonological Posteriors for Linguistic Parsing

The speech signal conveys information on different time scales from short time scale or segmental, associated to phonological and phonetic information to long time scale or supra segmental, associated to syllabic and prosodic information.…

Computation and Language · Computer Science 2016-09-16 Milos Cernak, Afsaneh Asaei, Hervé Bourlard

A Structured Language Model

The paper presents a language model that develops syntactic structure and uses it to extract meaningful information from the word history, thus enabling the use of long distance dependencies. The model assigns probability to every joint…

Computation and Language · Computer Science 2007-05-23 Ciprian Chelba

Probabilistic Grammars for Equation Discovery

Equation discovery, also known as symbolic regression, is a type of automated modeling that discovers scientific laws, expressed in the form of equations, from observed data and expert knowledge. Deterministic grammars, such as context-free…

Machine Learning · Computer Science 2021-04-29 Jure Brence, Ljupčo Todorovski, Sašo Džeroski

Exploiting Syntactic Structure for Language Modeling

The paper presents a language model that develops syntactic structure and uses it to extract meaningful information from the word history, thus enabling the use of long distance dependencies. The model assigns probability to every joint…

Computation and Language · Computer Science 2007-05-23 Ciprian Chelba, Frederick Jelinek

A probabilistic top-down parser for minimalist grammars

This paper describes a probabilistic top-down parser for minimalist grammars. Top-down parsers have the great advantage of having a certain predictive power during the parsing, which takes place in a left-to-right reading of the sentence.…

Computation and Language · Computer Science 2010-10-12 Thomas Mainguy

Robust Probabilistic Predictive Syntactic Processing

This thesis presents a broad-coverage probabilistic top-down parser, and its application to the problem of language modeling for speech recognition. The parser builds fully connected derivations incrementally, in a single pass from…

Computation and Language · Computer Science 2007-05-23 Brian Roark

A Statistical Model for Word Discovery in Transcribed Speech

A statistical model for segmentation and word discovery in continuous speech is presented. An incremental unsupervised learning algorithm to infer word boundaries based on this model is described. Results of empirical tests showing that the…

Computation and Language · Computer Science 2007-05-23 Anand Venkataraman

Towards History-based Grammars: Using Richer Models for Probabilistic Parsing

We describe a generative probabilistic model of natural language, which we call HBG, that takes advantage of detailed linguistic information to resolve ambiguity. HBG incorporates lexical, syntactic, semantic, and structural information…

cmp-lg · Computer Science 2008-02-03 Ezra Black, Fred Jelinek, John Lafferty, David M. Magerman, Robert Mercer, Salim Roukos

Double Articulation Analyzer with Prosody for Unsupervised Word and Phoneme Discovery

Infants acquire words and phonemes from unsegmented speech signals using segmentation cues, such as distributional, prosodic, and co-occurrence cues. Many pre-existing computational models that represent the process tend to focus on…

Computation and Language · Computer Science 2023-01-18 Yasuaki Okuda, Ryo Ozaki, Tadahiro Taniguchi

Implicit Representations of Grammaticality in Language Models

Grammaticality and likelihood are distinct notions in human language. Pretrained language models (LMs), which are probabilistic models of language fitted to maximize corpus likelihood, generate grammatically well-formed text and…

Computation and Language · Computer Science 2026-05-07 Yingshan Susan Wang, Linlu Qiu, Zhaofeng Wu, Roger P. Levy, Yoon Kim

Segmenting speech without a lexicon: The roles of phonotactics and speech source

Infants face the difficult problem of segmenting continuous speech into words without the benefit of a fully developed lexicon. Several sources of information in speech might help infants solve this problem, including prosody, semantic…

cmp-lg · Computer Science 2008-02-03 Timothy Andrew Cartwright, Michael R. Brent

Robust Parsing of Spoken Dialogue Using Contextual Knowledge and Recognition Probabilities

In this paper we describe the linguistic processor of a spoken dialogue system. The parser receives a word graph from the recognition module as its input. Its task is to find the best path through the graph. If no complete solution can be…

cmp-lg · Computer Science 2008-02-03 Gerhard Hanrieder, Guenther Goerz

Prefix Probabilities from Stochastic Tree Adjoining Grammars

Language models for speech recognition typically use a probability model of the form Pr(a_n | a_1, a_2, ..., a_{n-1}). Stochastic grammars, on the other hand, are typically used to assign structure to utterances. A language model of the…

Computation and Language · Computer Science 2007-05-23 Mark-Jan Nederhof, Anoop Sarkar, Giorgio Satta