Related papers: A Fast Algorithm for Computing Prefix Probabilitie…

200 papers

We describe an extension of Earley's parser for stochastic context-free grammars that computes the following quantities given a stochastic context-free grammar and an input string: a) probabilities of successive prefixes being generated by…

cmp-lg · Computer Science 2008-02-03 Andreas Stolcke

A prefix normal word is a binary word with the property that no substring has more 1s than the prefix of the same length. This class of words is important in the context of binary jumbled pattern matching. In this paper we present an…

Data Structures and Algorithms · Computer Science 2014-06-23 Péter Burcsi, Gabriele Fici, Zsuzsanna Lipták, Frank Ruskey, Joe Sawada

We describe an algorithm computing an optimal prefix free code for $n$ unsorted positive weights in time within $O(n(1+\lg \alpha))\subseteq O(n\lg n)$, where the alternation $\alpha\in[1..n-1]$ measures the amount of sorting required by…

Data Structures and Algorithms · Computer Science 2016-02-02 Jérémy Barbay

We describe an algorithm computing an optimal prefix free code from $N$ unsorted positive integer weights in time linear in the number of machine words holding those weights. This algorithm takes advantage of common non-algebraic…

Data Structures and Algorithms · Computer Science 2017-03-02 Jérémy Barbay

Probabilistic context-free grammars (PCFGs) with neural parameterization have been shown to be effective in unsupervised phrase-structure grammar induction. However, due to the cubic computational complexity of PCFG representation and…

Computation and Language · Computer Science 2021-04-29 Songlin Yang, Yanpeng Zhao, Kewei Tu

The inside-outside probabilities are typically used for reestimating Probabilistic Context Free Grammars (PCFGs), just as the forward-backward probabilities are typically used for reestimating HMMs. I show several novel uses, including…

cmp-lg · Computer Science 2007-05-23 Joshua Goodman

A new method for constructing minimum-redundancy binary prefix codes is described. Our method does not explicitly build a Huffman tree; instead it uses a property of optimal prefix codes to compute the codeword lengths corresponding to the…

Data Structures and Algorithms · Computer Science 2016-09-30 Ahmed Belal, Amr Elmasry

We present a new recursive generation algorithm for prefix normal words. These are binary strings with the property that no substring has more 1s than the prefix of the same length. The new algorithm uses two operations on binary strings,…

Data Structures and Algorithms · Computer Science 2024-04-16 Ferdinando Cicalese, Zsuzsanna Lipták, Massimiliano Rossi

Probabilistic context-free grammars (PCFGs) are used to define distributions over strings, and are powerful modelling tools in a number of areas, including natural language processing, software engineering, model checking, bio-informatics,…

Formal Languages and Automata Theory · Computer Science 2014-07-08 Colin de la Higuera, James Scicluna, Mark-Jan Nederhof

Prefix parsing asks whether an input prefix can be extended to a complete string generated by a given grammar. In the weighted setting, it also provides prefix probabilities, which are central to context-free language modeling,…

Computation and Language · Computer Science 2026-05-05 Clemente Pasti, Andreas Opedal, Timothy J. O'Donnell, Ryan Cotterell, Tim Vieira

Although real-world text datasets, such as DNA sequences, are far from being uniformly random, average-case string searching algorithms perform significantly better than worst-case ones in most applications of interest. In this paper, we…

Data Structures and Algorithms · Computer Science 2018-01-16 Lorraine A. K. Ayad, Panagiotis Charalampopoulos, Costas S. Iliopoulos, Solon P. Pissis

For a partial word $w$ the longest common compatible prefix of two positions $i,j$, denoted $lccp(i,j)$, is the largest $k$ such that $w[i,i+k-1]\uparrow w[j,j+k-1]$, where $\uparrow$ is the compatibility relation of partial words (it is…

We study the problem of computing the probability that a given stochastic context-free grammar (SCFG), G, generates a string in a given regular language L(D) (given by a DFA, D). This basic problem has a number of applications in…

Formal Languages and Automata Theory · Computer Science 2013-02-27 Kousha Etessami, Alistair Stewart, Mihalis Yannakakis

This paper provides a reference description, in the form of a deduction system, of Earley's (1970) context-free parsing algorithm with various speed-ups. Our presentation includes a known worst-case runtime improvement from Earley's $O…

Computation and Language · Computer Science 2023-07-07 Andreas Opedal, Ran Zmigrod, Tim Vieira, Ryan Cotterell, Jason Eisner

Huffman coding finds an optimal prefix code for a given probability mass function. Consider situations in which one wishes to find an optimal code with the restriction that all codewords have lengths that lie in a user-specified set of…

Information Theory · Computer Science 2008-01-03 Michael B. Baer

Many search engines such as Google, Bing & Yahoo! show search suggestions when users enter search phrases on their interfaces. These suggestions are meant to assist the user in finding what she wants quickly and also suggesting common…

Data Structures and Algorithms · Computer Science 2021-11-01 Dhruv Matani

We present an algorithm for computing n-gram probabilities from stochastic context-free grammars, a procedure that can alleviate some of the standard problems associated with n-grams (estimation from sparse data, lack of linguistic…

cmp-lg · Computer Science 2022-02-28 Andreas Stolcke, Jonathan Segal

Language models for speech recognition typically use a probability model of the form Pr(a_n | a_1, a_2, ..., a_{n-1}). Stochastic grammars, on the other hand, are typically used to assign structure to utterances. A language model of the…

Computation and Language · Computer Science 2007-05-23 Mark-Jan Nederhof, Anoop Sarkar, Giorgio Satta

In 1975, Valiant showed that Boolean matrix multiplication can be used for parsing context-free grammars (CFGs), yielding the asymptotically fastest (although not practical) CFG parsing algorithm known. We prove a dual result: any CFG parser…

Computation and Language · Computer Science 2007-05-23 Lillian Lee

We provide the first fully polynomial-time randomized approximation scheme for the following two counting problems: 1. Given a Context Free Grammar $G$ over alphabet $\Sigma$, count the number of words of length exactly $n$ generated by…

Data Structures and Algorithms · Computer Science 2026-05-18 Kuldeep S. Meel, Alexis de Colnet