Related papers: Context tree selection and linguistic rhythm retri…

200 papers

We study a variable length Markov chain model associated with a group of stationary processes that share the same context tree but each process has potentially different conditional probabilities. We propose a new model selection and…

Methodology · Statistics 2016-01-01 Alexandre Belloni, Roberto I. Oliveira

Speech technologies rely on capturing a speaker's voice variability while obtaining comprehensive language information. Textual prompts and sentence selection methods have been proposed in the literature to comprise such adequate phonetic…

Computation and Language · Computer Science 2024-02-09 Marcellus Amadeus, William Alberto Cruz Castañeda, Wilmer Lobato, Niasche Aquino

Languages have long been described according to their perceived rhythmic attributes. The associated typologies are of interest in psycholinguistics as they partly predict newborns' abilities to discriminate between languages and provide…

Audio and Speech Processing · Electrical Eng. & Systems 2024-01-29 François Deloche, Laurent Bonnasse-Gahot, Judit Gervain

This paper presents a model-based, unsupervised algorithm for recovering word boundaries in a natural-language text from which they have been deleted. The algorithm is derived from a probability model of the source that generated the text.…

Computation and Language · Computer Science 2007-05-23 Michael R. Brent

The present paper investigates non-asymptotic properties of two popular procedures of context tree (or Variable Length Markov Chains) estimation: Rissanen's algorithm Context and the Penalized Maximum Likelihood criterion. First showing how…

Statistics Theory · Mathematics 2011-06-30 Aurélien Garivier, Florencia Leonardi

We study a problem of model selection for data produced by two different context tree sources. Motivated by linguistic questions, we consider the case where the probabilistic context trees corresponding to the two sources are finite and…

Statistics Theory · Mathematics 2013-08-12 Antonio Galves, Aurélien Garivier, Elisabeth Gassiat

Measuring similarities between strings is central for many established and fast growing research areas including information retrieval, biology, and natural language processing. The traditional approach for string similarity measurements is…

Information Retrieval · Computer Science 2018-08-20 Mehdi Ben Lazreg, Morten Goodwin

We consider the problem of estimating the context tree of a stationary ergodic process with finite alphabet without imposing additional conditions on the process. As a starting point we introduce a Hamming metric in the space of irreducible…

Statistics Theory · Mathematics 2015-08-21 Sandro Gallo, Florencia Leonardi

Concepts of complex networks have been used to obtain metrics that were correlated to text quality established by scores assigned by human judges. Texts produced by high-school students in Portuguese were represented as scale-free networks…

Physics and Society · Physics 2009-11-11 Lucas Antiqueira, Maria das Gracas V. Nunes, Osvaldo N. Oliveira, Luciano da F. Costa

Rhetoric, both spoken and written, involves not only content but also style. One common stylistic tool is $\textit{parallelism}$: the juxtaposition of phrases which have the same sequence of linguistic ($\textit{e.g.}$, phonological,…

Computation and Language · Computer Science 2023-12-04 Stephen Bothwell, Justin DeBenedetto, Theresa Crnkovich, Hildegund Müller, David Chiang

Many efficient algorithms with strong theoretical guarantees have been proposed for the contextual multi-armed bandit problem. However, applying these algorithms in practice can be difficult because they require domain expertise to build…

Machine Learning · Computer Science 2018-10-23 Adam N. Elmachtoub, Ryan McNellis, Sechan Oh, Marek Petrik

In natural speech, the speaker does not pause between words, yet a human listener somehow perceives this continuous stream of phonemes as a series of distinct words. The detection of boundaries between spoken words is an instance of a…

Computation and Language · Computer Science 2011-06-28 Jerry R. Van Aken

With the constant growth of the World Wide Web and the number of documents in different languages accordingly, the need for reliable language detection tools has increased as well. Platforms such as Twitter with predominantly short texts…

Computation and Language · Computer Science 2016-08-31 Ivana Balazevic, Mikio Braun, Klaus-Robert Müller

The objectives of this work are cross-modal text-audio and audio-text retrieval, in which the goal is to retrieve the audio content from a pool of candidates that best matches a given written description and vice versa. Text-audio retrieval…

Audio and Speech Processing · Electrical Eng. & Systems 2022-02-11 A. Sophia Koepke, Andreea-Maria Oncescu, João F. Henriques, Zeynep Akata, Samuel Albanie

We address the issue of context tree estimation in variable length hidden Markov models. We propose an estimator of the context tree of the hidden Markov process which needs no prior upper bound on the depth of the context tree. We prove…

Information Theory · Computer Science 2011-09-15 Thierry Dumont

Tokenising continuous speech into sequences of discrete tokens and modelling them with language models (LMs) has led to significant success in text-to-speech (TTS) synthesis. Although these models can generate speech with high quality and…

Sound · Computer Science 2024-08-30 Zehai Tu, Guangyan Zhang, Yiting Lu, Adaeze Adigwe, Simon King, Yiwen Guo

There is extensive interest in metric learning methods for image retrieval. Many metric learning loss functions focus on learning a correct ranking of training samples, but strongly overfit semantically inconsistent labels and require a…

Machine Learning · Computer Science 2023-06-05 Christopher Liao, Theodoros Tsiligkaridis, Brian Kulis

The quota system in Brazil made it possible to include blind students in higher education. Teachers' lack of knowledge about the braille system can represent a barrier between them and students who use it for writing and reading.…

Computer Vision and Pattern Recognition · Computer Science 2021-03-09 André Roberto Ortoncelli, Marlon Marcon, Franciele Beal

The context tree source is a source model in which the occurrence probability of symbols is determined from a finite past sequence, and is a broader class of sources that includes i.i.d. and Markov sources. The proposed source model in this…

Information Theory · Computer Science 2021-05-14 Koshi Shimada, Shota Saito, Toshiyasu Matsushima

Document retrieval is one of the best established information retrieval activities since the sixties, pervading all search engines. Its aim is to obtain, from a collection of text documents, those most relevant to a pattern query. Current…

Information Retrieval · Computer Science 2013-10-01 Gonzalo Navarro