Related papers: Phylogenetic distances for neighbour dependent sub…

200 papers

The presence of neighbor dependencies generated a specific pattern of dinucleotide frequencies in all organisms. Especially, the CpG-methylation-deamination process is the predominant substitution process in vertebrates and needs to be…

Genomics · Quantitative Biology 2007-05-23 Peter F. Arndt, Terence Hwa

We prove that a wide class of models of Markov neighbor-dependent substitution processes on the integer line is solvable. This class contains some models of nucleotide substitutions recently introduced and studied empirically by molecular…

Probability · Mathematics 2007-05-23 Jean Bérard, Jean-Baptiste Gouéré, Didier Piau

We propose a new distance metric for DNA sequences, which can be defined on any evolutionary Markov model with infinitesimal generator matrix Q. That is, the new metric can be defined under existing models such as the Jukes-Cantor model,…

Populations and Evolution · Quantitative Biology 2009-02-12 Viswanath. C. Narayanan

We introduce a model of DNA sequence evolution which can account for biases in mutation rates that depend on the identity of the neighboring bases. An analytic solution for this class of non-equilibrium models is developed by adopting…

Biological Physics · Physics 2007-05-23 Peter F. Arndt, Christopher B. Burge, Terence Hwa

In this paper, we apply conformal prediction to time series data. Conformal prediction is a method that produces predictive regions given a confidence level. The output regions are always valid under the exchangeability assumption. However,…

Methodology · Statistics 2021-10-26 Samya Tajmouati, Bouazza El Wahbi, Mohammed Dakkoun

This paper addresses the estimation of locally stationary long-range dependent processes, a methodology that allows the statistical analysis of time series data exhibiting both nonstationarity and strong dependency. A time-varying…

Statistics Theory · Mathematics 2010-11-12 Wilfredo Palma, Ricardo Olea

Distances between sequences based on their $k$-mer frequency counts can be used to reconstruct phylogenies without first computing a sequence alignment. Past work has shown that effective use of k-mer methods depends on 1) model-based…

Populations and Evolution · Quantitative Biology 2017-05-22 Chris Durden, Seth Sullivant

Accurate estimation of evolutionary distances between taxa is important for many phylogenetic reconstruction methods. In the case of bacteria, distances can be estimated using a range of different evolutionary models, from single nucleotide…

Populations and Evolution · Quantitative Biology 2017-04-17 Stuart Serdoz, Attila Egri-Nagy, Jeremy Sumner, Barbara R. Holland, Peter D. Jarvis, Mark M. Tanaka, Andrew R. Francis

This article proposes a novel approach to statistical alignment of nucleotide sequences by introducing a context dependent structure on the substitution process in the underlying evolutionary model. We propose to estimate alignments and…

Statistics Theory · Mathematics 2011-07-18 Ana Arribas-Gil, Catherine Matias

We consider the problem of distance estimation under the TKF91 model of sequence evolution by insertions, deletions and substitutions on a phylogeny. In an asymptotic regime where the expected sequence lengths tend to infinity, we show that…

Probability · Mathematics 2020-10-29 Wai-Tong Louis Fan, Brandon Legried, Sebastien Roch

Modelling the substitution of nucleotides along a phylogenetic tree is usually done by a hidden Markov process. This allows one to define a distribution of characters at the leaves of the trees and one might be able to obtain polynomial…

Populations and Evolution · Quantitative Biology 2020-10-12 Marta Casanellas, Jesús Fernández-Sánchez, Marina Garrote-López

This paper proposes an extension to conventional regression Neural Networks (NNs) for replacing the point predictions they produce with prediction intervals that satisfy a required level of confidence. Our approach follows a novel machine…

Machine Learning · Computer Science 2023-12-18 Harris Papadopoulos, Haris Haralambous

Various approaches to alignment-free sequence comparison are based on the length of exact or inexact word matches between two input sequences. Haubold et al. (2009) showed how the average number of substitutions between two DNA…

Populations and Evolution · Quantitative Biology 2017-09-06 Burkhard Morgenstern, Svenja Schöbel, Chris-André Leimeister

We define two minimum distance estimators for dependent data by minimizing some approximated Maximum Mean Discrepancy distances between the true empirical distribution of observations and their assumed (parametric) model distribution. When…

Methodology · Statistics 2026-01-19 Pierre Alquier, Jean-David Fermanian, Benjamin Poignard

Let $(X_i)_{i=1,...,n}$ be a possibly nonstationary sequence such that $\mathscr{L}(X_i)=P_n$ if $i\leq n\theta$ and $\mathscr{L}(X_i)=Q_n$ if $i>n\theta$, where $0<\theta <1$ is the location of the change-point to be estimated. We…

Statistics Theory · Mathematics 2009-09-29 Samir Ben Hariz, Jonathan J. Wylie, Qiang Zhang

We study the problem of estimating the mutation rate between two sequences from noisy sequencing reads. Existing alignment-free methods typically assume direct access to the full sequences. We extend these methods to the sequencing…

Information Theory · Computer Science 2026-01-13 Shiv Pratap Singh Rathore, Navin Kashyap

Quantifying uncertainty in automatically generated text is important for letting humans check potential hallucinations and making systems more reliable. Conformal prediction is an attractive framework to provide predictions imbued with…

Computation and Language · Computer Science 2024-02-02 Dennis Ulmer, Chrysoula Zerva, André F. T. Martins

Inferring the phylogenetic relationships among a sample of organisms is a fundamental problem in modern biology. While distance-based hierarchical clustering algorithms achieved early success on this task, these have been supplanted by…

Machine Learning · Computer Science 2025-12-03 Benjamin K. Rosenzweig, Matthew W. Hahn

Pathogen genome data offers valuable structure for spatial models, but its utility is limited by incomplete sequencing coverage. We propose a probabilistic framework for inferring genetic distances between unsequenced cases and known…

Genomics · Quantitative Biology 2025-09-10 Haley Stone, Jing Du, Hao Xue, Matthew Scotch, David Heslop, Andreas Züfle, Chandini Raina MacIntyre, Flora Salim

We extend in two directions our previous results about the sampling and the empirical measures of immortal branching Markov processes. Direct applications to molecular biology are rigorous estimates of the mutation rates of polymerase chain…

Probability · Mathematics 2007-05-23 Didier Piau