Related papers: Extending the Recursive Jensen-Shannon Segmentatio…
Most successful applications of deep learning involve similar training and test conditions. However, tasks such as biological sequence design involve searching for sequences that improve desirable properties beyond previously known values,…
We propose local segmentation of multiple sequences sharing a common time- or location-index, building upon the single sequence local segmentation methods of Niu and Zhang (2012) and Fang, Li and Siegmund (2016). We also propose reverse…
The Jensen-Shannon divergence has been successfully applied as a segmentation tool for symbolic sequences, that is, to separate the sequence into subsequences with the same symbolic content. In this work, we propose a method, based on the…
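As a rough illustration of how the Jensen-Shannon divergence can act as a segmentation tool, the sketch below scans every cut point of a symbolic sequence and keeps the one that maximizes the length-weighted divergence between the symbol compositions on either side. This is the generic single-split step only, not the method proposed in the abstract above (which is truncated here); the names `entropy` and `best_split` are hypothetical.

```python
import math
from collections import Counter


def entropy(counts, n):
    """Shannon entropy (in bits) of a symbol-count table covering n symbols."""
    return -sum((c / n) * math.log2(c / n) for c in counts.values() if c > 0)


def best_split(seq):
    """Return (index, divergence) for the cut seq[:index] | seq[index:] that
    maximizes the length-weighted Jensen-Shannon divergence between the
    symbol compositions of the two parts. Assumes len(seq) >= 2."""
    n = len(seq)
    total = Counter(seq)
    h_total = entropy(total, n)
    left, right = Counter(), Counter(total)
    best_i, best_d = None, -1.0
    for i in range(1, n):
        # Move one symbol from the right part to the left part.
        symbol = seq[i - 1]
        left[symbol] += 1
        right[symbol] -= 1
        # Weighted JSD: H(mixture) - w_left * H(left) - w_right * H(right).
        d = (h_total
             - (i / n) * entropy(left, i)
             - ((n - i) / n) * entropy(right, n - i))
        if d > best_d:
            best_i, best_d = i, d
    return best_i, best_d


if __name__ == "__main__":
    print(best_split("A" * 10 + "C" * 10))
```

A recursive procedure would apply `best_split` again to each resulting part until some stopping criterion is met; the choice of that criterion is left open in this sketch.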
We introduce segmental recurrent neural networks (SRNNs) which define, given an input sequence, a joint probability distribution over segmentations of the input and labelings of the segments. Representations of the input segments (i.e.,…
By using the Jensen-Shannon divergence, genomic DNA can be divided into compositionally distinct domains through a standard recursive segmentation procedure. Each domain, while significantly different from its neighbours, may however share…
The Jensen-Shannon divergence is a renowned bounded symmetrization of the Kullback-Leibler divergence which does not require probability densities to have matching supports. In this paper, we introduce a vector-skew generalization of the…
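For reference, the standard (unskewed) Jensen-Shannon divergence between two densities $p$ and $q$ can be written as follows. The symbols $m$ and $\mathrm{KL}$ are notation assumed here rather than taken from the abstract, and the vector-skew generalization it mentions is not reproduced.

$$
\mathrm{JS}(p, q) = \tfrac{1}{2}\,\mathrm{KL}(p \,\|\, m) + \tfrac{1}{2}\,\mathrm{KL}(q \,\|\, m),
\qquad m = \tfrac{1}{2}(p + q),
\qquad 0 \le \mathrm{JS}(p, q) \le \log 2 .
$$

Because $m > 0$ wherever $p > 0$ or $q > 0$, both Kullback-Leibler terms stay finite even when $p$ and $q$ have different supports, which is why matching supports are not required.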
We propose an adaptive estimator for the stationary distribution of a bifurcating Markov Chain on $\mathbb R^d$. Bifurcating Markov chains (BMC for short) are a class of stochastic processes indexed by regular binary trees. A kernel…
Previous divide-and-conquer segmentation analyses of DNA sequences do not provide a satisfactory stopping criterion for the recursion. This paper proposes that segmentation be considered as a model selection process. Using the tools in…
In this paper, we describe the context sensitivity problem encountered in partitioning a heterogeneous biological sequence into statistically homogeneous segments. After showing signatures of the problem in the bacterial genomes of…
Computational methods for discovering patterns of local correlations in sequences are important in computational biology. Here we show how to determine the optimal partitioning of aligned sequences into non-overlapping segments such that…
We consider a class of small-sample distribution estimators over noisy channels. Our estimators are designed for repetition channels, and rely on properties of the runs of the observed sequences. These runs are modeled via a special type of…
This paper proposes a novel learning method for a mixture of recurrent neural network (RNN) experts model, which can acquire the ability to generate desired sequences by dynamically switching between experts. Our method is based on maximum…
We propose a solution to the stopping-criterion problem in segmenting inhomogeneous DNA sequences with complex statistical patterns. This new stopping criterion is based on the Bayesian Information Criterion (BIC) in the model selection framework. When…
Modeling microbial clades from microarray genome sequences is a challenging problem in biology, especially for the discovery and categorization of gene isolates from new species. Marker family genome sequences play important roles in describing specific…
The design of biological systems is hindered by uncertainty arising from both intrinsic stochasticity of biomolecular reactions and variability across laboratory or experimental conditions. In this work, we present a sequential framework to…
We consider the problem of estimating the asymptotic variance of a function defined on a Markov chain, an important step for statistical inference of the stationary mean. We design a novel recursive estimator that requires $O(1)$…
This paper proposes a new type of recurrence where we divide the Markov chains into intervals that start when the chain enters a subset A, then sample another subset B far away from A and end when the chain again returns to A. The…
Most of the semantic segmentation approaches have been developed for single image segmentation, and hence, video sequences are currently segmented by processing each frame of the video sequence separately. The disadvantage of this is that…
We consider the minimization of an objective function given access to unbiased estimates of its gradient through stochastic gradient descent (SGD) with constant step-size. While the detailed analysis was only performed for quadratic…
This paper considers the problem of variable-length coding over a discrete memoryless channel (DMC) with noiseless feedback. The paper provides a stochastic control view of the problem whose solution is analyzed via a newly proposed…