Related papers: DNA Segmentation as A Model Selection Process
We propose a solution on the stopping criterion in segmenting inhomogeneous DNA sequences with complex statistical patterns. This new stopping criterion is based on Bayesian Information Criterion (BIC) in the model selection framework. When…
This paper presents a novel method to segment/decode DNA sequences based on n-grams statistical language model. Firstly, we find the length of most DNA 'words' is 12 to 15 bps by analyzing the genomes of 12 model species. Then we design an…
Systems that are based on recursive Bayesian updates for classification limit the cost of evidence collection through certain stopping/termination criteria and accordingly enforce decision making. Conventionally, two termination criteria…
Identification of functional elements of a genome often requires dividing a sequence of measurements along a genome into segments differing from adjacent segments. In many applications, the mean of the measured values at multiple genomic…
Model selection consists in comparing several candidate models according to a metric to be optimized. The process often involves a grid search, or such, and cross-validation, which can be time consuming, as well as not providing much…
Segmentation, or the outlining of objects within images, is a critical step in the measurement and analysis of cells within microscopy images. While improvements continue to be made in tools that rely on classical methods for segmentation,…
We introduce a novel method to analyse complete genomes and recognise some distinctive features by means of an adaptive compression algorithm, which is not DNA-oriented. We study the Information Content as a function of the number of…
Approximate Bayesian computation (ABC) is a class of algorithmic methods in Bayesian inference using statistical summaries and computer simulations. ABC has become popular in evolutionary genetics and in other branches of biology. However…
The vast amount of biological knowledge accumulated over the years has allowed researchers to identify various biochemical interactions and define different families of pathways. There is an increased interest in identifying pathways and…
Image segmentation is a long-studied and important problem in image processing. Different solutions have been proposed, many of which follow the information theoretic paradigm. While these information theoretic segmentation methods often…
Document segmentation is a method of rending the document into distinct regions. A document is an assortment of information and a standard mode of conveying information to others. Pursuance of data from documents involves ton of human…
DNA sequencing is the basic workhorse of modern day biology and medicine. Shotgun sequencing is the dominant technique used: many randomly located short fragments called reads are extracted from the DNA sequence, and these reads are…
Text retrieval systems often return large sets of documents, particularly when applied to large collections. Stopping criteria can reduce the number of these documents that need to be manually evaluated for relevance by predicting when a…
Deep learning-based segmentation and classification are crucial to large-scale biomedical imaging, particularly for 3D data, where manual analysis is impractical. Although many methods exist, selecting suitable models and tuning parameters…
Today Bayesian networks are more used in many areas of decision support and image processing. In this way, our proposed approach uses Bayesian Network to modelize the segmented image quality. This quality is calculated on a set of…
Motivation: Array Comparative Genomic Hybridization (aCGH) is used to scan the entire genome for variations in DNA copy number. A central task in the analysis of aCGH data is the segmentation into groups of probes sharing the same DNA copy…
At the core of high throughput DNA sequencing platforms lies a bio-physical surface process that results in a random geometry of clusters of homogenous short DNA fragments typically hundreds of base pairs long - bridge amplification. The…
Motivated by empirical observations of algebraic duplicated sequence length distributions in a broad range of natural genomes, we analytically formulate and solve a class of simple discrete duplication/substitution models that generate…
We live in a period where bio-informatics is rapidly expanding, a significant quantity of genomic data has been produced as a result of the advancement of high-throughput genome sequencing technology, raising concerns about the costs…
Image segmentation is a critical step in computer vision tasks constituting an essential issue for pattern recognition and visual interpretation. In this paper, we propose a new stopping criterion for the mean shift iterative algorithm by…