Related papers: DNA Sequence Evolution with Neighbor-Dependent Mut…
Methylation of CpG dinucleotides is a prevalent epigenetic modification that is required for proper development in vertebrates, and changes in CpG methylation are essential to cellular differentiation. Genome-wide DNA methylation assays…
The presence of neighbor dependencies generated a specific pattern of dinucleotide frequencies in all organisms. In particular, the CpG-methylation-deamination process is the predominant substitution process in vertebrates and needs to be…
The functions of proteins and RNAs are determined by a myriad of interactions between their constituent residues, but most quantitative models of how molecular phenotype depends on genotype must approximate this by simple additive effects.…
We develop statistically based methods to detect single-nucleotide DNA mutations in next-generation sequencing data. Sequencing generates counts of the number of times each base was observed at hundreds of thousands to billions of genome…
Much of the ongoing statistical analysis of DNA sequences is focused on estimating characteristics of coding and non-coding regions that might allow discrimination between these regions. In the current approach, we concentrate…
DNA methylation is an epigenetic mechanism whose important role in development has been widely recognized. This epigenetic modification results in heritable changes in gene expression not encoded by the DNA sequence. The underlying…
We study theoretically the in vitro evolution of a DNA sequence by binding to a transcription factor. Using a simple model of protein-DNA binding and available binding constants for the Mnt protein, we perform large-scale, realistic…
An inhomogeneous helicoidal nearest-neighbor model with continuous degrees of freedom is shown to predict the same DNA melting properties as traditional long-range Ising models, for free DNA molecules in solution, as well as superhelically…
DNA methylation has been the most extensively studied epigenetic mark. Usually a change in the genotype (the DNA sequence) leads to a change in the phenotype (the observable characteristics of the individual). But DNA methylation, which happens in…
In this paper we treat some fractal and statistical features of DNA sequences. First, a fractal record model of DNA sequences is proposed by mapping DNA sequences to integer sequences, followed by R/S analysis of the model and…
In this paper we study the evolution of the mutation rate for simple organisms in dynamic environments. A model with multiple fitness coding loci tracking a moving fitness peak is developed and an analytical expression for the optimal…
We introduce a family of models incorporating random segmental substitutions and point mutations and demonstrate that such models reproduce algebraic length distributions of exact matches with the slope $-4$ observed earlier in pairwise…
Recent advances in next-generation sequencing technologies have facilitated the use of deoxyribonucleic acid (DNA) as a novel covert channel in steganography. Various methods exist in other domains to detect hidden messages…
Naturally evolving proteins gradually accumulate mutations while continuing to fold to thermodynamically stable native structures. This process of neutral protein evolution is an important mode of genetic change, and forms the basis for the…
We consider the task of detecting regulatory elements in the human genome directly from raw DNA. Past work has focused on small snippets of DNA, making it difficult to model long-distance dependencies that arise from DNA's 3-dimensional…
Non-protein-coding regions of the human genome contain many complex patterns that regulate cellular activity. Studying the human genome is limited by the lack of understanding of its features and their complex interactions. However,…
Background: Recent assays for individual-specific genome-wide DNA methylation profiles have enabled epigenome-wide association studies to identify specific CpG sites associated with a phenotype. Computational prediction of CpG site-specific…
Motivation: Most existing methods for DNA sequence analysis rely on accurate sequences or genotypes. However, in applications of next-generation sequencing (NGS), accurate genotypes may not be easily obtained (e.g. multi-sample…
Frameshift mutations in protein-coding DNA sequences produce a drastic change in the resulting protein sequence, which prevents classic protein alignment methods from revealing the proteins' common origin. Moreover, when a large number of…
Sequencing by synthesis is used in many next-generation DNA sequencing technologies. Some of the technologies, especially those exploring the principle of single-molecule sequencing, allow incomplete nucleotide incorporation in each cycle.…