Related papers: Pangenome-guided sequence assembly via binary opti…
Assembling genomes from short-read sequencing data remains difficult in repetitive regions, where reference bias and combinatorial complexity limit existing methods. Pangenome-guided sequence assembly (PGSA) mitigates reference bias by…
Genome sequencing is essential to decode genetic information, identify organisms, understand diseases and advance personalized medicine. A critical step in any genome sequencing technique is genome assembly. However, de novo genome…
Affordable, high-quality whole-genome assemblies have made it possible to construct rich pangenomes that capture haplotype diversity across many species. As these datasets grow, they motivate the development of specialized techniques…
De novo genome assembly focuses on finding connections between a vast amount of short sequences in order to reconstruct the original genome. The central problem of genome assembly could be described as finding a Hamiltonian path through a…
The recent emergence of next-generation DNA sequencing technology has enabled acquisition of genetic information at unprecedented scales. In order to determine the genetic blueprint of an organism, sequencing platforms typically employ…
Recent advances in DNA sequencing open prospects to make whole-genome analysis rapid and reliable, which is promising for various applications including personalized medicine. However, existing techniques for de novo genome assembly,…
Genome assembly using high throughput data with short reads, arguably, remains an unresolvable task in repetitive genomes, since when the length of a repeat exceeds the read length, it becomes difficult to unambiguously connect the flanking…
Recent advances in sequencing technologies enable the assembly of individual genomes to reference quality. How to integrate multiple genomes from the same species and to make the integrated representation accessible to biologists…
De novo genome assembly is the process of stitching short DNA sequences to generate longer DNA sequences, without using any reference sequence for alignment. It enables high-throughput genome sequencing and thus accelerates the discovery of…
Computational Pangenomics is an emerging field that studies genetic variation using a graph structure encompassing multiple genomes. Visualizing pangenome graphs is vital for understanding genome diversity. Yet, handling large graphs can be…
The reduction in genome sequencing costs brought about by next-generation sequencing technologies has greatly increased the number of genomic projects. As a result, there is a growing need for better assembly and assembly validation methods. One…
High read depth can be used to assemble short sequence repeats. Existing genome assemblers fail in repetitive regions longer than the average read. I propose a new algorithm for DNA assembly which uses the relative frequency of reads…
Genome sequence analysis plays a pivotal role in enabling many medical and scientific advancements in personalized medicine, outbreak tracing, and forensics. However, the analysis of genome sequencing data is currently bottlenecked by the…
De novo genome assembly, i.e., rebuilding the sequence of an unknown genome from redundant and erroneous short sequences, is a key but computationally intensive step in many genomics pipelines. The exponential growth of genomic data is…
de Bruijn graph-based algorithms are one of the two most widely used approaches for de novo genome assembly. A major limitation of this approach is the large computational memory space requirement to construct the de Bruijn graph, which…
Genome assembly from the high-throughput sequencing (HTS) reads is a fundamental yet challenging computational problem. An intrinsic challenge is the uncertainty caused by the widespread repetitive elements. Here we get around the…
With small-scale quantum processors transitioning from experimental physics labs to industrial products, these processors allow us to efficiently compute important algorithms in various fields. In this paper, we propose a quantum algorithm…
(An updated version of this manuscript has been accepted to Scientific Reports in 2016, please refer to http://www.nature.com/articles/srep31900) The highly anticipated transition from next generation sequencing (NGS) to third generation…
The de novo assembly of large, complex genomes is a significant challenge with currently available DNA sequencing technology. While many de novo assembly software packages are available, comparatively little attention has been paid to…
The formal version of our work has been published in BMC Bioinformatics and can be found here: http://www.biomedcentral.com/1471-2105/13/S6/S1 Motivation: To tackle the problem of huge memory usage associated with de Bruijn graph-based…