Related papers: Succinct Data Structures for Assembling Large Geno…

200 papers

Massively parallel DNA sequencing technologies are revolutionizing genomics research. Billions of short reads generated at low costs can be assembled for reconstructing the whole genomes. Unfortunately, the large memory footprint of the…

Data Structures and Algorithms · Computer Science 2012-07-17 Yang Li , Pegah Kamousi , Fangqiu Han , Shengqi Yang , Xifeng Yan , Subhash Suri

de Bruijn graph-based algorithms are one of the two most widely used approaches for de novo genome assembly. A major limitation of this approach is the large computational memory space requirement to construct the de Bruijn graph, which…

Data Structures and Algorithms · Computer Science 2011-07-11 Chengxi Ye , Zhanshan Sam Ma , Charles H. Cannon , Mihai Pop , Douglas W. Yu

Deep sequencing has enabled the investigation of a wide range of environmental microbial ecosystems, but the high memory requirements for de novo assembly of short-read shotgun sequencing data from these complex populations are an…

Genomics · Quantitative Biology 2015-06-03 Jason Pell , Arend Hintze , Rosangela Canino-Koning , Adina Howe , James M. Tiedje , C. Titus Brown

The de Bruijn graph plays an important role in bioinformatics, especially in the context of de novo assembly. However, the representation of the de Bruijn graph in memory is a computational bottleneck for many assemblers. Recent papers…

Quantitative Methods · Quantitative Biology 2014-10-07 Rayan Chikhi , Antoine Limasset , Shaun Jackman , Jared Simpson , Paul Medvedev

The formal version of our work has been published in BMC Bioinformatics and can be found here: http://www.biomedcentral.com/1471-2105/13/S6/S1 Motivation: To tackle the problem of huge memory usage associated with de Bruijn graph-based…

Data Structures and Algorithms · Computer Science 2013-01-10 Chengxi Ye , Charles H. Cannon , Zhanshan Sam Ma , Douglas W. Yu , Mihai Pop

De novo genome assembly focuses on finding connections between a vast amount of short sequences in order to reconstruct the original genome. The central problem of genome assembly could be described as finding a Hamiltonian path through a…

Machine Learning · Computer Science 2020-11-11 Lovro Vrček , Petar Veličković , Mile Šikić

The merging of succinct data structures is a well established technique for the space efficient construction of large succinct indexes. In the first part of the paper we propose a new algorithm for merging succinct representations of de…

Data Structures and Algorithms · Computer Science 2021-07-13 Lavinia Egidi , Felipe A. Louza , Giovanni Manzini

De Bruijn graph is one of the most important data structures used in de-novo genome assembly algorithms, especially for NGS data. There is a growing need for parallel data structures and algorithms due to the increasing number of cores in…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-01-08 Daniel Górniak , Robert Nowak

Converting a set of sequencing reads into a lossless compact data structure that encodes all the relevant biological information is a major challenge. The classical approaches are to build the string graph or the de Bruijn graph. Each has…

Data Structures and Algorithms · Computer Science 2019-12-02 Diego Díaz-Domínguez , Travis Gagie , Gonzalo Navarro

Large biological datasets are being produced at a rapid pace and create substantial storage challenges, particularly in the domain of high-throughput sequencing (HTS). Most approaches currently used to store HTS data are either unable to…

Quantitative Methods · Quantitative Biology 2014-03-05 Fabien Campagne , Kevin C. Dorff , Nyasha Chambwe , James T. Robinson , Jill P. Mesirov , Thomas D. Wu

The first step in any genome assembly algorithm entails the conversion from the domain of strings and overlaps to the language of graphs and paths, typically using one of the two conventional methods: de Bruijn graphs or overlap graphs.…

Genomics · Quantitative Biology 2026-04-27 Anton Bankevich

We present Quip, a lossless compression algorithm for next-generation sequencing data in the FASTQ and SAM/BAM formats. In addition to implementing reference-based compression, we have developed, to our knowledge, the first assembly-based…

Quantitative Methods · Quantitative Biology 2012-07-11 Daniel C. Jones , Walter L. Ruzzo , Xinxia Peng , Michael G. Katze

(An updated version of this manuscript has been accepted to Scientific Reports in 2016, please refer to http://www.nature.com/articles/srep31900) The highly anticipated transition from next generation sequencing (NGS) to third generation…

Genomics · Quantitative Biology 2016-09-06 Chengxi Ye , Chris Hill , Shigang Wu , Jue Ruan , Zhanshan Ma

Motivation: Data volumes generated by next-generation sequencing technologies are now a major concern, both for storage and transmission. This triggered the need for more efficient methods than general purpose compression tools, such as…

Data Structures and Algorithms · Computer Science 2014-12-19 Gaëtan Benoit , Claire Lemaitre , Dominique Lavenier , Guillaume Rizk

One of the most computationally intensive tasks in computational biology is de novo genome assembly, the decoding of the sequence of an unknown genome from redundant and erroneous short sequences. A common assembly paradigm identifies…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-10-21 Giulia Guidi , Oguz Selvitopi , Marquita Ellis , Leonid Oliker , Katherine Yelick , Aydin Buluc

Motivation: De Bruijn graphs have been proposed as a data structure to facilitate the analysis of related whole genome sequences, in both a population and comparative genomic settings. However, current approaches do not scale well to many…

Data Structures and Algorithms · Computer Science 2016-02-19 Ilia Minkin , Son Pham , Paul Medvedev

Genome assembly is a prominent problem studied in bioinformatics, which computes the source string using a set of its overlapping substrings. Classically, genome assembly uses assembly graphs built using this set of substrings to compute…

Data Structures and Algorithms · Computer Science 2024-09-24 Saumya Talera , Parth Bansal , Shabnam Khan , Shahbaz Khan

De novo DNA assembly is a fundamental task in Bioinformatics, and finding Eulerian paths on de Bruijn graphs is one of the dominant approaches to it. In most of the cases, there may be no one order for the de Bruijn graph that works well…

Data Structures and Algorithms · Computer Science 2018-05-15 Diego Díaz-Domínguez , Djamal Belazzougui , Travis Gagie , Veli Mäkinen , Gonzalo Navarro , Simon J. Puglisi

De novo genome assembly is the process of stitching short DNA sequences to generate longer DNA sequences, without using any reference sequence for alignment. It enables high-throughput genome sequencing and thus accelerates the discovery of…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-01-16 Da Yan , Hongzhi Chen , James Cheng , Zhenkun Cai , Bin Shao

Background - The process of generating raw genome sequence data continues to become cheaper, faster, and more accurate. However, assembly of such data into high-quality, finished genome sequences remains challenging. Many genome assembly…

Genomics · Quantitative Biology 2015-02-02 Keith R. Bradnam , Joseph N. Fass , Anton Alexandrov , Paul Baranay , Michael Bechner , İnanç Birol , Sébastien Boisvert , Jarrod A. Chapman , Guillaume Chapuis , Rayan Chikhi , Hamidreza Chitsaz , Wen-Chi Chou , Jacques Corbeil , Cristian Del Fabbro , T. Roderick Docking , Richard Durbin , Dent Earl , Scott Emrich , Pavel Fedotov , Nuno A. Fonseca , Ganeshkumar Ganapathy , Richard A. Gibbs , Sante Gnerre , Élénie Godzaridis , Steve Goldstein , Matthias Haimel , Giles Hall , David Haussler , Joseph B. Hiatt , Isaac Y. Ho , Jason Howard , Martin Hunt , Shaun D. Jackman , David B Jaffe , Erich Jarvis , Huaiyang Jiang , Sergey Kazakov , Paul J. Kersey , Jacob O. Kitzman , James R. Knight , Sergey Koren , Tak-Wah Lam , Dominique Lavenier , François Laviolette , Yingrui Li , Zhenyu Li , Binghang Liu , Yue Liu , Ruibang Luo , Iain MacCallum , Matthew D MacManes , Nicolas Maillet , Sergey Melnikov , Bruno Miguel Vieira , Delphine Naquin , Zemin Ning , Thomas D. Otto , Benedict Paten , Octávio S. Paulo , Adam M. Phillippy , Francisco Pina-Martins , Michael Place , Dariusz Przybylski , Xiang Qin , Carson Qu , Filipe J Ribeiro , Stephen Richards , Daniel S. Rokhsar , J. Graham Ruby , Simone Scalabrin , Michael C. Schatz , David C. Schwartz , Alexey Sergushichev , Ted Sharpe , Timothy I. Shaw , Jay Shendure , Yujian Shi , Jared T. Simpson , Henry Song , Fedor Tsarev , Francesco Vezzi , Riccardo Vicedomini , Jun Wang , Kim C. Worley , Shuangye Yin , Siu-Ming Yiu , Jianying Yuan , Guojie Zhang , Hao Zhang , Shiguo Zhou , Ian F. Korf