Related papers: AMGC: Adaptive match-based genomic compression alg…

200 papers

Storing and archiving data produced by next-generation sequencing (NGS) is a huge burden for research institutions. Reference-based compression algorithms are effective in dealing with these data. Our work focuses on compressing FASTQ…

Information Theory · Computer Science 2024-04-04 Yuanjian Liu , Huihao Luo , Zhijun Han , Yao Hu , Yehui Yang , Kyle Chard , Sheng Di , Ian Foster , Jiesheng Wu

Motivation: Next Generation Sequencing technologies revolutionized many fields in biology by enabling the fast and cheap sequencing of large amounts of genomic data. The ever increasing sequencing capacities enabled by current sequencing…

Genomics · Quantitative Biology 2012-07-24 Himanshu Asnani , Dinesh Bharadia , Mainak Chowdhury , Idoia Ochoa , Itai Sharon , Tsachy Weissman

(An updated version of this manuscript has been accepted to Scientific Reports in 2016, please refer to http://www.nature.com/articles/srep31900) The highly anticipated transition from next generation sequencing (NGS) to third generation…

Genomics · Quantitative Biology 2016-09-06 Chengxi Ye , Chris Hill , Shigang Wu , Jue Ruan , Zhanshan Ma

Background: Identifying all possible mapping locations of next-generation sequencing (NGS) reads is highly essential in several applications such as prediction of genomic variants or protein binding motifs located in repeat regions, isoform…

Genomics · Quantitative Biology 2020-03-25 Ngoc Hieu Tran , Xin Chen

We propose a new compression scheme for genomic data given as sequence fragments called reads. The scheme uses a reference genome at the decoder side only, freeing the encoder from the burdens of storing references and performing…

Information Theory · Computer Science 2023-02-10 Yotam Gershon , Yuval Cassuto

Genome sequence analysis, which examines the DNA sequences of organisms, drives advances in many critical medical and biotechnological fields. Given its importance and the exponentially growing volumes of genomic sequence data, there are…

We present a Compression Tool, "GenBit Compress", for genetic sequences based on our new proposed "GenBit Compress Algorithm". Our Tool achieves the best compression ratios for Entire Genome (DNA sequences). Significantly better…

Mathematical Software · Computer Science 2010-07-15 P. Raja Rajeswari , Allam Apparo , V. K. Kumar

Technological progress in DNA sequencing boosts genomic database growth at a faster and faster rate. Compression, accompanied by random access capabilities, is the key to maintaining those huge amounts of data. In this paper we present an…

Computational Engineering, Finance, and Science · Computer Science 2011-03-14 Szymon Grabowski , Sebastian Deorowicz

We live in a period where bioinformatics is rapidly expanding. A significant quantity of genomic data has been produced as a result of the advancement of high-throughput genome sequencing technology, raising concerns about the costs…

Quantitative Methods · Quantitative Biology 2023-03-10 Mehedi Hasan Sarkar , Adnan Ferdous Ashrafi

The growing popularity of Large Language Models (LLMs) has sparked interest in context compression for LLMs. However, the performance of previous methods degrades dramatically as compression ratios increase, sometimes even…

Computation and Language · Computer Science 2024-06-18 Zhiwei Cao , Qian Cao , Yu Lu , Ningxin Peng , Luyang Huang , Shanbo Cheng , Jinsong Su

DNA sequencing technology has advanced to a point where storage is becoming the central bottleneck in the acquisition and mining of more data. Large amounts of data are vital for genomics research, and generic compression tools, while…

Information Theory · Computer Science 2016-11-15 Bobbie Chern , Idoia Ochoa , Alexandros Manolakos , Albert No , Kartik Venkat , Tsachy Weissman

Relative compression, where a set of similar strings are compressed with respect to a reference string, is a very effective method of compressing DNA datasets containing multiple similar sequences. Relative compression is fast to perform…

Quantitative Methods · Quantitative Biology 2011-06-21 Shanika Kuruppu , Simon Puglisi , Justin Zobel

Genome sequence analysis is a powerful tool in medical and scientific research. Considering the inevitable sequencing errors and genetic variations, approximate string matching (ASM) has been adopted in practice for genome sequencing.…

Next-generation sequencing (NGS) is a pivotal technique in genome sequencing due to its high throughput, rapid results, cost-effectiveness, and enhanced accuracy. Its significance extends across various domains, playing a crucial role in…

Genomics · Quantitative Biology 2025-04-28 Fathima Nuzla Ismail , Shanika Amarasoma

Motivation: Data volumes generated by next-generation sequencing technologies are now a major concern, both for storage and transmission. This triggered the need for more efficient methods than general purpose compression tools, such as…

Data Structures and Algorithms · Computer Science 2014-12-19 Gaëtan Benoit , Claire Lemaitre , Dominique Lavenier , Guillaume Rizk

Since their invention, generative adversarial networks (GANs) have shown outstanding results in many applications. Generative adversarial networks are powerful yet resource-hungry deep-learning models. Their main difference from ordinary…

Machine Learning · Computer Science 2021-08-17 Dina Tantawy , Mohamed Zahran , Amr Wassal

Modern biological science produces vast amounts of genomic sequence data. This is fuelling the need for efficient algorithms for sequence compression and analysis. Data compression and the associated techniques coming from information…

Data Structures and Algorithms · Computer Science 2011-09-05 Heba Afify , Muhammad Islam , Manal Abdel Wahed

Motivation The Burrows-Wheeler transform (BWT) is the foundation of many algorithms for compression and indexing of text data, but the cost of computing the BWT of very large string collections has prevented these techniques from being…

Data Structures and Algorithms · Computer Science 2015-03-20 Anthony J. Cox , Markus J. Bauer , Tobias Jakobi , Giovanna Rosone

Genome and metagenome comparisons based on large amounts of next-generation sequencing (NGS) data pose significant challenges for alignment-based approaches due to the huge data size and the relatively short length of the reads.…

Quantitative Methods · Quantitative Biology 2018-03-28 Jie Ren , Xin Bai , Yang Young Lu , Kujin Tang , Ying Wang , Gesine Reinert , Fengzhu Sun

Genome sequence analysis has enabled significant advancements in medical and scientific areas such as personalized medicine, outbreak tracing, and the understanding of evolution. Unfortunately, it is currently bottlenecked by the…
