Related papers: AnySeq: A High Performance Sequence Alignment Libr…

200 papers

In recent years, the rapidly increasing number of reads produced by next-generation sequencing (NGS) technologies has driven the demand for efficient implementations of sequence alignments in bioinformatics. However, current…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-05-17 André Müller, Bertil Schmidt, Richard Membarth, Roland Leißa, Sebastian Hack

The use of deep learning models in computational biology has increased massively in recent years, and it is expected to continue with the current advances in fields such as Natural Language Processing. These models, although able to…

State-of-the-art multiple sequence alignment (MSA) algorithms are based on progressive approaches that rely on pairwise sequence alignment (PSA) to generate guide trees to align all sequences. Given an evidenced explosion in genomic data…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-05-28 Miguel Graça, Aleksandar Ilic

Recent advances in computational methods for designing biological sequences have sparked the development of metrics to evaluate these methods' performance in terms of the fidelity of the designed sequences to a target distribution and their…

PyamilySeq is a Python-based tool designed for interpretable gene clustering and pangenomic inference, supporting analyses at both species and genus levels. It facilitates the clustering of gene sequences into families based on sequence…

Genomics · Quantitative Biology 2024-07-30 Nicholas J. Dimonaco

fairseq is an open-source sequence modeling toolkit that allows researchers and developers to train custom models for translation, summarization, language modeling, and other text generation tasks. The toolkit is based on PyTorch and…

Computation and Language · Computer Science 2019-04-03 Myle Ott, Sergey Edunov, Alexei Baevski, Angela Fan, Sam Gross, Nathan Ng, David Grangier, Michael Auli

Motivation: Recent advances in sequencing technologies promise ultra-long reads of ~100 kilobases (kb) on average, full-length mRNA or cDNA reads in high throughput and genomic contigs over 100 megabases (Mb) in length. Existing…

Genomics · Quantitative Biology 2018-09-17 Heng Li

Past work in natural language processing interpretability focused mainly on popular classification tasks while largely overlooking generation settings, partly due to a lack of dedicated tools. In this work, we introduce Inseq, a Python…

Computation and Language · Computer Science 2023-09-08 Gabriele Sarti, Nils Feldhus, Ludwig Sickert, Oskar van der Wal, Malvina Nissim, Arianna Bisazza

Sequences of nucleotides (for DNA and RNA) or amino acids (for proteins) are central objects in biology. Among the most important computational problems is that of sequence alignment, i.e. arranging sequences from different organisms in…

Quantitative Methods · Quantitative Biology 2020-12-08 Anna Paola Muntoni, Andrea Pagnani, Martin Weigt, Francesco Zamponi

Genome sequences contain hundreds of millions of DNA base pairs. Finding the degree of similarity between two genomes requires executing a compute-intensive dynamic programming algorithm, such as Smith-Waterman. Traditional von Neumann…

Emerging Technologies · Computer Science 2019-01-21 Roman Kaplan, Leonid Yavits, Ran Ginosar

DNA sequence alignment is important today as it is usually the first step in finding gene mutation, evolutionary similarities, protein structure, drug development and cancer treatment. Covid-19 is one recent example. There are many…

Genomics · Quantitative Biology 2023-06-01 Suchindra, Preetam Nagaraj

Designing novel protein sequences for a desired 3D topological fold is a fundamental yet non-trivial task in protein engineering. Challenges exist due to the complex sequence-fold relationship, as well as the difficulty of capturing the…

Machine Learning · Computer Science 2021-06-25 Yue Cao, Payel Das, Vijil Chenthamarakshan, Pin-Yu Chen, Igor Melnyk, Yang Shen

Ancestral sequence reconstruction is a key task in computational biology. It consists in inferring a molecular sequence at an ancestral species of a known phylogeny, given descendant sequences at the tip of the tree. In addition to its many…

Populations and Evolution · Quantitative Biology 2022-07-27 Brandon Legried, Sebastien Roch

Motivation: The mapping of RNA-seq reads to their transcripts of origin is a fundamental task in transcript expression estimation and differential expression scoring. Where ambiguities in mapping exist due to transcripts sharing sequence,…

Genomics · Quantitative Biology 2015-01-28 James Hensman, Peter Glaus, Antti Honkela, Magnus Rattray

Evolutionary sparse learning (ESL) uses a supervised machine learning approach, Least Absolute Shrinkage and Selection Operator (LASSO), to build models explaining the relationship between a hypothesis and the variation across genomic…

Populations and Evolution · Quantitative Biology 2025-01-10 Maxwell Sanderford, Sudip Sharma, Glen Stecher, Jun Liu, Jieping Ye, Sudhir Kumar

Summary: Accurate phenotype prediction from genomic sequences is a highly coveted task in biological and medical research. While machine-learning holds the key to accurate prediction in a variety of fields, the complexity of biological data…

High throughput sequencing of RNA (RNA-Seq) can provide us with millions of short fragments of RNA transcripts from a sample. How to better recover the original RNA transcripts from those fragments (RNA-Seq assembly) is still a difficult…

Genomics · Quantitative Biology 2019-02-15 Shunfu Mao, Yihan Jiang, Edwin Basil Mathew, Sreeram Kannan

Artificial intelligence (AI) tools are gaining more and more ground each year in bioinformatics. Learning algorithms can be taught easily by using the existing enormous biological databases, and the resulting models can be used for the…

Biomolecules · Quantitative Biology 2017-08-15 Balazs Szalkai, Vince Grolmusz

Genome sequence analysis plays a pivotal role in enabling many medical and scientific advancements in personalized medicine, outbreak tracing, and forensics. However, the analysis of genome sequencing data is currently bottlenecked by the…

Hardware Architecture · Computer Science 2021-11-04 Damla Senol Cali

Computational complexity is a key limitation of genomic analyses. Thus, over the last 30 years, researchers have proposed numerous fast heuristic methods that provide computational relief. Comparing genomic sequences is one of the most…
