Related papers: BOAssembler: a Bayesian Optimization Framework to …
Motivation: Transcriptome sequencing has long been the favored method for quickly and inexpensively obtaining the sequences for a large number of genes from an organism with no reference genome. With the rapidly increasing throughputs and…
RNA-seq allows detection and precise quantification of transcripts, provides comprehensive understanding of exon/intron boundaries, aids discovery of alternatively spliced isoforms and fusion transcripts along with measurement of…
De novo genome assembly is challenging in highly repetitive regions; however, reference-guided assemblers often suffer from bias. We propose a framework for pangenome-guided sequence assembly, which can resolve short-read data in complex…
Motivation: The mapping of RNA-seq reads to their transcripts of origin is a fundamental task in transcript expression estimation and differential expression scoring. Where ambiguities in mapping exist due to transcripts sharing sequence,…
The study of functional genomics, particularly in non-model organisms, has been dramatically improved over the last few years by the use of transcriptomes and RNAseq. While these studies are potentially extremely powerful, a computationally…
High-throughput technologies have become the practice of choice for comparative studies in biomedical applications. The limited number of sample points, due to sequencing cost or access to organisms of interest, necessitates the development of…
Motivation: Assigning RNA-seq reads to their transcript of origin is a fundamental task in transcript expression estimation. Where ambiguities in assignments exist due to transcripts sharing sequence, e.g. alternative isoforms or alleles,…
Assessing the correctness of genome assemblies is an important step in any genome project. Several methods exist, but most are computationally intensive and, in some cases, inappropriate. Here I present baa.pl, a fast and easy-to-use…
Bayesian optimization (BO) is a sample efficient approach to automatically tune the hyperparameters of machine learning models. In practice, one frequently has to solve similar hyperparameter tuning problems sequentially. For example, one…
RNA-sequencing (RNA-seq) has become an exemplar technology in modern biology and clinical applications over the past decade. It has gained immense popularity in recent years, driven by continuous efforts of the bioinformatics community…
High read depth can be used to assemble short sequence repeats. Existing genome assemblers fail in repetitive regions longer than the average read. I propose a new algorithm for DNA assembly which uses the relative frequency of reads…
Recently, ultra high-throughput sequencing of RNA (RNA-Seq) has been developed as an approach for analysis of gene expression. By obtaining tens or even hundreds of millions of reads of transcribed sequences, an RNA-Seq experiment can offer…
Transcriptome assembly from RNA-Seq reads is an active area of bioinformatics research. The ever-declining cost and the increasing depth of RNA-Seq have provided unprecedented opportunities to better identify expressed transcripts. However,…
Recent advances in molecular biology allow the quantification of the transcriptome and scoring transcripts as differentially or equally expressed between two biological conditions. Although these two tasks are closely linked, the available…
RNA sequencing (RNA-seq) is the conventional genome-scale approach used to capture the expression levels of all detectable genes in a biological sample. This is now regularly used for population-based studies designed to identify genetic…
RNA-Seq data characteristically exhibits large variances, which need to be appropriately accounted for in the model. We first explore the effects of this variability on the maximum likelihood estimator (MLE) of the overdispersion parameter…
We describe different Bayesian ensemble refinement methods, examine their interrelation, and discuss their practical application. With ensemble refinement, the properties of dynamic and partially disordered (bio)molecular structures can be…
Large biological datasets are being produced at a rapid pace and create substantial storage challenges, particularly in the domain of high-throughput sequencing (HTS). Most approaches currently used to store HTS data are either unable to…
Despite their theoretical appeal, Bayesian neural networks (BNNs) lag behind in real-world adoption, mainly due to persistent concerns about their scalability, accessibility, and reliability. In this work, we develop the…
Single cell combinatorial indexing RNA sequencing (sci-RNA-seq) is a powerful method for recovering gene expression data from an exponentially scalable number of individual cells or nuclei. However, sci-RNA-seq is a complex protocol that…