Related papers: LUMPY: A probabilistic framework for structural va…

A Note on Optimal Sampling Strategy for Structural Variant Detection Using Optical Mapping

Structural variants compose the majority of human genetic variation, but are difficult to assess using current genomic sequencing technologies. Optical mapping technologies, which measure the size of chromosomal fragments between labeled…

Quantitative Methods · Quantitative Biology 2019-10-10 Weiwei Li, Jan Hannig, Corbin Jones

Bayesian Variable Selection with Structure Learning: Applications in Integrative Genomics

Significant advances in biotechnology have allowed for simultaneous measurement of molecular data points across multiple genomic and transcriptomic levels from a single tumor/cancer sample. This has motivated systematic approaches to…

Methodology · Statistics 2015-08-13 Suprateek Kundu, Minsuk Shin, Yichen Cheng, Ganiraju Manyam, Bani K. Mallick, Veera Baladandayuthapani

Integrated Bayesian non-parametric spatial modeling for cross-sample identification of spatially variable genes

Spatial transcriptomics has revolutionized tissue analysis by simultaneously mapping gene expression, spatial topography, and histological context across consecutive tissue sections, enabling systematic investigation of spatial…

Applications · Statistics 2025-10-24 Meng Zhou, Shuangge Ma, Mengyun Wu

Representing and decomposing genomic structural variants as balanced integer flows on sequence graphs

The study of genomic variation has provided key insights into the functional role of mutations. Predominantly, studies have focused on single nucleotide variants (SNV), which are relatively easy to detect and can be described with rich…

Genomics · Quantitative Biology 2015-09-04 Daniel R. Zerbino, Tracy Ballinger, Benedict Paten, Glenn Hickey, David Haussler

Probabilistic Robustness Analysis in High Dimensional Space: Application to Semantic Segmentation Network

Semantic segmentation networks (SSNs) are central to safety-critical applications such as medical imaging and autonomous driving, where robustness under uncertainty is essential. However, existing probabilistic verification methods often…

Computer Vision and Pattern Recognition · Computer Science 2025-11-18 Navid Hashemi, Samuel Sasaki, Diego Manzanas Lopez, Lars Lindemann, Ipek Oguz, Meiyi Ma, Taylor T. Johnson

Integrating Large Language Models for Genetic Variant Classification

The classification of genetic variants, particularly Variants of Uncertain Significance (VUS), poses a significant challenge in clinical genetics and precision medicine. Large Language Models (LLMs) have emerged as transformative tools in…

Genomics · Quantitative Biology 2024-11-11 Youssef Boulaimen, Gabriele Fossi, Leila Outemzabet, Nathalie Jeanray, Oleksandr Levenets, Stephane Gerart, Sebastien Vachenc, Salvatore Raieli, Joanna Giemza

Probabilistic Inference for Structural Health Monitoring: New Modes of Learning from Data

In data-driven SHM, the signals recorded from systems in operation can be noisy and incomplete. Data corresponding to each of the operational, environmental, and damage states are rarely available a priori; furthermore, labelling to…

Machine Learning · Statistics 2021-03-03 Lawrence A. Bull, Paul Gardner, Timothy J. Rogers, Elizabeth J. Cross, Nikolaos Dervilis, Keith Worden

CLEVER: Clique-Enumerating Variant Finder

Next-generation sequencing techniques have facilitated a large scale analysis of human genetic variation. Despite the advances in sequencing speeds, the computational discovery of structural variants is not yet standard. It is likely that…

Genomics · Quantitative Biology 2015-01-14 Tobias Marschall, Ivan Costa, Stefan Canzar, Markus Bauer, Gunnar Klau, Alexander Schliep, Alexander Schönhuth

Cancer classification and pathway discovery using non-negative matrix factorization

Extracting genetic information from a full range of sequencing data is important for understanding diseases. We propose a novel method to effectively explore the landscape of genetic mutations and aggregate them to predict cancer type. We…

Genomics · Quantitative Biology 2018-10-10 Zexian Zeng, Andy Vo, Chengsheng Mao, Susan E Clare, Seema A Khan, Yuan Luo

Sensitive Long-Indel-Aware Alignment of Sequencing Reads

The tremendous advances in high-throughput sequencing technologies have made population-scale sequencing as performed in the 1000 Genomes project and the Genome of the Netherlands project possible. Next-generation sequencing has allowed…

Genomics · Quantitative Biology 2013-03-15 Tobias Marschall, Alexander Schönhuth

High Dimensional Classification with combined Adaptive Sparse PLS and Logistic Regression

Motivation: The high dimensionality of genomic data calls for the development of specific classification methodologies, especially to prevent over-optimistic predictions. This challenge can be tackled by compression and variable selection,…

Methodology · Statistics 2021-04-10 G. Durif, L. Modolo, J. Michaelsson, J. E. Mold, S. Lambert-Lacroix, F. Picard

Spatial clustering of array CGH features in combination with hierarchical multiple testing

We propose a new approach for clustering DNA features using array CGH data from multiple tumor samples. We distinguish data-collapsing: joining contiguous DNA clones or probes with extremely similar data into regions, from clustering:…

Applications · Statistics 2010-12-21 Kyung In Kim, Etienne Roquain, Mark Van De Wiel

Integrative Learning of Structured High-Dimensional Data from Multiple Datasets

Integrative learning of multiple datasets has the potential to mitigate the challenge of small $n$ and large $p$ that is often encountered in analysis of big biomedical data such as genomics data. Detection of weak yet important signals can…

Methodology · Statistics 2022-07-04 Changgee Chang, Zongyu Dai, Jihwan Oh, Qi Long

A Modular Open Source Framework for Genomic Variant Calling

Variant calling is a fundamental task in genomic research, essential for detecting genetic variations such as single nucleotide polymorphisms (SNPs) and insertions or deletions (indels). This paper presents an enhancement to DeepChem, a…

Quantitative Methods · Quantitative Biology 2025-07-29 Ankita Vaishnobi Bisoi, Shreyas V, Jose Siguenza, Bharath Ramsundar

Randomized Methods for Design of Uncertain Systems: Sample Complexity and Sequential Algorithms

In this paper, we study randomized methods for feedback design of uncertain systems. The first contribution is to derive the sample complexity of various constrained control problems. In particular, we show the key role played by the…

Systems and Control · Computer Science 2014-07-22 T. Alamo, R. Tempo, A. Luque, D. R. Ramirez

Structured variable selection in support vector machines

When applying the support vector machine (SVM) to high-dimensional classification problems, we often impose a sparse structure in the SVM to eliminate the influences of the irrelevant predictors. The lasso and other variable selection…

Machine Learning · Statistics 2008-02-22 Seongho Wu, Hui Zou, Ming Yuan

Learning protein sequence embeddings using information from structure

Inferring the structural properties of a protein from its amino acid sequence is a challenging yet important problem in biology. Structures are not known for the vast majority of protein sequences, but structure is critical for…

Machine Learning · Computer Science 2019-10-17 Tristan Bepler, Bonnie Berger

Learning a Loopy Model For Semantic Segmentation Exactly

Learning structured models using maximum margin techniques has become an indispensable tool for computer vision researchers, as many computer vision applications can be cast naturally as an image labeling problem. Pixel-based or…

Machine Learning · Computer Science 2013-09-17 Andreas Christian Mueller, Sven Behnke

A Simple Data-Adaptive Probabilistic Variant Calling Model

Background: Several sources of noise obfuscate the identification of single nucleotide variation (SNV) in next generation sequencing data. For instance, errors may be introduced during library construction and sequencing steps. In addition,…

Genomics · Quantitative Biology 2015-03-05 Steve Hoffmann, Peter F. Stadler, Korbinian Strimmer

Application of Support Vector Machine to detect an association between a disease or trait and multiple SNP variations

After the completion of the human genome sequence was announced, it is evident that interpretation of DNA sequences is an immediate task to work on. For understanding their signals, improvement of present sequence analysis tools and developing…

Computational Complexity · Computer Science 2007-05-23 Gene Kim, MyungHo Kim