Related papers: Learning quantitative sequence-function relationsh…

Inferring interaction partners from protein sequences using mutual information

Functional protein-protein interactions are crucial in most cellular processes. They enable multi-protein complexes to assemble and to remain stable, and they allow signal transduction in various pathways. Functional interactions between…

Biomolecules · Quantitative Biology 2018-11-14 Anne-Florence Bitbol

Parametric inference in the large data limit using maximally informative models

Motivated by data-rich experiments in transcriptional regulation and sensory neuroscience, we consider the following general problem in statistical inference. When exposed to a high-dimensional signal S, a system of interest computes a…

Quantitative Methods · Quantitative Biology 2013-12-16 Justin B. Kinney, Gurinder S. Atwal

An Interdisciplinary Comparison of Sequence Modeling Methods for Next-Element Prediction

Data of sequential nature arise in many application domains in forms of, e.g. textual data, DNA sequences, and software execution traces. Different research disciplines have developed methods to learn sequence models from such datasets: (i)…

Machine Learning · Statistics 2018-11-02 Niek Tax, Irene Teinemaa, Sebastiaan J. van Zelst

A Bayesian Semiparametric Approach to Learning About Gene-Gene Interactions in Case-Control Studies

Gene-gene interactions are often regarded as playing significant roles in influencing variabilities of complex traits. Although much research has been devoted to this area, to date a comprehensive statistical model that addresses the…

Applications · Statistics 2018-04-18 Durba Bhattacharya, Sourabh Bhattacharya

Finite Width Model Sequence Comparison

Sequence comparison is a widely used computational technique in modern molecular biology. In spite of the frequent use of sequence comparisons the important problem of assigning statistical significance to a given degree of similarity is…

Quantitative Methods · Quantitative Biology 2007-05-23 Ralf Bundschuh, Nicholas Chia

Sequence alignment and mutual information

Background: Alignment of biological sequences such as DNA, RNA or proteins is one of the most widely used tools in computational bioscience. All existing alignment algorithms rely on heuristic scoring schemes based on biological expertise.…

Genomics · Quantitative Biology 2008-10-27 Orion Penner, Peter Grassberger, Maya Paczuski

Neuronal Sequence Models for Bayesian Online Inference

Sequential neuronal activity underlies a wide range of processes in the brain. Neuroscientific evidence for neuronal sequences has been reported in domains as diverse as perception, motor control, speech, spatial navigation and memory.…

Adaptation and Self-Organizing Systems · Physics 2020-04-03 Sascha Frölich, Dimitrije Marković, Stefan J. Kiebel

Multivariate dependence and genetic networks inference

A critical task in systems biology is the identification of genes that interact to control cellular processes by transcriptional activation of a set of target genes. Many methods have been developed to use statistical correlations in…

Quantitative Methods · Quantitative Biology 2010-11-24 Adam A. Margolin, Kai Wang, Andrea Califano, Ilya Nemenman

Deep generative models of genetic variation capture mutation effects

The functions of proteins and RNAs are determined by a myriad of interactions between their constituent residues, but most quantitative models of how molecular phenotype depends on genotype must approximate this by simple additive effects.…

Quantitative Methods · Quantitative Biology 2017-12-19 Adam J. Riesselman, John B. Ingraham, Debora S. Marks

The Promises of Parallel Outcomes

A key challenge in causal inference from observational studies is the identification and estimation of causal effects in the presence of unmeasured confounding. In this paper, we introduce a novel approach for causal inference that…

Methodology · Statistics 2022-10-17 Ying Zhou, Dingke Tang, Dehan Kong, Linbo Wang

Biological Sequence Kernels with Guaranteed Flexibility

Applying machine learning to biological sequences - DNA, RNA and protein - has enormous potential to advance human health, environmental sustainability, and fundamental biological understanding. However, many existing machine learning…

Machine Learning · Statistics 2023-04-11 Alan Nawzad Amin, Eli Nathan Weinstein, Debora Susan Marks

SFM-Protein: Integrative Co-evolutionary Pre-training for Advanced Protein Sequence Representation

Proteins, essential to biological systems, perform functions intricately linked to their three-dimensional structures. Understanding the relationship between protein structures and their amino acid sequences remains a core challenge in…

Quantitative Methods · Quantitative Biology 2024-11-04 Liang He, Peiran Jin, Yaosen Min, Shufang Xie, Lijun Wu, Tao Qin, Xiaozhuan Liang, Kaiyuan Gao, Yuliang Jiang, Tie-Yan Liu

Foundational principles for large scale inference: Illustrations through correlation mining

When can reliable inference be drawn in the "Big Data" context? This paper presents a framework for answering this fundamental question in the context of correlation mining, with implications for general large scale inference. In large…

Statistics Theory · Mathematics 2015-05-19 Alfred O. Hero, Bala Rajaratnam

Sensory Polymorphism and Behavior: When Machine Vision Meets Monkey Eyes

Polymorphism in the peripheral sensory system (e.g., congenital individual differences in photopigment configuration) is important in diverse research fields, ranging from evolutionary biology to engineering, because of its potential…

Neurons and Cognition · Quantitative Biology 2017-01-10 Satohiro Tajima

Inverse Statistical Physics of Protein Sequences: A Key Issues Review

In the course of evolution, proteins undergo important changes in their amino acid sequences, while their three-dimensional folded structure and their biological function remain remarkably conserved. Thanks to modern sequencing techniques,…

Biomolecules · Quantitative Biology 2019-10-07 Simona Cocco, Christoph Feinauer, Matteo Figliuzzi, Remi Monasson, Martin Weigt

Learning Quantum-Samplers for Stochastic Processes with Quantum Sequence Models

Quantum circuits that generate coherent superpositions of stochastic processes are key to many downstream quantum-accelerated tasks, such as risk analysis, importance sampling, and DNA sequencing. However, traditional methods for designing…

Quantum Physics · Physics 2026-03-26 Ximing Wang, Chengran Yang, Chidambaram Aditya Somasundaram, Jayne Thompson, Mile Gu

Generative models versus underlying symmetries to explain biological pattern

Mathematical models play an increasingly important role in the interpretation of biological experiments. Studies often present a model that generates the observations, connecting hypothesized process to an observed pattern. Such generative…

Populations and Evolution · Quantitative Biology 2014-06-18 Steven A. Frank

Finding Sequential Patterns from Large Sequence Data

Data mining is the task of discovering interesting patterns from large amounts of data. There are many data mining tasks, such as classification, clustering, association rule mining, and sequential pattern mining. Sequential pattern mining…

Databases · Computer Science 2010-02-08 Mahdi Esmaeili, Fazekas Gabor

Comparing Apples to Oranges: Learning Similarity Functions for Data Produced by Different Distributions

Similarity functions measure how comparable pairs of elements are, and play a key role in a wide variety of applications, e.g., notions of Individual Fairness abiding by the seminal paradigm of Dwork et al., as well as Clustering problems.…

Machine Learning · Computer Science 2023-10-24 Leonidas Tsepenekas, Ivan Brugere, Freddy Lecue, Daniele Magazzeni

Frequency Domain Statistical Inference for High-Dimensional Time Series

Analyzing time series in the frequency domain enables the development of powerful tools for investigating the second-order characteristics of multivariate processes. Parameters like the spectral density matrix and its inverse, the coherence…

Methodology · Statistics 2024-01-19 Jonas Krampe, Efstathios Paparoditis