Related papers: Learning quantitative sequence-function relationsh…
Functional protein-protein interactions are crucial in most cellular processes. They enable multi-protein complexes to assemble and to remain stable, and they allow signal transduction in various pathways. Functional interactions between…
Motivated by data-rich experiments in transcriptional regulation and sensory neuroscience, we consider the following general problem in statistical inference. When exposed to a high-dimensional signal S, a system of interest computes a…
Data of sequential nature arise in many application domains in forms of, e.g. textual data, DNA sequences, and software execution traces. Different research disciplines have developed methods to learn sequence models from such datasets: (i)…
Gene-gene interactions are often regarded as playing significant roles in influencing variabilities of complex traits. Although much research has been devoted to this area, to date a comprehensive statistical model that addresses the…
Sequence comparison is a widely used computational technique in modern molecular biology. In spite of the frequent use of sequence comparisons the important problem of assigning statistical significance to a given degree of similarity is…
Background: Alignment of biological sequences such as DNA, RNA or proteins is one of the most widely used tools in computational bioscience. All existing alignment algorithms rely on heuristic scoring schemes based on biological expertise.…
Sequential neuronal activity underlies a wide range of processes in the brain. Neuroscientific evidence for neuronal sequences has been reported in domains as diverse as perception, motor control, speech, spatial navigation and memory.…
A critical task in systems biology is the identification of genes that interact to control cellular processes by transcriptional activation of a set of target genes. Many methods have been developed to use statistical correlations in…
The functions of proteins and RNAs are determined by a myriad of interactions between their constituent residues, but most quantitative models of how molecular phenotype depends on genotype must approximate this by simple additive effects.…
A key challenge in causal inference from observational studies is the identification and estimation of causal effects in the presence of unmeasured confounding. In this paper, we introduce a novel approach for causal inference that…
Applying machine learning to biological sequences - DNA, RNA and protein - has enormous potential to advance human health, environmental sustainability, and fundamental biological understanding. However, many existing machine learning…
Proteins, essential to biological systems, perform functions intricately linked to their three-dimensional structures. Understanding the relationship between protein structures and their amino acid sequences remains a core challenge in…
When can reliable inference be drawn in the "Big Data" context? This paper presents a framework for answering this fundamental question in the context of correlation mining, with implications for general large scale inference. In large…
Polymorphism in the peripheral sensory system (e.g., congenital individual differences in photopigment configuration) is important in diverse research fields, ranging from evolutionary biology to engineering, because of its potential…
In the course of evolution, proteins undergo important changes in their amino acid sequences, while their three-dimensional folded structure and their biological function remain remarkably conserved. Thanks to modern sequencing techniques,…
Quantum circuits that generate coherent superpositions of stochastic processes are key to many downstream quantum-accelerated tasks, such as risk analysis, importance sampling, and DNA sequencing. However, traditional methods for designing…
Mathematical models play an increasingly important role in the interpretation of biological experiments. Studies often present a model that generates the observations, connecting hypothesized process to an observed pattern. Such generative…
Data mining is the task of discovering interesting patterns from large amounts of data. There are many data mining tasks, such as classification, clustering, association rule mining, and sequential pattern mining. Sequential pattern mining…
Similarity functions measure how comparable pairs of elements are, and play a key role in a wide variety of applications, e.g., notions of Individual Fairness abiding by the seminal paradigm of Dwork et al., as well as Clustering problems.…
Analyzing time series in the frequency domain enables the development of powerful tools for investigating the second-order characteristics of multivariate processes. Parameters like the spectral density matrix and its inverse, the coherence…