Related papers: A generative angular model of protein structure ev…
The sequence of a protein is constrained not only by its physical and biochemical properties under current selection, but also by features of its past evolutionary history. Understanding the extent and the form that these evolutionary…
Proteins, by virtue of their central role in most biological processes, represent one of the key subjects of the study of molecular evolution. Inherent to the indispensability of proteins for living cells is the fact that a given protein…
We develop a path-based approach to continuous-time random walks on networks with arbitrarily weighted edges. We describe an efficient numerical algorithm for calculating statistical properties of the stochastic path ensemble. After…
Proteins, essential to biological systems, perform functions intricately linked to their three-dimensional structures. Understanding the relationship between protein structures and their amino acid sequences remains a core challenge in…
Generative modeling has become a central paradigm in protein research, extending machine learning beyond structure prediction toward sequence design, backbone generation, inverse folding, and biomolecular interaction modeling. However, the…
Proteins are macromolecules that mediate a significant fraction of the cellular processes that underlie life. An important task in bioengineering is designing proteins with specific 3D structures and chemical properties which enable…
In the course of evolution, proteins undergo important changes in their amino acid sequences, while their three-dimensional folded structure and their biological function remain remarkably conserved. Thanks to modern sequencing techniques,…
The evolutionary trajectory of a protein through sequence space is constrained by function and three-dimensional (3D) structure. Residues in spatial proximity tend to co-evolve, yet attempts to invert the evolutionary record to identify…
Natural protein sequences somehow encode the structural forms that these molecules adopt. Recent developments in structure prediction are agnostic to the mechanisms by which proteins fold and represent them as static objects. However, the…
During their evolution, proteins explore sequence space via an interplay between random mutations and phenotypic selection. Here we build upon recent progress in reconstructing data-driven fitness landscapes for families of homologous…
We present a sequence-based probabilistic formalism that directly addresses co-operative effects in networks of interacting positions in proteins, providing significantly improved contact prediction, as well as accurate quantitative…
Despite the importance of a thermodynamically stable structure with a conserved fold for protein function, almost all evolutionary models neglect site-site correlations that arise from physical interactions between neighboring amino acid…
It has recently been discovered that many biological systems, when represented as graphs, exhibit a scale-free topology. One such system is the set of structural relationships among protein domains. The scale-free nature of this and other…
We analyse a simple discrete-time stochastic process for the theoretical modeling of the evolution of protein lengths. At every step of the process a new protein is produced as a modification of one of the proteins already existing and its…
Studies of coevolution of amino acids within and between proteins have revealed two types of coevolving units: coevolving contacts, which are pairs of amino acids distant along the sequence but in contact in the three-dimensional structure,…
We introduce a data-driven epistatic model of protein evolution, capable of generating evolutionary trajectories spanning very different time scales, from individual mutations to diverged homologs. Our in silico evolution…
The analysis of the three-dimensional structure of proteins is an important topic in molecular biochemistry. Structure plays a critical role in defining the function of proteins and is more strongly conserved than amino acid sequence over…
Protein sequences serve as a natural record of the evolutionary constraints that shape their functional structures. We show that it is possible to use only sequence information to go beyond predicting native structures and global stability…
Inferring the structural properties of a protein from its amino acid sequence is a challenging yet important problem in biology. Structures are not known for the vast majority of protein sequences, but structure is critical for…
A Profile Mixture Model is a model of protein evolution, describing sequence data in which sites are assumed to follow many related substitution processes on a single evolutionary tree. The processes depend in part on different amino acid…