Related papers: Machine learning methods to study sequence-ensembl…
A protein's function depends critically on its conformational ensemble, a collection of energy weighted structures whose balance depends on temperature and environment. Though recent deep learning (DL) methods have substantially advanced…
Proteins, essential to biological systems, perform functions intricately linked to their three-dimensional structures. Understanding the relationship between protein structures and their amino acid sequences remains a core challenge in…
Intrinsically disordered proteins (IDPs) constitute a broad set of proteins with few uniting and many diverging properties. IDPs-and intrinsically disordered regions (IDRs) interspersed between folded domains-are generally characterized as…
Machine-learning models that learn from data to predict how protein sequence encodes function are emerging as a useful protein engineering tool. However, when using these models to suggest new protein designs, one must deal with the vast…
Disordered proteins play essential roles in myriad cellular processes, yet their structural characterization remains a major challenge due to their dynamic and heterogeneous nature. We here present a community-driven initiative to address…
Advances in deep learning have opened an era of abundant and accurate predicted protein structures; however, similar progress in protein ensembles has remained elusive. This review highlights several recent research directions towards…
Many aspects of the study of protein folding and dynamics have been affected by the recent advances in machine learning. Methods for the prediction of protein structures from their sequences are now heavily based on machine learning tools.…
Machine learning (ML)-guided directed evolution is a new paradigm for biological design that enables optimization of complex functions. ML methods use data to predict how sequence maps to function without requiring a detailed model of the…
Protein design has the potential to revolutionize biotechnology and medicine. While most efforts have focused on proteins with well-defined structures, increased recognition of the functional significance of intrinsically disordered…
Protein engineering seeks to identify protein sequences with optimized properties. When guided by machine learning, protein sequence generation methods can draw on prior knowledge and experimental efforts to improve this process. In this…
Intrinsically disordered proteins and regions are increasingly appreciated for their abundance in the proteome and the many functional roles they play in the cell. In this short review, we describe a variety of approaches used to obtain…
Numerous cellular functions rely on protein$\unicode{x2013}$protein interactions. Efforts to comprehensively characterize them remain challenged however by the diversity of molecular recognition mechanisms employed within the proteome. Deep…
Directed evolution of proteins has been the most effective method for protein engineering. However, a new paradigm is emerging, fusing the library generation and screening approaches of traditional directed evolution with computation…
Composed of amino acid chains that influence how they fold and thus dictating their function and features, proteins are a class of macromolecules that play a central role in major biological processes and are required for the structure,…
Amino acid sequence portrays most intrinsic form of a protein and expresses primary structure of protein. The order of amino acids in a sequence enables a protein to acquire a particular stable conformation that is responsible for the…
Learning effective protein representations is critical in a variety of tasks in biology such as predicting protein functions. Recent sequence representation learning methods based on Protein Language Models (PLMs) excel in sequence-based…
Mapping between sequence and structure is currently an open problem in structural biology. Despite many experimental and computational efforts it is not clear yet how the structure is encoded in the sequence. Answering this question may…
Proteins created by combinatorial methods in vitro are an important source of information for understanding sequence-structure-function relationships. Alignments of folded proteins from combinatorial libraries can be analyzed using methods…
Deep generative models that learn from the distribution of natural protein sequences and structures may enable the design of new proteins with valuable functions. While the majority of today's models focus on generating either sequences or…
Deep learning is catalyzing a scientific revolution fueled by big data, accessible toolkits, and powerful computational resources, impacting many fields including protein structural modeling. Protein structural modeling, such as predicting…