Related papers: Structure-Informed Protein Language Model

200 papers

Protein language models (pLMs) pre-trained on vast protein sequence databases excel at various downstream tasks but often lack the structural knowledge essential for some biological applications. To address this, we introduce a method to…

Understanding the relationships between protein sequence, structure and function is a long-standing biological challenge with manifold implications from drug design to our understanding of evolution. Recently, protein language models have…

Quantitative Methods · Quantitative Biology 2024-01-29 Dexiong Chen, Philip Hartout, Paolo Pellizzoni, Carlos Oliver, Karsten Borgwardt

Learning effective protein representations is critical in a variety of tasks in biology such as predicting protein functions. Recent sequence representation learning methods based on Protein Language Models (PLMs) excel in sequence-based…

Quantitative Methods · Quantitative Biology 2023-10-19 Zuobai Zhang, Chuanrui Wang, Minghao Xu, Vijil Chenthamarakshan, Aurélie Lozano, Payel Das, Jian Tang

Learning effective protein representations is critical in a variety of tasks in biology such as predicting protein function or structure. Existing approaches usually pretrain protein language models on a large number of unlabeled amino acid…

Machine Learning · Computer Science 2023-01-31 Zuobai Zhang, Minghao Xu, Arian Jamasb, Vijil Chenthamarakshan, Aurelie Lozano, Payel Das, Jian Tang

Protein representation learning methods have shown great potential to yield useful representation for many downstream tasks, especially on protein classification. Moreover, a few recent studies have shown great promise in addressing…

Machine Learning · Computer Science 2023-04-11 Can Chen, Jingbo Zhou, Fan Wang, Xue Liu, Dejing Dou

This paper demonstrates that language models are strong structure-based protein designers. We present LM-Design, a generic approach to reprogramming sequence-based protein language models (pLMs), that have learned massive sequential…

Machine Learning · Computer Science 2023-02-10 Zaixiang Zheng, Yifan Deng, Dongyu Xue, Yi Zhou, Fei YE, Quanquan Gu

Geometric deep learning has recently achieved great success in non-Euclidean domains, and learning on 3D structures of large biomolecules is emerging as a distinct research area. However, its efficacy is largely constrained due to the…

Machine Learning · Computer Science 2023-10-31 Fang Wu, Lirong Wu, Dragomir Radev, Jinbo Xu, Stan Z. Li

Deep neural-network-based language models (LMs) are increasingly applied to large-scale protein sequence data to predict protein function. However, being largely black-box models and thus challenging to interpret, current protein LM…

Quantitative Methods · Quantitative Biology 2024-08-06 Mai Ha Vu, Rahmad Akbar, Philippe A. Robert, Bartlomiej Swiatczak, Victor Greiff, Geir Kjetil Sandve, Dag Trygve Truslew Haug

Protein structures are important for understanding their functions and interactions. Currently, many protein structure prediction methods are enriching the structure database. Discriminating the origin of structures is crucial for…

Biomolecules · Quantitative Biology 2024-10-24 Wenrui Gou, Wenhui Ge, Yang Tan, Mingchen Li, Guisheng Fan, Huiqun Yu

Protein-specific large language models (Protein LLMs) are revolutionizing protein science by enabling more efficient protein structure prediction, function annotation, and design. While existing surveys focus on specific aspects or…

Understanding protein structure-function relationships is a key challenge in computational biology, with applications across the biotechnology and pharmaceutical industries. While it is known that protein structure directly impacts protein…

Biomolecules · Quantitative Biology 2020-11-02 Nicolas Swenson, Aditi S. Krishnapriyan, Aydin Buluc, Dmitriy Morozov, Katherine Yelick

Understanding biological processes, drug development, and biotechnological advancements requires a detailed analysis of protein structures and functions, a task that is inherently complex and time-consuming in traditional protein research.…

Artificial Intelligence · Computer Science 2025-04-21 Yijia Xiao, Edward Sun, Yiqiao Jin, Qifan Wang, Wei Wang

Protein is linked to almost every life process. Therefore, analyzing the biological structure and property of protein sequences is critical to the exploration of life, as well as disease detection and drug discovery. Traditional protein…

Machine Learning · Computer Science 2021-12-08 Yijia Xiao, Jiezhong Qiu, Ziang Li, Chang-Yu Hsieh, Jie Tang

Proteins adopt multiple structural conformations to perform their diverse biological functions, and understanding these conformations is crucial for advancing drug discovery. Traditional physics-based simulation methods often struggle with…

Biomolecules · Quantitative Biology 2025-03-14 Jiarui Lu, Xiaoyin Chen, Stephen Zhewen Lu, Chence Shi, Hongyu Guo, Yoshua Bengio, Jian Tang

For protein sequence datasets, unlabeled data has greatly outpaced labeled data due to the high cost of wet-lab characterization. Recent deep-learning approaches to protein prediction have shown that pre-training on unlabeled data can yield…

Machine Learning · Computer Science 2020-12-02 Pascal Sturmfels, Jesse Vig, Ali Madani, Nazneen Fatema Rajani

Protein structure prediction remains a challenge in the field of computational biology. Traditional protein structure prediction approaches include template-based modelling (say, homology modelling, and threading), and ab initio. A…

Other Quantitative Biology · Quantitative Biology 2015-07-14 Jianwei Zhu, Haicang Zhang, Chao Wang, Bin Ling, Wei-Mou Zheng, Dongbo Bu

Protein language models (PLMs) have shown promise in improving the understanding of protein sequences, contributing to advances in areas such as function prediction and protein engineering. However, training these models from scratch…

Machine Learning · Computer Science 2024-12-19 Shivasankaran Vanaja Pandi, Bharath Ramsundar

Proteins inherently possess a consistent sequence-structure duality. The abundance of protein sequence data, which can be readily represented as discrete tokens, has driven fruitful developments in protein language models (pLMs). A key…

Computational Engineering, Finance, and Science · Computer Science 2026-05-29 Yi Zhou, Haohao Qu, Yunqing Liu, Shanru Lin, Le Song, Wenqi Fan

Modern Protein Language Models (PLMs) apply transformer-based model architectures from natural language processing to biological sequences, predicting a variety of protein functions and properties. However, protein language has key…

Machine Learning · Computer Science 2026-02-25 Anna Hart, Chi Han, Jeonghwan Kim, Huimin Zhao, Heng Ji

Protein structure tokenization converts 3D structures into discrete or vectorized representations, enabling the integration of structural and sequence data. Despite many recent works on structure tokenization, the properties of the…

Machine Learning · Computer Science 2025-11-14 Zijing Liu, Bin Feng, He Cao, Yu Li