Related papers: Structure-aware Protein Self-supervised Learning

Geometric Self-Supervised Pretraining on 3D Protein Structures using Subgraphs

Protein representation learning aims to learn informative protein embeddings capable of addressing crucial biological questions, such as protein function prediction. Although sequence-based transformer models have shown promising results by…

Quantitative Methods · Quantitative Biology 2024-10-22 Michail Chatzianastasis, Yang Zhang, George Dasoulas, Michalis Vazirgiannis

Protein Representation Learning by Geometric Structure Pretraining

Learning effective protein representations is critical in a variety of tasks in biology such as predicting protein function or structure. Existing approaches usually pretrain protein language models on a large number of unlabeled amino acid…

Machine Learning · Computer Science 2023-01-31 Zuobai Zhang, Minghao Xu, Arian Jamasb, Vijil Chenthamarakshan, Aurélie Lozano, Payel Das, Jian Tang

Deep Supervised and Convolutional Generative Stochastic Network for Protein Secondary Structure Prediction

Predicting protein secondary structure is a fundamental problem in protein structure prediction. Here we present a new supervised generative stochastic network (GSN) based method to predict local secondary structure with deep hierarchical…

Quantitative Methods · Quantitative Biology 2014-03-07 Jian Zhou, Olga G. Troyanskaya

A Systematic Study of Joint Representation Learning on Protein Sequences and Structures

Learning effective protein representations is critical in a variety of tasks in biology such as predicting protein functions. Recent sequence representation learning methods based on Protein Language Models (PLMs) excel in sequence-based…

Quantitative Methods · Quantitative Biology 2023-10-19 Zuobai Zhang, Chuanrui Wang, Minghao Xu, Vijil Chenthamarakshan, Aurélie Lozano, Payel Das, Jian Tang

GOProteinGNN: Leveraging Protein Knowledge Graphs for Protein Representation Learning

Proteins play a vital role in biological processes and are indispensable for living organisms. Accurate representation of proteins is crucial, especially in drug development. Recently, there has been a notable increase in interest in…

Biomolecules · Quantitative Biology 2026-05-28 Dan Kalifa, Uriel Singer, Kira Radinsky

Protein Secondary Structure Prediction Using 3D Graphs and Relation-Aware Message Passing Transformers

In this study, we tackle the challenging task of predicting secondary structures from protein primary sequences, a pivotal initial stride towards predicting tertiary structures, while yielding crucial insights into protein activity,…

Machine Learning · Computer Science 2025-11-18 Disha Varshney, Samarth Garg, Sarthak Tyagi, Deeksha Varshney, Nayan Deep, Asif Ekbal

Structure-Aligned Protein Language Model

Protein language models (pLMs) pre-trained on vast protein sequence databases excel at various downstream tasks but often lack the structural knowledge essential for some biological applications. To address this, we introduce a method to…

Machine Learning · Computer Science 2025-12-18 Can Chen, David Heurtel-Depeiges, Robert M. Vernon, Christopher James Langmead, Yoshua Bengio, Quentin Fournier

Self-supervised Learning and Graph Classification under Heterophily

Self-supervised learning has shown its promising capability in graph representation learning in recent work. Most existing pre-training strategies usually choose the popular graph neural networks (GNNs), which can be seen as a special form…

Machine Learning · Computer Science 2023-06-16 Yilin Ding, Zhen Liu, Hao Hao

Endowing Protein Language Models with Structural Knowledge

Understanding the relationships between protein sequence, structure and function is a long-standing biological challenge with manifold implications from drug design to our understanding of evolution. Recently, protein language models have…

Quantitative Methods · Quantitative Biology 2024-01-29 Dexiong Chen, Philip Hartout, Paolo Pellizzoni, Carlos Oliver, Karsten Borgwardt

PersGNN: Applying Topological Data Analysis and Geometric Deep Learning to Structure-Based Protein Function Prediction

Understanding protein structure-function relationships is a key challenge in computational biology, with applications across the biotechnology and pharmaceutical industries. While it is known that protein structure directly impacts protein…

Biomolecules · Quantitative Biology 2020-11-02 Nicolas Swenson, Aditi S. Krishnapriyan, Aydin Buluc, Dmitriy Morozov, Katherine Yelick

Integration of Pre-trained Protein Language Models into Geometric Deep Learning Networks

Geometric deep learning has recently achieved great success in non-Euclidean domains, and learning on 3D structures of large biomolecules is emerging as a distinct research area. However, its efficacy is largely constrained due to the…

Machine Learning · Computer Science 2023-10-31 Fang Wu, Lirong Wu, Dragomir Radev, Jinbo Xu, Stan Z. Li

Learning the Language of Protein Structure

Representation learning and de novo generation of proteins are pivotal computational biology tasks. Whilst natural language processing (NLP) techniques have proven highly effective for protein sequence modelling, structure modelling…

Quantitative Methods · Quantitative Biology 2025-01-08 Benoit Gaujac, Jérémie Donà, Liviu Copoiu, Timothy Atkinson, Thomas Pierrot, Thomas D. Barrett

Generative Pretrained Autoregressive Transformer Graph Neural Network applied to the Analysis and Discovery of Novel Proteins

We report a flexible language-model based deep learning strategy, applied here to solve complex forward and inverse problems in protein modeling, based on an attention neural network that integrates transformer and graph convolutional…

Biomolecules · Quantitative Biology 2023-10-20 Markus J. Buehler

Structure-Informed Protein Language Model

Protein language models are a powerful tool for learning protein representations through pre-training on vast protein sequence datasets. However, traditional protein language models lack explicit structural supervision, despite its…

Biomolecules · Quantitative Biology 2024-02-09 Zuobai Zhang, Jiarui Lu, Vijil Chenthamarakshan, Aurélie Lozano, Payel Das, Jian Tang

Evaluating representation learning on the protein structure universe

We introduce ProteinWorkshop, a comprehensive benchmark suite for representation learning on protein structures with Geometric Graph Neural Networks. We consider large-scale pre-training and downstream tasks on both experimental and…

Machine Learning · Computer Science 2024-06-21 Arian R. Jamasb, Alex Morehead, Chaitanya K. Joshi, Zuobai Zhang, Kieran Didi, Simon V. Mathis, Charles Harris, Jian Tang, Jianlin Cheng, Pietro Lio, Tom L. Blundell

Towards Multiscale Graph-based Protein Learning with Geometric Secondary Structural Motifs

Graph neural networks (GNNs) have emerged as powerful tools for learning protein structures by capturing spatial relationships at the residue level. However, existing GNN-based methods often face challenges in learning multiscale…

Machine Learning · Computer Science 2026-02-03 Shih-Hsin Wang, Yuhao Huang, Taos Transue, Justin Baker, Jonathan Forstater, Thomas Strohmer, Bao Wang

Pre-Training Protein Bi-level Representation Through Span Mask Strategy On 3D Protein Chains

In recent years, there has been a surge in the development of 3D structure-based pre-trained protein models, representing a significant advancement over pre-trained protein language models in various downstream tasks. However, most existing…

Machine Learning · Computer Science 2024-06-04 Jiale Zhao, Wanru Zhuang, Jia Song, Yaqi Li, Shuqi Lu

Pre-Training of Deep Bidirectional Protein Sequence Representations with Structural Information

Bridging the exponentially growing gap between the numbers of unlabeled and labeled protein sequences, several studies adopted semi-supervised learning for protein sequence modeling. In these studies, models were pre-trained with a…

Biomolecules · Quantitative Biology 2021-09-20 Seonwoo Min, Seunghyun Park, Siwon Kim, Hyun-Soo Choi, Byunghan Lee, Sungroh Yoon

Predicting protein variants with equivariant graph neural networks

Pre-trained models have been successful in many protein engineering tasks. Most notably, sequence-based models have achieved state-of-the-art performance on protein fitness prediction while structure-based models have been used…

Machine Learning · Computer Science 2023-07-25 Antonia Boca, Simon Mathis

CCPL: Cross-modal Contrastive Protein Learning

Effective protein representation learning is crucial for predicting protein functions. Traditional methods often pretrain protein language models on large, unlabeled amino acid sequences, followed by finetuning on labeled data. While…

Biomolecules · Quantitative Biology 2024-09-05 Jiangbin Zheng, Stan Z. Li