Related papers: Tokenizing 3D Molecule Structure with Quantized Sp…

200 papers

The integration of molecular and natural language representations has emerged as a focal point in molecular science, with recent advancements in Language Models (LMs) demonstrating significant potential for comprehensive modeling of both…

Biomolecules · Quantitative Biology 2025-03-19 Qizhi Pei , Rui Yan , Kaiyuan Gao , Jinhua Zhu , Lijun Wu

We consider molecule generation in 3D space using language models (LMs), which requires discrete tokenization of 3D molecular geometries. Although tokenization of molecular graphs exists, that for 3D geometries is largely unexplored. Here,…

Artificial Intelligence · Computer Science 2024-08-20 Xiner Li , Limei Wang , Youzhi Luo , Carl Edwards , Shurui Gui , Yuchao Lin , Heng Ji , Shuiwang Ji

Generative models for molecules based on sequential line notation (e.g. SMILES) or graph representation have attracted an increasing interest in the field of structure-based drug design, but they struggle to capture important 3D spatial…

Machine Learning · Computer Science 2023-12-12 Wei Feng , Lvwei Wang , Zaiyun Lin , Yanhao Zhu , Han Wang , Jianqiang Dong , Rong Bai , Huting Wang , Jielong Zhou , Wei Peng , Bo Huang , Wenbiao Zhou

Is there a foreign language describing protein sequences and structures simultaneously? Protein structures, represented by continuous 3D points, have long posed a challenge due to the contrasting modeling paradigms of discrete sequences. We…

Biomolecules · Quantitative Biology 2024-03-20 Zhangyang Gao , Cheng Tan , Jue Wang , Yufei Huang , Lirong Wu , Stan Z. Li

Language Models (LMs) have greatly influenced diverse domains. However, their inherent limitation in comprehending 3D molecular structures has considerably constrained their potential in the biomolecular domain. To bridge this gap, we focus…

Machine Learning · Computer Science 2024-03-19 Sihang Li , Zhiyuan Liu , Yanchen Luo , Xiang Wang , Xiangnan He , Kenji Kawaguchi , Tat-Seng Chua , Qi Tian

In the real world, a molecule is a 3D geometric structure. Compared to 1D SMILES sequences and 2D molecular graphs, 3D molecules represent the most informative molecular modality. Despite the rapid progress of autoregressive-based language…

Computational Engineering, Finance, and Science · Computer Science 2025-08-15 Lei Jiang , Shuzhou Sun , Biqing Qi , Yuchen Fu , Xiaohua Xu , Yuqiang Li , Dongzhan Zhou , Tianfan Fu

Significant interest has recently arisen in leveraging sequence-based large language models (LLMs) for drug design. However, most current applications of LLMs in drug discovery lack the ability to comprehend three-dimensional (3D)…

Computational methods that operate on three-dimensional molecular structure have the potential to solve important questions in biology and chemistry. In particular, deep neural networks have gained significant attention, but their…

Molecular representation learning has become a central approach in AI-driven drug discovery, yet existing molecular tokenizations such as SMILES remain largely syntactic and do not naturally align with chemically meaningful substructures.…

Machine Learning · Computer Science 2026-05-19 Takayuki Kimura

Effectively representing 3D scenes for Multimodal Large Language Models (MLLMs) is crucial yet challenging. Existing approaches commonly only rely on 2D image features and use varied tokenization approaches. This work presents a rigorous…

Computer Vision and Pattern Recognition · Computer Science 2025-06-09 Hugues Thomas , Chen Chen , Jian Zhang

Generative modeling of three-dimensional (3D) molecules is a fundamental yet challenging problem in drug discovery and materials science. Existing approaches typically represent molecules as 3D graphs and co-generate discrete atom types…

Machine Learning · Statistics 2026-03-16 Yuchen Hua , Xingang Peng , Jianzhu Ma , Muhan Zhang

Molecular representation pretraining is critical in various applications for drug and material discovery due to the limited number of labeled molecules, and most existing work focuses on pretraining on 2D molecular graphs. However, the…

Machine Learning · Computer Science 2023-03-02 Shengchao Liu , Hongyu Guo , Jian Tang

Generating precise 3D molecular geometries is crucial for drug discovery and material science. While prior efforts leverage 1D representations like SELFIES to ensure molecular validity, they fail to fully exploit the rich chemical knowledge…

Machine Learning · Computer Science 2025-12-15 Zhanpeng Chen , Weihao Gao , Shunyu Wang , Yanan Zhu , Hong Meng , Yuexian Zou

The integration of deep learning, particularly AI-Generated Content, with high-quality data derived from ab initio calculations has emerged as a promising avenue for transforming the landscape of scientific research. However, the challenge…

Machine Learning · Computer Science 2024-12-11 Kaiwei Zhang , Yange Lin , Guangcheng Wu , Yuxiang Ren , Xuecang Zhang , Bo Wang , Xiaoyu Zhang , Weitao Du

With the recent advances in machine learning for quantum chemistry, it is now possible to predict the chemical properties of compounds and to generate novel molecules. Existing generative models mostly use a string- or graph-based…

Biomolecules · Quantitative Biology 2020-10-14 Vitali Nesterov , Mario Wieser , Volker Roth

Designing molecules with desirable physicochemical properties and functionalities is a long-standing challenge in chemistry, material science, and drug discovery. Recently, machine learning-based generative models have emerged as promising…

Biomolecules · Quantitative Biology 2023-04-26 Zaixi Zhang , Qi Liu , Chee-Kong Lee , Chang-Yu Hsieh , Enhong Chen

Three-dimensional molecular structure generation is typically performed at the level of individual atoms, yet molecular graph generation techniques often consider fragments as their structural units. Building on the advances in frame-based…

Machine Learning · Computer Science 2026-01-26 Roman Poletukhin , Marcel Kollovieh , Eike Eberhard , Stephan Günnemann

Structure-based drug design aims at generating high-affinity ligands with prior knowledge of 3D target structures. Existing methods either use conditional generative models to learn the distribution of 3D ligands given target binding sites,…

Biomolecules · Quantitative Biology 2024-03-18 Yuwei Yang , Siqi Ouyang , Xueyu Hu , Mingyue Zheng , Hao Zhou , Lei Li

Geometric representation-conditioned molecule generation provides an effective paradigm that decouples molecule representation modeling from structure generation. By decoupling molecule generation into two stages, first generating a…

Machine Learning · Computer Science 2026-05-11 Shaoheng Yan , Zian Li , Cai Zhou , Qiaojing Huang , Kai Liu , Muhan Zhang

Structure-based drug design (SBDD) is crucial for developing specific and effective therapeutics against protein targets but remains challenging due to complex protein-ligand interactions and vast chemical space. Although language models…

Biomolecules · Quantitative Biology 2024-08-20 Cong Fu , Xiner Li , Blake Olson , Heng Ji , Shuiwang Ji