Related papers: Small Molecule Optimization with Large Language Mo…

mCLM: A Modular Chemical Language Model that Generates Functional and Makeable Molecules

Despite their ability to understand chemical knowledge, large language models (LLMs) remain limited in their capacity to propose novel molecules with desired functions (e.g., drug-like properties). In addition, the molecules that LLMs…

Artificial Intelligence · Computer Science 2026-03-03 Carl Edwards, Chi Han, Gawon Lee, Thao Nguyen, Sara Szymkuć, Chetan Kumar Prasad, Bowen Jin, Jiawei Han, Ying Diao, Ge Liu, Hao Peng, Bartosz A. Grzybowski, Martin D. Burke, Heng Ji

ChemMLLM: Chemical Multimodal Large Language Model

Multimodal large language models (MLLMs) have made impressive progress in many applications in recent years. However, chemical MLLMs that can handle cross-modal understanding and generation remain underexplored. To fill this gap, we propose…

Machine Learning · Computer Science 2025-08-05 Qian Tan, Dongzhan Zhou, Peng Xia, Wanhao Liu, Wanli Ouyang, Lei Bai, Yuqiang Li, Tianfan Fu

DrugLLM: Open Large Language Model for Few-shot Molecule Generation

Large Language Models (LLMs) have made great strides in areas such as language processing and computer vision. Despite the emergence of diverse techniques to improve few-shot learning capacity, current LLMs fall short in handling the…

Biomolecules · Quantitative Biology 2024-05-14 Xianggen Liu, Yan Guo, Haoran Li, Jin Liu, Shudong Huang, Bowen Ke, Jiancheng Lv

A Survey of Large Language Models for Text-Guided Molecular Discovery: from Molecule Generation to Optimization

Large language models (LLMs) are introducing a paradigm shift in molecular discovery by enabling text-guided interaction with chemical spaces through natural language, symbolic notations, with emerging extensions to incorporate multi-modal…

Machine Learning · Computer Science 2025-05-23 Ziqing Wang, Kexin Zhang, Zihan Zhao, Yibo Wen, Abhishek Pandey, Han Liu, Kaize Ding

LICO: Large Language Models for In-Context Molecular Optimization

Optimizing black-box functions is a fundamental problem in science and engineering. To solve this problem, many approaches learn a surrogate function that estimates the underlying objective from limited historical evaluations. Large…

Machine Learning · Computer Science 2025-10-23 Tung Nguyen, Aditya Grover

GeLLMO: Generalizing Large Language Models for Multi-property Molecule Optimization

Despite recent advancements, most computational methods for molecule optimization are constrained to single- or double-property optimization tasks and suffer from poor scalability and generalizability to novel optimization tasks. Meanwhile,…

Machine Learning · Computer Science 2025-05-28 Vishal Dey, Xiao Hu, Xia Ning

Chemical Language Model Linker: blending text and molecules with modular adapters

The development of large language models and multi-modal models has enabled the appealing idea of generating novel molecules from text descriptions. Generative modeling would shift the paradigm from relying on large-scale chemical screening…

Machine Learning · Computer Science 2025-08-25 Yifan Deng, Spencer S. Ericksen, Anthony Gitter

Large Language Models for Controllable Multi-property Multi-objective Molecule Optimization

In real-world drug design, molecule optimization requires selectively improving multiple molecular properties up to pharmaceutically relevant levels, while maintaining others that already meet such criteria. However, existing computational…

Machine Learning · Computer Science 2025-06-02 Vishal Dey, Xiao Hu, Xia Ning

ChatMol: A Versatile Molecule Designer Based on the Numerically Enhanced Large Language Model

Goal-oriented de novo molecule design, namely generating molecules with specific property or substructure constraints, is a crucial yet challenging task in drug discovery. Existing methods, such as Bayesian optimization and reinforcement…

Computational Engineering, Finance, and Science · Computer Science 2025-02-28 Chuanliu Fan, Ziqiang Cao, Zicheng Ma, Nan Yu, Yimin Peng, Jun Zhang, Yiqin Gao, Guohong Fu

Instruction Multi-Constraint Molecular Generation Using a Teacher-Student Large Language Model

While various models and computational tools have been proposed for structure and property analysis of molecules, generating molecules that conform to all desired structures and properties remains a challenge. Here, we introduce a…

Computation and Language · Computer Science 2024-10-11 Peng Zhou, Jianmin Wang, Chunyan Li, Zixu Wang, Yiping Liu, Siqi Sun, Jianxin Lin, Leyi Wei, Xibao Cai, Houtim Lai, Wei Liu, Longyue Wang, Yuansheng Liu, Xiangxiang Zeng

DrugAssist: A Large Language Model for Molecule Optimization

Recently, the impressive performance of large language models (LLMs) on a wide range of tasks has attracted an increasing number of attempts to apply LLMs in drug discovery. However, molecule optimization, a critical task in the drug…

Quantitative Methods · Quantitative Biology 2024-01-22 Geyan Ye, Xibao Cai, Houtim Lai, Xing Wang, Junhong Huang, Longyue Wang, Wei Liu, Xiangxiang Zeng

MolLIBRA: Genetic Molecular Optimization with Multi-Fingerprint Surrogates and Text-Molecule Aligned Critic

We study sample-efficient molecular optimization under a limited budget of oracle evaluations. We propose MolLIBRA (MultimOdaLity and Language Integrated Bayesian and evolutionaRy optimizAtion), a genetic algorithm based framework that…

Neural and Evolutionary Computing · Computer Science 2026-02-10 Masahi Okada, Kazuki Sakai, Hiroaki Yoshida, Masaki Okoshi, Tadahiro Taniguchi

ControllableGPT: A Ground-Up Designed Controllable GPT for Molecule Optimization

Large Language Models (LLMs) employ three popular training approaches: Masked Language Models (MLM), Causal Language Models (CLM), and Sequence-to-Sequence Models (seq2seq). However, each approach has its strengths and limitations, and…

Machine Learning · Computer Science 2025-02-18 Xuefeng Liu, Songhao Jiang, Bo Li, Rick Stevens

Controlled Molecule Generator for Optimizing Multiple Chemical Properties

Generating a novel and optimized molecule with desired chemical properties is an essential part of the drug discovery process. Failure to meet one of the required properties can frequently lead to failure in a clinical test which is costly.…

Machine Learning · Computer Science 2020-10-28 Bonggun Shin, Sungsoo Park, JinYeong Bak, Joyce C. Ho

SMolLM: Small Language Models Learn Small Molecular Grammar

Language models for molecular design have scaled to hundreds of millions of parameters, yet how they learn chemical grammar is poorly understood. We train SMolLM, a 53K-parameter weight-shared transformer, to generate novel SMILES with 95%…

Machine Learning · Computer Science 2026-05-29 Akhil Jindal, Harang Ju

Emerging Opportunities of Using Large Language Models for Translation Between Drug Molecules and Indications

A drug molecule is a substance that changes the organism's mental or physical state. Every approved drug has an indication, which refers to the therapeutic use of that drug for treating a particular medical condition. While the Large…

Artificial Intelligence · Computer Science 2024-02-20 David Oniani, Jordan Hilsman, Chengxi Zang, Junmei Wang, Lianjin Cai, Jan Zawala, Yanshan Wang

SEISMO: Increasing Sample Efficiency in Molecular Optimization with a Trajectory-Aware LLM Agent

Optimizing the structure of molecules to achieve desired properties is a central bottleneck across the chemical sciences, particularly in the pharmaceutical industry where it underlies the discovery of new drugs. Since molecular property…

Artificial Intelligence · Computer Science 2026-02-19 Fabian P. Krüger, Andrea Hunklinger, Adrian Wolny, Tim J. Adler, Igor Tetko, Santiago David Villalba

MT-Mol: Multi Agent System with Tool-based Reasoning for Molecular Optimization

Large language models (LLMs) have large potential for molecular optimization, as they can gather external chemistry tools and enable collaborative interactions to iteratively refine molecular candidates. However, this potential remains…

Artificial Intelligence · Computer Science 2025-05-28 Hyomin Kim, Yunhui Jang, Sungsoo Ahn

Evaluating the Progression of Large Language Model Capabilities for Small-Molecule Drug Design

Large Language Models (LLMs) have the potential to accelerate small molecule drug design due to their ability to reason about information from diverse sources and formats. However, their practical utility remains unclear due to the lack of…

Machine Learning · Computer Science 2026-04-20 Shriram Chennakesavalu, Kirill Shmilovich, Hayley Weir, Colin Grambow, John Bradshaw, Patricia Suriana, Chen Cheng, Kangway Chuang

Design Proteins Using Large Language Models: Enhancements and Comparative Analyses

Pre-trained LLMs have demonstrated substantial capabilities across a range of conventional natural language processing (NLP) tasks, such as summarization and entity recognition. In this paper, we explore the application of LLMs in the…

Quantitative Methods · Quantitative Biology 2024-08-14 Kamyar Zeinalipour, Neda Jamshidi, Monica Bianchini, Marco Maggini, Marco Gori