Related papers: Coder as Editor: Code-driven Interpretable Molecul…

MoleCode unlocks structural intelligence in large language models

Molecules are graphs, but large language models (LLMs) are usually asked to reason about them through linear strings. The most popular molecular representation, SMILES, compresses atoms, bonds, branches and rings into a compact sequence in…

Biomolecules · Quantitative Biology 2026-05-19 Zhiyuan Yan, Chen Liu, Boxuan Zhao, Kaiqing Lin, Jixiang Zhao, Yimi Wang, Liuzhenghao Lv, Hao Li, Shanzhuo Zhang, Li Yuan, Fanyang Mo

Deciphering Scientific Reasoning Steps from Outcome Data for Molecule Optimization

Emerging reasoning models hold promise for automating scientific discovery. However, their training is hindered by a critical supervision gap: experimental outcomes are abundant, whereas intermediate reasoning steps are rarely documented at…

Biomolecules · Quantitative Biology 2026-03-24 Zequn Liu, Kehan Wu, Shufang Xie, Zekun Guo, Wei Zhang, Tao Qin, Renhe Liu, Yingce Xia

Sample Efficiency Matters: A Benchmark for Practical Molecular Optimization

Molecular optimization is a fundamental goal in the chemical sciences and is of central interest to drug and material design. In recent years, significant progress has been made in solving challenging problems across various aspects of…

Computational Engineering, Finance, and Science · Computer Science 2022-10-11 Wenhao Gao, Tianfan Fu, Jimeng Sun, Connor W. Coley

ECCO: Evidence-Driven Causal Reasoning for Compiler Optimization

Compiler auto-tuning faces a dichotomy between traditional black-box search methods, which lack semantic guidance, and recent Large Language Model (LLM) approaches, which often suffer from superficial pattern matching and causal opacity. In…

Machine Learning · Computer Science 2026-02-03 Haolin Pan, Lianghong Huang, Jinyuan Dong, Mingjie Xing, Yanjun Wu

DrugR: Optimizing Molecular Drugs through LLM-based Explicit Reasoning

Molecule generation and optimization is a fundamental task in the chemical domain. The rapid development of intelligent tools, especially large language models (LLMs) with powerful knowledge reserves and interactive capabilities, has provided…

Machine Learning · Computer Science 2026-02-10 Haoran Liu, Zheni Zeng, Yukun Yan, Yuxuan Chen, Yunduo Xiao

NEMO: Execution-Aware Optimization Modeling via Autonomous Coding Agents

In this paper, we present NEMO, a system that translates Natural-language descriptions of decision problems into formal Executable Mathematical Optimization implementations, operating collaboratively with users or autonomously. Existing…

Artificial Intelligence · Computer Science 2026-01-30 Yang Song, Anoushka Vyas, Zirui Wei, Sina Khoshfetrat Pakazad, Henrik Ohlsson, Graham Neubig

Language Models as Compilers: Simulating Pseudocode Execution Improves Algorithmic Reasoning in Language Models

Algorithmic reasoning refers to the ability to understand the complex patterns behind the problem and decompose them into a sequence of reasoning steps towards the solution. Such nature of algorithmic reasoning makes it a challenge for…

Computation and Language · Computer Science 2024-04-04 Hyungjoo Chae, Yeonghyeon Kim, Seungone Kim, Kai Tzu-iunn Ong, Beong-woo Kwak, Moohyeon Kim, Seonghwan Kim, Taeyoon Kwon, Jiwan Chung, Youngjae Yu, Jinyoung Yeo

MT-Mol: Multi Agent System with Tool-based Reasoning for Molecular Optimization

Large language models (LLMs) have large potential for molecular optimization, as they can gather external chemistry tools and enable collaborative interactions to iteratively refine molecular candidates. However, this potential remains…

Artificial Intelligence · Computer Science 2025-05-28 Hyomin Kim, Yunhui Jang, Sungsoo Ahn

mCLM: A Modular Chemical Language Model that Generates Functional and Makeable Molecules

Despite their ability to understand chemical knowledge, large language models (LLMs) remain limited in their capacity to propose novel molecules with desired functions (e.g., drug-like properties). In addition, the molecules that LLMs…

Artificial Intelligence · Computer Science 2026-03-03 Carl Edwards, Chi Han, Gawon Lee, Thao Nguyen, Sara Szymkuć, Chetan Kumar Prasad, Bowen Jin, Jiawei Han, Ying Diao, Ge Liu, Hao Peng, Bartosz A. Grzybowski, Martin D. Burke, Heng Ji

MolX: Enhancing Large Language Models for Molecular Understanding With A Multi-Modal Extension

Large Language Models (LLMs) with their strong task-handling capabilities have shown remarkable advancements across a spectrum of fields, moving beyond natural language understanding. However, their proficiency within the chemistry domain…

Computer Vision and Pattern Recognition · Computer Science 2026-02-02 Khiem Le, Zhichun Guo, Kaiwen Dong, Xiaobao Huang, Bozhao Nan, Roshni Iyer, Xiangliang Zhang, Olaf Wiest, Wei Wang, Ting Hua, Nitesh V. Chawla

SEISMO: Increasing Sample Efficiency in Molecular Optimization with a Trajectory-Aware LLM Agent

Optimizing the structure of molecules to achieve desired properties is a central bottleneck across the chemical sciences, particularly in the pharmaceutical industry where it underlies the discovery of new drugs. Since molecular property…

Artificial Intelligence · Computer Science 2026-02-19 Fabian P. Krüger, Andrea Hunklinger, Adrian Wolny, Tim J. Adler, Igor Tetko, Santiago David Villalba

ECCO: Can We Improve Model-Generated Code Efficiency Without Sacrificing Functional Correctness?

Although large language models (LLMs) have been largely successful in generating functionally correct programs, conditioning models to produce efficient solutions while ensuring correctness remains a challenge. Further, unreliability in…

Computation and Language · Computer Science 2024-10-11 Siddhant Waghjale, Vishruth Veerendranath, Zora Zhiruo Wang, Daniel Fried

LEGO-Compiler: Enhancing Neural Compilation Through Translation Composability

Large language models (LLMs) have the potential to revolutionize how we design and implement compilers and code translation tools. However, existing LLMs struggle to handle long and complex programs. We introduce LEGO-Compiler, a novel…

Programming Languages · Computer Science 2025-05-28 Shuoming Zhang, Jiacheng Zhao, Chunwei Xia, Zheng Wang, Yunji Chen, Xiaobing Feng, Huimin Cui

DrugImproverGPT: A Large Language Model for Drug Optimization with Fine-Tuning via Structured Policy Optimization

Finetuning a Large Language Model (LLM) is crucial for generating results towards specific objectives. This research delves into the realm of drug optimization and introduces a novel reinforcement learning algorithm to finetune a drug…

Machine Learning · Computer Science 2025-02-12 Xuefeng Liu, Songhao Jiang, Siyu Chen, Zhuoran Yang, Yuxin Chen, Ian Foster, Rick Stevens

Beyond Chemical QA: Evaluating LLM's Chemical Reasoning with Modular Chemical Operations

While large language models (LLMs) with Chain-of-Thought (CoT) reasoning excel in mathematics and coding, their potential for systematic reasoning in chemistry, a domain demanding rigorous structural analysis for real-world tasks like drug…

Artificial Intelligence · Computer Science 2026-01-08 Hao Li, He Cao, Bin Feng, Yanjun Shao, Xiangru Tang, Zhiyuan Yan, Li Yuan, Yonghong Tian, Yu Li

MolReasoner: Toward Effective and Interpretable Reasoning for Molecular LLMs

Large Language Models (LLMs) have shown impressive performance across various domains, but their ability to perform molecular reasoning remains underexplored. Existing methods mostly rely on general-purpose prompting, which lacks…

Machine Learning · Computer Science 2026-02-24 Guojiang Zhao, Zixiang Lu, Yutang Ge, Sihang Li, Zheng Cheng, Haitao Lin, Lirong Wu, Hanchen Xia, Hengxing Cai, Wentao Guo, Hongshuai Wang, Mingjun Xu, Siyu Zhu, Guolin Ke, Linfeng Zhang, Zhifeng Gao

MolLingo: Molecule-Native Representations for LLM-Powered Scientific Agents

We present MolLingo, a multi-agent system that emulates the reasoning process of a chemist to automate molecular design. Existing LLM-based approaches either operate as standalone generative models without access to external tools or lack…

Artificial Intelligence · Computer Science 2026-05-28 Thao Nguyen, Heng Ji

From Stochastic Answers to Verifiable Reasoning: Interpretable Decision-Making with LLM-Generated Code

Large language models (LLMs) are increasingly used for high-stakes decision-making, yet existing approaches struggle to reconcile scalability, interpretability, and reproducibility. Black-box models obscure their reasoning, while recent…

Machine Learning · Computer Science 2026-03-17 Anirudh Jaidev Mahesh, Ben Griffin, Fuat Alican, Joseph Ternasky, Zakari Salifu, Kelvin Amoaba, Yagiz Ihlamur, Aaron Ontoyin Yin, Aikins Laryea, Afriyie Samuel, Yigit Ihlamur

OncoReason: Structuring Clinical Reasoning in LLMs for Robust and Interpretable Survival Prediction

Predicting cancer treatment outcomes requires models that are both accurate and interpretable, particularly in the presence of heterogeneous clinical data. While large language models (LLMs) have shown strong performance in biomedical NLP,…

Computation and Language · Computer Science 2025-10-21 Raghu Vamshi Hemadri, Geetha Krishna Guruju, Kristi Topollai, Anna Ewa Choromanska

Efficient Evolutionary Search Over Chemical Space with Large Language Models

Molecular discovery, when formulated as an optimization problem, presents significant computational challenges because optimization objectives can be non-differentiable. Evolutionary Algorithms (EAs), often used to optimize black-box…

Neural and Evolutionary Computing · Computer Science 2025-03-10 Haorui Wang, Marta Skreta, Cher-Tian Ser, Wenhao Gao, Lingkai Kong, Felix Strieth-Kalthoff, Chenru Duan, Yuchen Zhuang, Yue Yu, Yanqiao Zhu, Yuanqi Du, Alán Aspuru-Guzik, Kirill Neklyudov, Chao Zhang