Diffusion-Based Symbolic Regression

Machine Learning 2025-06-02 v1

Abstract

Diffusion has emerged as a powerful framework for generative modeling, achieving remarkable success in applications such as image and audio synthesis. Inspired by this progress, we propose a novel diffusion-based approach for symbolic regression. We construct a random mask-based diffusion and denoising process to generate diverse, high-quality equations. We integrate this generative process with a token-wise Group Relative Policy Optimization (GRPO) method to conduct efficient reinforcement learning on the given measurement dataset. In addition, we introduce a long short-term risk-seeking policy to expand the pool of top-performing candidates, further enhancing performance. Extensive experiments and ablation studies demonstrate the effectiveness of our approach.

Cite

@article{arxiv.2505.24776,
  title  = {Diffusion-Based Symbolic Regression},
  author = {Zachary Bastiani and Robert M. Kirby and Jacob Hochhalter and Shandian Zhe},
  journal= {arXiv preprint arXiv:2505.24776},
  year   = {2025}
}