A First-order Generative Bilevel Optimization Framework for Diffusion Models

Quan Xiao; Hui Yuan; A F M Saif; Gaowen Liu; Ramana Kompella; Mengdi Wang; Tianyi Chen

Machine Learning 2025-08-06 v2 Optimization and Control Machine Learning

Abstract

Diffusion models, which iteratively denoise data samples to synthesize high-quality outputs, have achieved empirical success across domains. However, optimizing these models for downstream tasks often involves nested bilevel structures, such as tuning hyperparameters for fine-tuning tasks or noise schedules in training dynamics, where traditional bilevel methods fail due to the infinite-dimensional probability space and prohibitive sampling costs. We formalize this challenge as a generative bilevel optimization problem and address two key scenarios: (1) fine-tuning pre-trained models via an inference-only lower-level solver paired with a sample-efficient gradient estimator for the upper level, and (2) training diffusion models from scratch with noise schedule optimization by reparameterizing the lower-level problem and designing a computationally tractable gradient estimator. Our first-order bilevel framework overcomes the incompatibility of conventional bilevel methods with diffusion processes, offering theoretical grounding and computational practicality. Experiments demonstrate that our method outperforms existing fine-tuning and hyperparameter search baselines.
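
To make the nested structure concrete, the following is a minimal sketch of a generic penalty-based first-order bilevel update on a toy scalar problem. It is not the estimator proposed in the paper and involves no diffusion model; the objectives `f` and `g`, the penalty weight `gamma`, and all function names are assumptions chosen for illustration. The upper level minimizes f(x, y) = 0.5(y - 1)^2 + 0.05x^2 over a hyperparameter x, and the lower level defines y*(x) as the minimizer of g(x, y) = 0.5(y - x)^2, so the exact solution is x = 1/1.1. The hypergradient is approximated using gradients only, with no Hessians or differentiation through the inner solver.

```python
# Toy bilevel problem (illustrative assumption, not taken from the paper):
#   upper level: f(x, y) = 0.5 * (y - 1)^2 + 0.05 * x^2
#   lower level: g(x, y) = 0.5 * (y - x)^2, so y*(x) = x

def grad_f_x(x, y):
    return 0.1 * x

def grad_f_y(x, y):
    return y - 1.0

def grad_g_x(x, y):
    return x - y

def grad_g_y(x, y):
    return y - x

def inner_descent(grad, init, step, iters):
    """Plain gradient descent on a scalar variable."""
    v = init
    for _ in range(iters):
        v -= step * grad(v)
    return v

def penalty_hypergradient(x, gamma=100.0, iters=50):
    """First-order estimate of dF/dx for F(x) = f(x, y*(x)).

    Uses the value-function penalty f + gamma * (g(x, y) - min_z g(x, z)).
    """
    # z approximates the lower-level solution argmin_z g(x, z).
    z = inner_descent(lambda v: grad_g_y(x, v), 0.0, 0.5, iters)
    # y approximates argmin_y f(x, y) + gamma * g(x, y).
    y = inner_descent(
        lambda v: grad_f_y(x, v) + gamma * grad_g_y(x, v),
        z,
        1.0 / (1.0 + gamma),
        iters,
    )
    # Only first-order partial derivatives appear in the estimate.
    return grad_f_x(x, y) + gamma * (grad_g_x(x, y) - grad_g_x(x, z))

def solve(x0=0.0, lr=0.5, outer_iters=200):
    """Gradient descent on the upper-level variable x."""
    x = x0
    for _ in range(outer_iters):
        x -= lr * penalty_hypergradient(x)
    return x

if __name__ == "__main__":
    print(solve())  # close to the exact minimizer 1 / 1.1
```

In this sketch a finite `gamma` introduces a small bias in the estimated hypergradient, which shrinks as `gamma` grows; in the generative setting described in the abstract, the inner solves would be replaced by sampling or training procedures rather than scalar gradient descent.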

Keywords

diffusion model, hyperparameter optimization, flow matching

Cite

@article{arxiv.2502.08808,
  title  = {A First-order Generative Bilevel Optimization Framework for Diffusion Models},
  author = {Quan Xiao and Hui Yuan and A F M Saif and Gaowen Liu and Ramana Kompella and Mengdi Wang and Tianyi Chen},
  journal= {arXiv preprint arXiv:2502.08808},
  year   = {2025}
}

Comments

Camera-ready version: added experiments using the HPSv2 reward, improved notation consistency for the diffusion model, and added related work.
