Enhancing Gradient-based Discrete Sampling via Parallel Tempering

Luxu Liang; Yuhang Jia; Feng Zhou

Machine Learning | 2025-05-21 | v2 | Machine Learning Applications

Abstract

Gradient-based discrete samplers are effective at sampling from complex distributions, but they are susceptible to getting trapped in local minima, particularly in high-dimensional, multimodal discrete distributions, owing to the discontinuities inherent in these landscapes. To circumvent this issue, we combine parallel tempering, also known as replica exchange, with the discrete Langevin proposal and develop the Parallel Tempering enhanced Discrete Langevin Proposal (PTDLP), in which multiple chains are simulated at a series of temperatures. Significant energy differences between chains prompt sample swaps, which are governed by a Metropolis criterion specifically designed for discrete sampling so that detailed balance is maintained. Additionally, we introduce an automatic scheme to determine the optimal temperature schedule and the number of chains, ensuring adaptability across diverse tasks with minimal tuning. Theoretically, we establish that our algorithm converges non-asymptotically to the target energy and exhibits faster mixing than a single chain. Empirical results further demonstrate the superiority of our method in sampling from complex, multimodal discrete distributions, including synthetic problems, restricted Boltzmann machines, and deep energy-based models.
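
The swap step described in the abstract can be illustrated with the textbook replica-exchange rule. The sketch below is not the paper's implementation: it assumes each replica k targets a distribution proportional to exp(-beta_k * U(x)) and uses the standard Metropolis swap ratio, whereas the criterion the authors design for discrete sampling may differ. The function names `swap_log_ratio` and `swap_adjacent` are hypothetical, and the within-chain discrete Langevin update is omitted.

```python
import numpy as np


def swap_log_ratio(beta_a, beta_b, energy_a, energy_b):
    """Log Metropolis ratio for exchanging the states of two replicas.

    Assumes replica k targets pi_k(x) proportional to exp(-beta_k * U(x)),
    which gives the textbook ratio (beta_a - beta_b) * (U_a - U_b).
    """
    return (beta_a - beta_b) * (energy_a - energy_b)


def swap_adjacent(states, energies, betas, rng):
    """Attempt one swap between each adjacent pair of replicas, in place.

    states   : array of shape (n_replicas, dim) holding discrete states
    energies : array of shape (n_replicas,) with U(x) for each state
    betas    : inverse temperatures, one per replica
    Returns the number of accepted swaps.
    """
    n_accepted = 0
    for k in range(len(betas) - 1):
        log_r = swap_log_ratio(betas[k], betas[k + 1],
                               energies[k], energies[k + 1])
        # Accept with probability min(1, exp(log_r)).
        if np.log(rng.random()) < log_r:
            states[k], states[k + 1] = states[k + 1].copy(), states[k].copy()
            energies[k], energies[k + 1] = energies[k + 1], energies[k]
            n_accepted += 1
    return n_accepted


# Toy usage: the colder replica (beta = 1.0) holds the higher-energy state,
# so the log ratio is positive and the swap is always accepted.
rng = np.random.default_rng(0)
states = np.array([[1, 1, 1], [0, 0, 0]])
energies = np.array([5.0, 1.0])
betas = np.array([1.0, 0.5])
accepted = swap_adjacent(states, energies, betas, rng)
```

In this toy case the low-energy state moves to the cold chain, which is the mechanism that lets hot chains explore and hand good samples down the temperature ladder. A full sampler would alternate such swap sweeps with per-chain proposal steps at each temperature.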

Keywords

density estimation and sampling; Markov chain Monte Carlo; sampling algorithms

Cite

@article{arxiv.2502.19240,
  title  = {Enhancing Gradient-based Discrete Sampling via Parallel Tempering},
  author = {Luxu Liang and Yuhang Jia and Feng Zhou},
  journal= {arXiv preprint arXiv:2502.19240},
  year   = {2025}
}

Comments

25 pages, 5 figures. arXiv admin note: text overlap with arXiv:2402.17699 by other authors
