Enhancing Gradient-based Discrete Sampling via Parallel Tempering
Abstract
While gradient-based discrete samplers are effective in sampling from complex distributions, they are susceptible to getting trapped in local minima, particularly in high-dimensional, multimodal discrete distributions, owing to the discontinuities inherent in these landscapes. To circumvent this issue, we combine parallel tempering, also known as replica exchange, with the discrete Langevin proposal and develop the Parallel Tempering enhanced Discrete Langevin Proposal (PTDLP), in which multiple chains are simulated at a series of temperatures. Significant energy differences between chains prompt sample swaps, which are governed by a Metropolis criterion specifically designed for discrete sampling to ensure that detailed balance is maintained. Additionally, we introduce an automatic scheme to determine the optimal temperature schedule and the number of chains, ensuring adaptability across diverse tasks with minimal tuning. Theoretically, we establish that our algorithm converges non-asymptotically to the target energy and exhibits faster mixing compared to a single chain. Empirical results further emphasize the superiority of our method in sampling from complex, multimodal discrete distributions, including synthetic problems, restricted Boltzmann machines, and deep energy-based models.
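To make the swap step concrete, the following is a minimal sketch of the textbook replica-exchange acceptance rule, not the discrete-specific criterion developed in the paper. It assumes each chain targets a tempered distribution proportional to exp(-beta * U(x)), where beta is an inverse temperature and U is the energy; the function names `swap_log_ratio`, `swap_accept_prob`, and `maybe_swap` are hypothetical and chosen only for illustration.

```python
import math
import random


def swap_log_ratio(beta_i, beta_j, energy_i, energy_j):
    """Log Metropolis ratio for exchanging the states of chains i and j.

    Assumes chain k targets pi_k(x) proportional to exp(-beta_k * U(x)).
    The ratio pi_i(x_j) * pi_j(x_i) / (pi_i(x_i) * pi_j(x_j)) simplifies to
    exp((beta_i - beta_j) * (U(x_i) - U(x_j))).
    """
    return (beta_i - beta_j) * (energy_i - energy_j)


def swap_accept_prob(beta_i, beta_j, energy_i, energy_j):
    """Acceptance probability min(1, ratio), computed in log space."""
    return math.exp(min(0.0, swap_log_ratio(beta_i, beta_j, energy_i, energy_j)))


def maybe_swap(states, energies, betas, i, j, rng):
    """Attempt to exchange the states of chains i and j in place.

    Returns True if the swap was accepted. The temperatures stay attached
    to the chain slots; only states and their energies move.
    """
    p = swap_accept_prob(betas[i], betas[j], energies[i], energies[j])
    if rng.random() < p:
        states[i], states[j] = states[j], states[i]
        energies[i], energies[j] = energies[j], energies[i]
        return True
    return False


if __name__ == "__main__":
    rng = random.Random(0)
    betas = [1.0, 0.5]           # chain 0 is cold, chain 1 is hot
    states = [[1, 0, 1], [0, 0, 1]]
    energies = [3.0, 1.0]        # the hot chain has found a lower-energy state
    accepted = maybe_swap(states, energies, betas, 0, 1, rng)
    print(accepted, states, energies)
```

Under this rule, a swap that moves the lower-energy state to the colder chain is always accepted, while the reverse swap is accepted with probability exp((beta_i - beta_j) * (U(x_i) - U(x_j))), which is what lets low-temperature chains escape local modes through states discovered at high temperature.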
Cite
@article{arxiv.2502.19240,
title = {Enhancing Gradient-based Discrete Sampling via Parallel Tempering},
author = {Luxu Liang and Yuhang Jia and Feng Zhou},
journal = {arXiv preprint arXiv:2502.19240},
year = {2025}
}
Comments
25 pages, 5 figures. arXiv admin note: text overlap with arXiv:2402.17699 by other authors