Signal Processing | 2021-10-13 | v1 | Artificial Intelligence | Computer Vision and Pattern Recognition | Graphics | Machine Learning | Sound | Audio and Speech Processing | Image and Video Processing
Generative diffusion processes are an emerging and effective tool for image and speech generation. In existing methods, the underlying noise distribution of the diffusion process is Gaussian. However, fitting distributions with more degrees of freedom could improve the performance of such generative models. In this work, we investigate other types of noise distribution for the diffusion process. Specifically, we introduce the Denoising Diffusion Gamma Model (DDGM) and show that noise from the Gamma distribution provides improved results for image and speech generation. Our approach preserves the ability to efficiently sample states in the training diffusion process while using Gamma noise.
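The efficient-sampling property mentioned in the abstract can be illustrated with a minimal NumPy sketch. It rests on a standard fact: independent Gamma variables that share a scale parameter sum to a Gamma variable whose shape is the sum of the shapes. The parameterization below (a base scale `theta0`, per-step scale `sqrt(alpha_bar_t) * theta0`, and per-step shape `beta_t / (alpha_bar_t * theta0**2)`) is an assumption chosen so that each step adds zero-mean noise of variance `beta_t`; it is not confirmed by the text above, and all function names are hypothetical.

```python
import numpy as np

def make_schedule(betas, theta0):
    """Precompute per-step Gamma parameters (assumed parameterization)."""
    betas = np.asarray(betas, dtype=float)
    alphas = 1.0 - betas
    alpha_bar = np.cumprod(alphas)
    theta = np.sqrt(alpha_bar) * theta0     # per-step scale
    k = betas / (alpha_bar * theta0 ** 2)   # per-step shape, so Var = beta_t
    k_bar = np.cumsum(k)                    # accumulated shape
    return alphas, alpha_bar, theta, k, k_bar

def sample_iterative(x0, t, sched, rng):
    """Run t forward steps one at a time, adding centered Gamma noise."""
    alphas, _, theta, k, _ = sched
    x = np.array(x0, dtype=float)
    for i in range(t):
        g = rng.gamma(shape=k[i], scale=theta[i], size=x.shape)
        x = np.sqrt(alphas[i]) * x + (g - k[i] * theta[i])
    return x

def sample_closed_form(x0, t, sched, rng):
    """Draw x_t directly from x_0 with a single Gamma sample."""
    _, alpha_bar, theta, _, k_bar = sched
    x0 = np.asarray(x0, dtype=float)
    g = rng.gamma(shape=k_bar[t - 1], scale=theta[t - 1], size=x0.shape)
    return np.sqrt(alpha_bar[t - 1]) * x0 + (g - k_bar[t - 1] * theta[t - 1])
```

Under these assumptions, both routines produce samples with mean `sqrt(alpha_bar_t) * x0` and variance `1 - alpha_bar_t`, but the closed form needs one draw instead of `t`, which is what makes training on randomly chosen timesteps cheap.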
@article{arxiv.2110.05948,
title = {Denoising Diffusion Gamma Models},
author = {Eliya Nachmani and Robin San Roman and Lior Wolf},
journal = {arXiv preprint arXiv:2110.05948},
year = {2021}
}
Comments
arXiv admin note: substantial text overlap with arXiv:2106.07582