Related papers: Conditional WaveGAN

Variational Conditional GAN for Fine-grained Controllable Image Generation

In this paper, we propose a novel variational generator framework for conditional GANs to catch semantic details for improving the generation quality and diversity. Traditional generators in conditional GANs simply concatenate the…

Computer Vision and Pattern Recognition · Computer Science 2019-09-24 Mingqi Hu, Deyu Zhou, Yulan He

AudioToken: Adaptation of Text-Conditioned Diffusion Models for Audio-to-Image Generation

In recent years, image generation has shown a great leap in performance, where diffusion models play a central role. Although generating high-quality images, such models are mainly conditioned on textual descriptions. This begs the…

Sound · Computer Science 2023-05-23 Guy Yariv, Itai Gat, Lior Wolf, Yossi Adi, Idan Schwartz

AudioGen: Textually Guided Audio Generation

We tackle the problem of generating audio samples conditioned on descriptive text captions. In this work, we propose AudioGen, an auto-regressive generative model that generates audio samples conditioned on text inputs. AudioGen operates…

Sound · Computer Science 2023-03-07 Felix Kreuk, Gabriel Synnaeve, Adam Polyak, Uriel Singer, Alexandre Défossez, Jade Copet, Devi Parikh, Yaniv Taigman, Yossi Adi

Adversarial Audio Synthesis

Audio signals are sampled at high temporal resolutions, and learning to synthesize audio requires capturing structure across a range of timescales. Generative adversarial networks (GANs) have seen wide success at generating images that are…

Sound · Computer Science 2019-02-12 Chris Donahue, Julian McAuley, Miller Puckette

Conditional Spoken Digit Generation with StyleGAN

This paper adapts a StyleGAN model for speech generation with minimal or no conditioning on text. StyleGAN is a multi-scale convolutional GAN capable of hierarchically capturing data structure and latent variation on multiple spatial (or…

Audio and Speech Processing · Electrical Eng. & Systems 2020-09-17 Kasperi Palkama, Lauri Juvela, Alexander Ilin

Conditional Generative Modeling via Learning the Latent Space

Although deep learning has achieved appealing results on several machine learning tasks, most of the models are deterministic at inference, limiting their application to single-modal settings. We propose a novel general-purpose framework…

Machine Learning · Computer Science 2020-10-12 Sameera Ramasinghe, Kanchana Ranasinghe, Salman Khan, Nick Barnes, Stephen Gould

GANs Conditioning Methods: A Survey

In recent years, Generative Adversarial Networks (GANs) have seen significant advancements, leading to their widespread adoption across various fields. The original GAN architecture enables the generation of images without any specific…

Machine Learning · Computer Science 2024-09-04 Anis Bourou, Valérie Mezger, Auguste Genovesio

cMelGAN: An Efficient Conditional Generative Model Based on Mel Spectrograms

Analysing music in the field of machine learning is a very difficult problem with numerous constraints to consider. The nature of audio data, with its very high dimensionality and widely varying scales of structure, is one of the primary…

Sound · Computer Science 2022-05-17 Tracy Qian, Jackson Kaunismaa, Tony Chung

Assisted Sound Sample Generation with Musical Conditioning in Adversarial Auto-Encoders

Generative models have thrived in computer vision, enabling unprecedented image processes. Yet the results in audio remain less advanced. Our project targets real-time sound synthesis from a reduced set of high-level parameters, including…

Sound · Computer Science 2019-06-25 Adrien Bitton, Philippe Esling, Antoine Caillon, Martin Fouilleul

On Conditioning the Input Noise for Controlled Image Generation with Diffusion Models

Conditional image generation has paved the way for several breakthroughs in image editing, generating stock photos and 3-D object generation. This continues to be a significant area of interest with the rise of new state-of-the-art methods…

Computer Vision and Pattern Recognition · Computer Science 2022-05-10 Vedant Singh, Surgan Jandial, Ayush Chopra, Siddharth Ramesh, Balaji Krishnamurthy, Vineeth N. Balasubramanian

Generating music with sentiment using Transformer-GANs

The field of Automatic Music Generation has seen significant progress thanks to the advent of Deep Learning. However, most of these results have been produced by unconditional models, which lack the ability to interact with their users, not…

Sound · Computer Science 2022-12-22 Pedro Neves, Jose Fornari, João Florindo

Audio Generation with Multiple Conditional Diffusion Model

Text-based audio generation models have limitations as they cannot encompass all the information in audio, leading to restricted controllability when relying solely on text. To address this issue, we propose a novel model that enhances the…

Sound · Computer Science 2023-12-29 Zhifang Guo, Jianguo Mao, Rui Tao, Long Yan, Kazushige Ouchi, Hong Liu, Xiangdong Wang

Example-Based Framework for Perceptually Guided Audio Texture Generation

Controllable generation using StyleGANs is usually achieved by training the model using labeled data. For audio textures, however, there is currently a lack of large semantically labeled datasets. Therefore, to control generation, we…

Audio and Speech Processing · Electrical Eng. & Systems 2024-10-08 Purnima Kamath, Chitralekha Gupta, Lonce Wyse, Suranga Nanayakkara

Conditional Hybrid GAN for Sequence Generation

Conditional sequence generation aims to instruct the generation procedure by conditioning the model with additional context information, which is a self-supervised learning issue (a form of unsupervised learning with supervision information…

Artificial Intelligence · Computer Science 2020-09-21 Yi Yu, Abhishek Srivastava, Rajiv Ratn Shah

A Generative Model for Raw Audio Using Transformer Architectures

This paper proposes a novel way of doing audio synthesis at the waveform level using Transformer architectures. We propose a deep neural network for generating waveforms, similar to WaveNet. This is fully probabilistic, auto-regressive, and…

Sound · Computer Science 2021-07-09 Prateek Verma, Chris Chafe

cGANs with Conditional Convolution Layer

Conditional generative adversarial networks (cGANs) have been widely researched to generate class conditional images using a single generator. However, in the conventional cGANs techniques, it is still challenging for the generator to learn…

Computer Vision and Pattern Recognition · Computer Science 2020-04-09 Min-Cheol Sagong, Yong-Goo Shin, Yoon-Jae Yeo, Seung Park, Sung-Jea Ko

Shape-conditioned Image Generation by Learning Latent Appearance Representation from Unpaired Data

Conditional image generation is effective for diverse tasks including training data synthesis for learning-based computer vision. However, despite the recent advances in generative adversarial networks (GANs), it is still a challenging task…

Computer Vision and Pattern Recognition · Computer Science 2018-11-30 Yutaro Miyauchi, Yusuke Sugano, Yasuyuki Matsushita

ICGAN: An implicit conditioning method for interpretable feature control of neural audio synthesis

Neural audio synthesis methods can achieve high-fidelity and realistic sound generation by utilizing deep generative models. Such models typically rely on external labels which are often discrete as conditioning information to achieve…

Sound · Computer Science 2024-06-12 Yunyi Liu, Craig Jin

Conditional Image Synthesis With Auxiliary Classifier GANs

Synthesizing high resolution photorealistic images has been a long-standing challenge in machine learning. In this paper we introduce new methods for the improved training of generative adversarial networks (GANs) for image synthesis. We…

Machine Learning · Statistics 2017-07-24 Augustus Odena, Christopher Olah, Jonathon Shlens

SonicDiffusion: Audio-Driven Image Generation and Editing with Pretrained Diffusion Models

We are witnessing a revolution in conditional image synthesis with the recent success of large scale text-to-image generation methods. This success also opens up new opportunities in controlling the generation and editing process using…

Computer Vision and Pattern Recognition · Computer Science 2024-05-03 Burak Can Biner, Farrin Marouf Sofian, Umur Berkay Karakaş, Duygu Ceylan, Erkut Erdem, Aykut Erdem