Related papers: Conditional WaveGAN

200 papers

In this paper, we propose a novel variational generator framework for conditional GANs to catch semantic details for improving the generation quality and diversity. Traditional generators in conditional GANs simply concatenate the…

Computer Vision and Pattern Recognition · Computer Science 2019-09-24 Mingqi Hu, Deyu Zhou, Yulan He

In recent years, image generation has shown a great leap in performance, where diffusion models play a central role. Although generating high-quality images, such models are mainly conditioned on textual descriptions. This begs the…

Sound · Computer Science 2023-05-23 Guy Yariv, Itai Gat, Lior Wolf, Yossi Adi, Idan Schwartz

We tackle the problem of generating audio samples conditioned on descriptive text captions. In this work, we propose AudioGen, an auto-regressive generative model that generates audio samples conditioned on text inputs. AudioGen operates…

Audio signals are sampled at high temporal resolutions, and learning to synthesize audio requires capturing structure across a range of timescales. Generative adversarial networks (GANs) have seen wide success at generating images that are…

Sound · Computer Science 2019-02-12 Chris Donahue, Julian McAuley, Miller Puckette

This paper adapts a StyleGAN model for speech generation with minimal or no conditioning on text. StyleGAN is a multi-scale convolutional GAN capable of hierarchically capturing data structure and latent variation on multiple spatial (or…

Audio and Speech Processing · Electrical Eng. & Systems 2020-09-17 Kasperi Palkama, Lauri Juvela, Alexander Ilin

Although deep learning has achieved appealing results on several machine learning tasks, most of the models are deterministic at inference, limiting their application to single-modal settings. We propose a novel general-purpose framework…

Machine Learning · Computer Science 2020-10-12 Sameera Ramasinghe, Kanchana Ranasinghe, Salman Khan, Nick Barnes, Stephen Gould

In recent years, Generative Adversarial Networks (GANs) have seen significant advancements, leading to their widespread adoption across various fields. The original GAN architecture enables the generation of images without any specific…

Machine Learning · Computer Science 2024-09-04 Anis Bourou, Valérie Mezger, Auguste Genovesio

Analysing music in the field of machine learning is a very difficult problem with numerous constraints to consider. The nature of audio data, with its very high dimensionality and widely varying scales of structure, is one of the primary…

Sound · Computer Science 2022-05-17 Tracy Qian, Jackson Kaunismaa, Tony Chung

Generative models have thrived in computer vision, enabling unprecedented image processes. Yet the results in audio remain less advanced. Our project targets real-time sound synthesis from a reduced set of high-level parameters, including…

Sound · Computer Science 2019-06-25 Adrien Bitton, Philippe Esling, Antoine Caillon, Martin Fouilleul

Conditional image generation has paved the way for several breakthroughs in image editing, generating stock photos and 3-D object generation. This continues to be a significant area of interest with the rise of new state-of-the-art methods…

Computer Vision and Pattern Recognition · Computer Science 2022-05-10 Vedant Singh, Surgan Jandial, Ayush Chopra, Siddharth Ramesh, Balaji Krishnamurthy, Vineeth N. Balasubramanian

The field of Automatic Music Generation has seen significant progress thanks to the advent of Deep Learning. However, most of these results have been produced by unconditional models, which lack the ability to interact with their users, not…

Sound · Computer Science 2022-12-22 Pedro Neves, Jose Fornari, João Florindo

Text-based audio generation models have limitations as they cannot encompass all the information in audio, leading to restricted controllability when relying solely on text. To address this issue, we propose a novel model that enhances the…

Sound · Computer Science 2023-12-29 Zhifang Guo, Jianguo Mao, Rui Tao, Long Yan, Kazushige Ouchi, Hong Liu, Xiangdong Wang

Controllable generation using StyleGANs is usually achieved by training the model using labeled data. For audio textures, however, there is currently a lack of large semantically labeled datasets. Therefore, to control generation, we…

Audio and Speech Processing · Electrical Eng. & Systems 2024-10-08 Purnima Kamath, Chitralekha Gupta, Lonce Wyse, Suranga Nanayakkara

Conditional sequence generation aims to instruct the generation procedure by conditioning the model with additional context information, which is a self-supervised learning issue (a form of unsupervised learning with supervision information…

Artificial Intelligence · Computer Science 2020-09-21 Yi Yu, Abhishek Srivastava, Rajiv Ratn Shah

This paper proposes a novel way of doing audio synthesis at the waveform level using Transformer architectures. We propose a deep neural network for generating waveforms, similar to WaveNet. This is fully probabilistic, auto-regressive, and…

Sound · Computer Science 2021-07-09 Prateek Verma, Chris Chafe

Conditional generative adversarial networks (cGANs) have been widely researched to generate class conditional images using a single generator. However, in the conventional cGANs techniques, it is still challenging for the generator to learn…

Computer Vision and Pattern Recognition · Computer Science 2020-04-09 Min-Cheol Sagong, Yong-Goo Shin, Yoon-Jae Yeo, Seung Park, Sung-Jea Ko

Conditional image generation is effective for diverse tasks including training data synthesis for learning-based computer vision. However, despite the recent advances in generative adversarial networks (GANs), it is still a challenging task…

Computer Vision and Pattern Recognition · Computer Science 2018-11-30 Yutaro Miyauchi, Yusuke Sugano, Yasuyuki Matsushita

Neural audio synthesis methods can achieve high-fidelity and realistic sound generation by utilizing deep generative models. Such models typically rely on external labels which are often discrete as conditioning information to achieve…

Sound · Computer Science 2024-06-12 Yunyi Liu, Craig Jin

Synthesizing high resolution photorealistic images has been a long-standing challenge in machine learning. In this paper we introduce new methods for the improved training of generative adversarial networks (GANs) for image synthesis. We…

Machine Learning · Statistics 2017-07-24 Augustus Odena, Christopher Olah, Jonathon Shlens

We are witnessing a revolution in conditional image synthesis with the recent success of large scale text-to-image generation methods. This success also opens up new opportunities in controlling the generation and editing process using…

Computer Vision and Pattern Recognition · Computer Science 2024-05-03 Burak Can Biner, Farrin Marouf Sofian, Umur Berkay Karakaş, Duygu Ceylan, Erkut Erdem, Aykut Erdem