Related papers: Wavespace: A Highly Explorable Wavetable Generator

Neural Wavetable: a playable wavetable synthesizer using neural networks

We present Neural Wavetable, a proof-of-concept wavetable synthesizer that uses neural networks to generate playable wavetables. The system can produce new, distinct waveforms through the interpolation of traditional wavetables in an…

Sound · Computer Science 2018-11-20 Lamtharn Hantrakul, Li-Chia Yang

WavFlow: Audio Generation in Waveform Space

Modern audio generation predominantly relies on latent-space compression, introducing additional complexity and potential information loss. In this work, we challenge this paradigm with WavFlow, a framework that generates high-fidelity…

Sound · Computer Science 2026-05-19 Feiyan Zhou, Luyuan Wang, Shoufa Chen, Zhe Wang, Zhiheng Liu, Yuren Cong, Xiaohui Zhang, Fanny Yang, Belinda Zeng

A Generative Model for Raw Audio Using Transformer Architectures

This paper proposes a novel way of doing audio synthesis at the waveform level using Transformer architectures. We propose a deep neural network for generating waveforms, similar to WaveNet. This is fully probabilistic, auto-regressive, and…

Sound · Computer Science 2021-07-09 Prateek Verma, Chris Chafe

Assisted Sound Sample Generation with Musical Conditioning in Adversarial Auto-Encoders

Generative models have thrived in computer vision, enabling unprecedented image processes. Yet the results in audio remain less advanced. Our project targets real-time sound synthesis from a reduced set of high-level parameters, including…

Sound · Computer Science 2019-06-25 Adrien Bitton, Philippe Esling, Antoine Caillon, Martin Fouilleul

Conditional WaveGAN

Generative models have been used successfully for image synthesis in recent years. But when it comes to other modalities such as audio and text, little progress has been made. Recent works focus on generating audio from a generative model in an…

Computer Vision and Pattern Recognition · Computer Science 2018-09-30 Chae Young Lee, Anoop Toffy, Gue Jun Jung, Woo-Jin Han

WaveNet: A Generative Model for Raw Audio

This paper introduces WaveNet, a deep neural network for generating raw audio waveforms. The model is fully probabilistic and autoregressive, with the predictive distribution for each audio sample conditioned on all previous ones;…

Sound · Computer Science 2016-09-20 Aaron van den Oord, Sander Dieleman, Heiga Zen, Karen Simonyan, Oriol Vinyals, Alex Graves, Nal Kalchbrenner, Andrew Senior, Koray Kavukcuoglu

Universal audio synthesizer control with normalizing flows

The ubiquity of sound synthesizers has reshaped music production and even entirely defined new music genres. However, the increasing complexity and number of parameters in modern synthesizers make them harder to master. Hence, the…

Machine Learning · Computer Science 2019-07-03 Philippe Esling, Naotake Masuda, Adrien Bardet, Romeo Despres, Axel Chemla--Romeu-Santos

Timbre latent space: exploration and creative aspects

Recent studies show the ability of unsupervised models to learn invertible audio representations using Auto-Encoders. They enable high-quality sound synthesis but a limited control since the latent spaces do not disentangle timbre…

Sound · Computer Science 2020-08-18 Antoine Caillon, Adrien Bitton, Brice Gatinet, Philippe Esling

Differentiable Wavetable Synthesis

Differentiable Wavetable Synthesis (DWTS) is a technique for neural audio synthesis which learns a dictionary of one-period waveforms i.e. wavetables, through end-to-end training. We achieve high-fidelity audio synthesis with as little as…

Sound · Computer Science 2022-02-15 Siyuan Shan, Lamtharn Hantrakul, Jitong Chen, Matt Avent, David Trevelyan

Bass Accompaniment Generation via Latent Diffusion

The ability to automatically generate music that appropriately matches an arbitrary input track is a challenging task. We present a novel controllable system for generating single stems to accompany musical mixes of arbitrary length. At the…

Sound · Computer Science 2024-02-05 Marco Pasini, Maarten Grachten, Stefan Lattner

RAVE: A variational autoencoder for fast and high-quality neural audio synthesis

Deep generative models applied to audio have improved by a large margin the state-of-the-art in many speech and music related tasks. However, as raw waveform modelling remains an inherently difficult task, audio generative models are either…

Machine Learning · Computer Science 2021-12-16 Antoine Caillon, Philippe Esling

Controllable Generation of Implied Volatility Surfaces with Variational Autoencoders

This paper presents a deep generative modeling framework for controllably synthesizing implied volatility surfaces (IVSs) using a variational autoencoder (VAE). Unlike conventional data-driven models, our approach provides explicit control…

Computational Finance · Quantitative Finance 2025-09-03 Jing Wang, Shuaiqiang Liu, Cornelis Vuik

Interpretable Timbre Synthesis using Variational Autoencoders Regularized on Timbre Descriptors

Controllable timbre synthesis has been a subject of research for several decades, and deep neural networks have been the most successful in this area. Deep generative models such as Variational Autoencoders (VAEs) have the ability to…

Sound · Computer Science 2023-07-21 Anastasia Natsiou, Luca Longo, Sean O'Leary

Easily generating and absorbing waves using machine learning

High-order wave-making theories are becoming available but are limited to certain ranges of waves and wavemaker types in their applicability. Alternatively, machine learning can be considered to find nonlinear functional relationships.…

Fluid Dynamics · Physics 2022-02-24 Yulin Xie, Xizeng Zhao

Wavetable Synthesis Using CVAE for Timbre Control Based on Semantic Label

Synthesizers are essential in modern music production. However, their complex timbre parameters, often filled with technical terms, require expertise. This research introduces a method of timbre control in wavetable synthesis that is…

Sound · Computer Science 2024-10-25 Tsugumasa Yutani, Yuya Yamamoto, Shuyo Nakatani, Hiroko Terasawa

Controllable and Compositional Generation with Latent-Space Energy-Based Models

Controllable generation is one of the key requirements for successful adoption of deep generative models in real-world applications, but it still remains as a great challenge. In particular, the compositional ability to generate novel…

Computer Vision and Pattern Recognition · Computer Science 2021-12-07 Weili Nie, Arash Vahdat, Anima Anandkumar

WaveNODE: A Continuous Normalizing Flow for Speech Synthesis

In recent years, various flow-based generative models have been proposed to generate high-fidelity waveforms in real-time. However, these models require either a well-trained teacher network or a number of flow steps making them…

Sound · Computer Science 2020-07-06 Hyeongju Kim, Hyeonseung Lee, Woo Hyun Kang, Sung Jun Cheon, Byoung Jin Choi, Nam Soo Kim

Generative timbre spaces: regularizing variational auto-encoders with perceptual metrics

Timbre spaces have been used in music perception to study the perceptual relationships between instruments based on dissimilarity ratings. However, these spaces do not generalize to novel examples and do not provide an invertible mapping,…

Sound · Computer Science 2018-10-02 Philippe Esling, Axel Chemla--Romeu-Santos, Adrien Bitton

Composer Vector: Style-steering Symbolic Music Generation in a Latent Space

Symbolic music generation has made significant progress, yet achieving fine-grained and flexible control over composer style remains challenging. Existing training-based methods for composer style conditioning depend on large labeled…

Sound · Computer Science 2026-04-07 Xunyi Jiang, Mingyang Yao, Jingyue Huang, Julian McAuley

Parallel WaveGAN: A fast waveform generation model based on generative adversarial networks with multi-resolution spectrogram

We propose Parallel WaveGAN, a distillation-free, fast, and small-footprint waveform generation method using a generative adversarial network. In the proposed method, a non-autoregressive WaveNet is trained by jointly optimizing…

Audio and Speech Processing · Electrical Eng. & Systems 2020-02-07 Ryuichi Yamamoto, Eunwoo Song, Jae-Min Kim