Related papers: Vector-Quantized Timbre Representation

Timbre latent space: exploration and creative aspects

Recent studies show the ability of unsupervised models to learn invertible audio representations using Auto-Encoders. They enable high-quality sound synthesis but a limited control since the latent spaces do not disentangle timbre…

Sound · Computer Science 2020-08-18 Antoine Caillon, Adrien Bitton, Brice Gatinet, Philippe Esling

Interpretable Timbre Synthesis using Variational Autoencoders Regularized on Timbre Descriptors

Controllable timbre synthesis has been a subject of research for several decades, and deep neural networks have been the most successful in this area. Deep generative models such as Variational Autoencoders (VAEs) have the ability to…

Sound · Computer Science 2023-07-21 Anastasia Natsiou, Luca Longo, Sean O'Leary

Learning Disentangled Representations of Timbre and Pitch for Musical Instrument Sounds Using Gaussian Mixture Variational Autoencoders

In this paper, we learn disentangled representations of timbre and pitch for musical instrument sounds. We adapt a framework based on variational autoencoders with Gaussian mixture latent distributions. Specifically, we use two separate…

Machine Learning · Computer Science 2019-07-02 Yin-Jyun Luo, Kat Agres, Dorien Herremans

Generative timbre spaces: regularizing variational auto-encoders with perceptual metrics

Timbre spaces have been used in music perception to study the perceptual relationships between instruments based on dissimilarity ratings. However, these spaces do not generalize to novel examples and do not provide an invertible mapping,…

Sound · Computer Science 2018-10-02 Philippe Esling, Axel Chemla-Romeu-Santos, Adrien Bitton

Pitch-Conditioned Instrument Sound Synthesis From an Interactive Timbre Latent Space

This paper presents a novel approach to neural instrument sound synthesis using a two-stage semi-supervised learning framework capable of generating pitch-accurate, high-quality music samples from an expressive timbre latent space. Existing…

Sound · Computer Science 2025-10-07 Christian Limberg, Fares Schulz, Zhe Zhang, Stefan Weinzierl

Real-time Timbre Remapping with Differentiable DSP

Timbre is a primary mode of expression in diverse musical contexts. However, prevalent audio-driven synthesis methods predominantly rely on pitch and loudness envelopes, effectively flattening timbral expression from the input. Our approach…

Sound · Computer Science 2024-07-08 Jordie Shier, Charalampos Saitis, Andrew Robertson, Andrew McPherson

Timbre Transfer with Variational Auto Encoding and Cycle-Consistent Adversarial Networks

This research project investigates the application of deep learning to timbre transfer, where the timbre of a source audio can be converted to the timbre of a target audio with minimal loss in quality. The adopted approach combines…

Sound · Computer Science 2021-10-12 Russell Sammut Bonnici, Charalampos Saitis, Martin Benning

Timbre Space Representation of a Subtractive Synthesizer

In this study, we produce a geometrically scaled perceptual timbre space from dissimilarity ratings of subtractive synthesized sounds and correlate the resulting dimensions with a set of acoustic descriptors. We curate a set of 15 sounds,…

Sound · Computer Science 2020-09-25 Cyrus Vahidi, George Fazekas, Charalampos Saitis, Alessandro Palladini

A Semantic Timbre Dataset for the Electric Guitar

Understanding and manipulating timbre is central to audio synthesis, yet this remains under-explored in machine learning due to a lack of annotated datasets linking perceptual timbre dimensions to semantic descriptors. We present the…

Sound · Computer Science 2026-03-18 Joseph Cameron, Alan Blackwell

Modulated Variational auto-Encoders for many-to-many musical timbre transfer

Generative models have been successfully applied to image style transfer and domain translation. However, there is still a wide gap in the quality of results when learning such tasks on musical audio. Furthermore, most translation models…

Sound · Computer Science 2018-10-02 Adrien Bitton, Philippe Esling, Axel Chemla-Romeu-Santos

Introducing voice timbre attribute detection

This paper focuses on explaining the timbre conveyed by speech signals and introduces a task termed voice timbre attribute detection (vTAD). In this task, voice timbre is explained with a set of sensory attributes describing its human…

Sound · Computer Science 2025-06-24 Jinghao He, Zhengyan Sheng, Liping Chen, Kong Aik Lee, Zhen-Hua Ling

Voice Timbre Attribute Detection with Compact and Interpretable Training-Free Acoustic Parameters

Voice timbre attribute detection (vTAD) is the task of determining the relative intensity of timbre attributes between speech utterances. Voice timbre is a crucial yet inherently complex component of speech perception. While deep neural…

Audio and Speech Processing · Electrical Eng. & Systems 2026-03-06 Aemon Yat Fei Chiu, Yujia Xiao, Qiuqiang Kong, Tan Lee

Musical Instrument Classification via Low-Dimensional Feature Vectors

Music is a mysterious language that conveys feeling and thoughts via different tones and timbre. For better understanding of timbre in music, we chose music data of 6 representative instruments, analysed their timbre features and classified…

Sound · Computer Science 2022-07-15 Zishuo Zhao, Haoyun Wang

Timbre Perception, Representation, and its Neuroscientific Exploration: A Comprehensive Review

Timbre, the sound's unique "color", is fundamental to how we perceive and appreciate music. This review explores the multifaceted world of timbre perception and representation. It begins by tracing the word's origin, offering an intuitive…

Sound · Computer Science 2024-05-24 Hong Zhang, Jie Lin, Shengxuan Chen

Learning Disentangled Representations for Timber and Pitch in Music Audio

Timbre and pitch are the two main perceptual properties of musical sounds. Depending on the target applications, we sometimes prefer to focus on one of them, while reducing the effect of the other. Researchers have managed to hand-craft…

Sound · Computer Science 2018-11-09 Yun-Ning Hung, Yi-An Chen, Yi-Hsuan Yang

DisMix: Disentangling Mixtures of Musical Instruments for Source-level Pitch and Timbre Manipulation

Existing work on pitch and timbre disentanglement has been mostly focused on single-instrument music audio, excluding the cases where multiple instruments are presented. To fill the gap, we propose DisMix, a generative framework in which…

Sound · Computer Science 2024-08-21 Yin-Jyun Luo, Kin Wai Cheuk, Woosung Choi, Toshimitsu Uesaka, Keisuke Toyama, Koichi Saito, Chieh-Hsin Lai, Yuhta Takida, Wei-Hsiang Liao, Simon Dixon, Yuki Mitsufuji

Hierarchical Timbre-Painting and Articulation Generation

We present a fast and high-fidelity method for music generation, based on specified f0 and loudness, such that the synthesized audio mimics the timbre and articulation of a target instrument. The generation process consists of learned…

Audio and Speech Processing · Electrical Eng. & Systems 2020-09-08 Michael Michelashvili, Lior Wolf

Introducing Latent Timbre Synthesis

We present the Latent Timbre Synthesis (LTS), a new audio synthesis method using Deep Learning. The synthesis method allows composers and sound designers to interpolate and extrapolate between the timbre of multiple sounds using the latent…

Audio and Speech Processing · Electrical Eng. & Systems 2020-11-03 K. Tatar, D. Bisig, P. Pasquier

Combining audio control and style transfer using latent diffusion

Deep generative models are now able to synthesize high-quality audio signals, shifting the critical aspect in their development from audio quality to control capabilities. Although text-to-music generation is getting largely adopted by the…

Sound · Computer Science 2024-08-02 Nils Demerlé, Philippe Esling, Guillaume Doras, David Genova

TimbreTron: A WaveNet(CycleGAN(CQT(Audio))) Pipeline for Musical Timbre Transfer

In this work, we address the problem of musical timbre transfer, where the goal is to manipulate the timbre of a sound sample from one instrument to match another instrument while preserving other musical content, such as pitch, rhythm, and…

Sound · Computer Science 2023-10-24 Sicong Huang, Qiyang Li, Cem Anil, Xuchan Bao, Sageev Oore, Roger B. Grosse