Related papers: Multi-Task Audio Source Separation

200 papers

In short video and live broadcasts, speech, singing voice, and background music often overlap and obscure each other. This complexity creates difficulties in structuring and recognizing the audio content, which may impair subsequent ASR and…

Sound · Computer Science 2024-04-18 Ye Bai, Chenxing Li, Hao Li, Yuanyuan Zhao, Xiaorui Wang

Universal source separation aims to separate the audio sources of an arbitrary mix, removing the constraint to operate on a specific domain like speech or music. Yet, the potential of universal source separation is limited because most…

Sound · Computer Science 2023-10-03 Jordi Pons, Xiaoyu Liu, Santiago Pascual, Joan Serrà

Several attempts have been made to handle multiple source separation tasks such as speech enhancement, speech separation, sound event separation, music source separation (MSS), or cinematic audio source separation (CASS) with a single…

Audio and Speech Processing · Electrical Eng. & Systems 2024-11-01 Kohei Saijo, Janek Ebbers, François G. Germain, Gordon Wichern, Jonathan Le Roux

A main challenge in applying deep learning to music processing is the availability of training data. One potential solution is Multi-task Learning, in which the model also learns to solve related auxiliary tasks on additional datasets to…

Sound · Computer Science 2018-04-06 Daniel Stoller, Sebastian Ewert, Simon Dixon

Music source separation (MSS) is the task of separating a music piece into individual sources, such as vocals and accompaniment. Recently, neural network based methods have been applied to address the MSS problem, and can be categorized…

Sound · Computer Science 2021-02-22 Xuchen Song, Qiuqiang Kong, Xingjian Du, Yuxuan Wang

Monaural source separation is important for many real world applications. It is challenging because, with only a single channel of information available, without any constraints, an infinite number of solutions are possible. In this paper,…

Sound · Computer Science 2015-10-02 Po-Sen Huang, Minje Kim, Mark Hasegawa-Johnson, Paris Smaragdis

Music demixing is the task of separating different tracks from the given single audio signal into components, such as drums, bass, and vocals from the rest of the accompaniment. Separation of sources is useful for a range of areas,…

Sound · Computer Science 2024-05-08 Roman Solovyev, Alexander Stempkovskiy, Tatiana Habruseva

This paper deals with the problem of audio source separation. To handle the complex and ill-posed nature of the problems of audio source separation, the current state-of-the-art approaches employ deep neural networks to obtain instrumental…

Sound · Computer Science 2017-06-30 Naoya Takahashi, Yuki Mitsufuji

Music source separation (MSS) aims to separate mixed music into its distinct tracks, such as vocals, bass, drums, and more. MSS is considered to be a challenging audio separation task due to the complexity of music signals. Although the RNN…

Sound · Computer Science 2024-09-16 Jinglin Bai, Yuan Fang, Jiajie Wang, Xueliang Zhang

Deep neural networks have become an indispensable technique for audio source separation (ASS). It was recently reported that a variant of CNN architecture called MMDenseNet was successfully employed to solve the ASS problem of estimating…

Sound · Computer Science 2018-05-30 Naoya Takahashi, Nabarun Goswami, Yuki Mitsufuji

Audio source separation is a difficult machine learning problem and performance is measured by comparing extracted signals with the component source signals. However, if separation is motivated by the ultimate goal of re-mixing then…

Sound · Computer Science 2015-05-05 Andrew J. R. Simpson, Gerard Roma, Mark D. Plumbley

Cinematic audio source separation (CASS), as a problem of extracting the dialogue, music, and effects stems from their mixture, is a relatively new subtask of audio source separation. To date, only one publicly available dataset exists for…

Audio and Speech Processing · Electrical Eng. & Systems 2024-08-27 Karn N. Watcharasupat, Chih-Wei Wu, Iroro Orife

Propelled by the breakthrough in deep generative models, audio-to-image generation has emerged as a pivotal cross-modal task that converts complex auditory signals into rich visual representations. However, previous works only focus on…

Sound · Computer Science 2025-12-11 Hao Zhou, Xiaobao Guo, Yuzhe Zhu, Adams Wai-Kin Kong

Supervised deep learning approaches to underdetermined audio source separation achieve state-of-the-art performance but require a dataset of mixtures along with their corresponding isolated source signals. Such datasets can be extremely…

We consider the problem of audio voice separation for binaural applications, such as earphones and hearing aids. While today's neural networks perform remarkably well (separating $4+$ sources with 2 microphones) they assume a known or fixed…

Sound · Computer Science 2022-07-18 Zhongweiyang Xu, Romit Roy Choudhury

Recent audio-visual generative models have made substantial progress in generating images from audio. However, existing approaches focus on generating images from single-class audio and fail to generate images from mixed audio. To address…

Computer Vision and Pattern Recognition · Computer Science 2025-04-28 Minjae Kang, Martim Brandão

Music source separation is a core task in music information retrieval which has seen a dramatic improvement in the past years. Nevertheless, most of the existing systems focus exclusively on the problem of source separation itself and…

Audio and Speech Processing · Electrical Eng. & Systems 2020-08-04 Yun-Ning Hung, Alexander Lerch

Speech data collected in real-world scenarios often encounters two issues. First, multiple sources may exist simultaneously, and the number of sources may vary with time. Second, the existence of background noise in recording is inevitable.…

Sound · Computer Science 2020-05-21 Yuan-Kuei Wu, Chao-I Tuan, Hung-yi Lee, Yu Tsao

Music source separation (MSS) is a task that involves isolating individual sound sources, or stems, from mixed audio signals. This paper presents an ensemble approach to MSS, combining several state-of-the-art architectures to achieve…

Sound · Computer Science 2024-10-29 Saarth Vardhan, Pavani R Acharya, Samarth S Rao, Oorjitha Ratna Jasthi, S Natarajan

Recent breakthroughs in language-queried audio source separation (LASS) have shown that generative models can achieve higher separation audio quality than traditional masking-based approaches. However, two key limitations restrict their…
