Related papers: Hyperbolic Audio Source Separation

200 papers

Speaker embedding learning based on Euclidean space has achieved significant progress, but it is still insufficient in modeling hierarchical information within speaker features. Hyperbolic space, with its negative curvature geometric…

Sound · Computer Science 2026-04-29 Zhihua Fang, Liang He

In this work, we explore the task of hierarchical distance-based speech separation defined on a hyperbolic manifold. Based on the recent advent of audio-related tasks performed in non-Euclidean spaces, we propose to make use of the…

Audio and Speech Processing · Electrical Eng. & Systems 2024-01-09 Darius Petermann, Minje Kim

Deep learning approaches have recently achieved impressive performance on both audio source separation and sound classification. Most audio source separation approaches focus only on separating sources belonging to a restricted domain of…

Sound · Computer Science 2021-05-14 Efthymios Tzinis, Scott Wisdom, John R. Hershey, Aren Jansen, Daniel P. W. Ellis

Finding meaningful representations and distances of hierarchical data is important in many fields. This paper presents a new method for hierarchical data embedding and distance. Our method relies on combining diffusion geometry, a central…

Machine Learning · Computer Science 2023-05-31 Ya-Wei Eileen Lin, Ronald R. Coifman, Gal Mishne, Ronen Talmon

Hyperbolic space is a natural setting for mining and visualizing data with hierarchical structure. In order to compute a hyperbolic embedding from comparison or similarity information, one has to solve a hyperbolic distance geometry…

Machine Learning · Computer Science 2020-09-14 Puoya Tabaghi, Ivan Dokmanić

Supervised deep learning approaches to underdetermined audio source separation achieve state-of-the-art performance but require a dataset of mixtures along with their corresponding isolated source signals. Such datasets can be extremely…

Recent advancements in music source separation have significantly progressed, particularly in isolating vocals, drums, and bass elements from mixed tracks. These developments owe much to the creation and use of large-scale, multitrack…

Audio and Speech Processing · Electrical Eng. & Systems 2025-02-18 Jaime Garcia-Martinez, David Diaz-Guerra, Archontis Politis, Tuomas Virtanen, Julio J. Carabias-Orti, Pedro Vera-Candeas

We consider the problem of audio voice separation for binaural applications, such as earphones and hearing aids. While today's neural networks perform remarkably well (separating $4+$ sources with 2 microphones) they assume a known or fixed…

Sound · Computer Science 2022-07-18 Zhongweiyang Xu, Romit Roy Choudhury

We present a method to separate speech signals from noisy environments in the embedding space of a neural audio codec. We introduce a new training procedure that allows our model to produce structured encodings of audio waveforms given by…

We address the problem of acoustic source separation in a deep learning framework we call "deep clustering." Rather than directly estimating signals or masking functions, we train a deep network to produce spectrogram embeddings that are…

Neural and Evolutionary Computing · Computer Science 2015-08-19 John R. Hershey, Zhuo Chen, Jonathan Le Roux, Shinji Watanabe

Conventional approaches to sound localization and separation are based on microphone arrays in artificial systems. Inspired by the selective perception of human auditory system, we design a multi-source listening system which can separate…

Sound · Computer Science 2019-11-11 Xuecong Sun, Han Jia, Zhe Zhang, Yuzhen Yang, Zhaoyong Sun, Jun Yang

Self-supervised audio-visual source separation leverages natural correlations between audio and vision modalities to separate mixed audio signals. In this work, we first systematically analyse the performance of existing multimodal fusion…

Multimedia · Computer Science 2025-10-10 Han Hu, Dongheng Lin, Qiming Huang, Yuqi Hou, Hyung Jin Chang, Jianbo Jiao

Speech separation is a fundamental task in audio processing, typically addressed with fully supervised systems trained on paired mixtures. While effective, such systems typically rely on synthetic data pipelines, which may not reflect…

Audio and Speech Processing · Electrical Eng. & Systems 2025-09-30 Runwu Shi, Kai Li, Chang Li, Jiang Wang, Sihan Tan, Kazuhiro Nakadai

Metric learning aims to learn a highly discriminative model encouraging the embeddings of similar classes to be close in the chosen metrics and pushed apart for dissimilar ones. The common recipe is to use an encoder to extract embeddings…

Computer Vision and Pattern Recognition · Computer Science 2022-03-23 Aleksandr Ermolov, Leyla Mirvakhabova, Valentin Khrulkov, Nicu Sebe, Ivan Oseledets

Learning fine-grained embeddings from coarse labels is a challenging task due to limited label granularity supervision, i.e., lacking the detailed distinctions required for fine-grained tasks. The task becomes even more demanding when…

Computer Vision and Pattern Recognition · Computer Science 2023-11-21 Shu-Lin Xu, Yifan Sun, Faen Zhang, Anqi Xu, Xiu-Shen Wei, Yi Yang

Isolating individual instruments in a musical mixture has a myriad of potential applications, and seems imminently achievable given the levels of performance reached by recent deep learning methods. While most musical source separation…

Sound · Computer Science 2018-11-08 Prem Seetharaman, Gordon Wichern, Shrikant Venkataramani, Jonathan Le Roux

The objective of deep learning methods based on encoder-decoder architectures for music source separation is to approximate either ideal time-frequency masks or spectral representations of the target music source(s). The spectral…

Audio effects (AFXs) are essential tools in music production, frequently applied in chains to shape timbre and dynamics. The order of AFXs in a chain plays a crucial role in determining the final sound, particularly when non-linear (e.g.,…

Sound · Computer Science 2025-07-30 Aogu Wada, Tomohiko Nakamura, Hiroshi Saruwatari

In this paper we address the problems of modeling the acoustic space generated by a full-spectrum sound source and of using the learned model for the localization and separation of multiple sources that simultaneously emit sparse-spectrum…

Sound · Computer Science 2015-02-06 Antoine Deleforge, Florence Forbes, Radu Horaud

Multilayer networks offer a powerful framework for modeling complex systems across diverse domains, effectively capturing multiple types of connections and interdependent subsystems commonly found in real world scenarios. To analyze these…

Social and Information Networks · Computer Science 2026-02-20 Martin Guillemaud, Vera Dinkelacker, Mario Chavez