Related papers: Hyperbolic Audio Source Separation

Hyperbolic Additive Margin Softmax with Hierarchical Information for Speaker Verification

Speaker embedding learning based on Euclidean space has achieved significant progress, but it is still insufficient in modeling hierarchical information within speaker features. Hyperbolic space, with its negative curvature geometric…

Sound · Computer Science 2026-04-29 Zhihua Fang, Liang He

Hyperbolic Distance-Based Speech Separation

In this work, we explore the task of hierarchical distance-based speech separation defined on a hyperbolic manifold. Based on the recent advent of audio-related tasks performed in non-Euclidean spaces, we propose to make use of the…

Audio and Speech Processing · Electrical Eng. & Systems 2024-01-09 Darius Petermann, Minje Kim

Improving Universal Sound Separation Using Sound Classification

Deep learning approaches have recently achieved impressive performance on both audio source separation and sound classification. Most audio source separation approaches focus only on separating sources belonging to a restricted domain of…

Sound · Computer Science 2021-05-14 Efthymios Tzinis, Scott Wisdom, John R. Hershey, Aren Jansen, Daniel P. W. Ellis

Hyperbolic Diffusion Embedding and Distance for Hierarchical Representation Learning

Finding meaningful representations and distances of hierarchical data is important in many fields. This paper presents a new method for hierarchical data embedding and distance. Our method relies on combining diffusion geometry, a central…

Machine Learning · Computer Science 2023-05-31 Ya-Wei Eileen Lin, Ronald R. Coifman, Gal Mishne, Ronen Talmon

Hyperbolic Distance Matrices

Hyperbolic space is a natural setting for mining and visualizing data with hierarchical structure. In order to compute a hyperbolic embedding from comparison or similarity information, one has to solve a hyperbolic distance geometry…

Machine Learning · Computer Science 2020-09-14 Puoya Tabaghi, Ivan Dokmanić

Unsupervised Music Source Separation Using Differentiable Parametric Source Models

Supervised deep learning approaches to underdetermined audio source separation achieve state-of-the-art performance but require a dataset of mixtures along with their corresponding isolated source signals. Such datasets can be extremely…

Sound · Computer Science 2023-02-01 Kilian Schulze-Forster, Gaël Richard, Liam Kelley, Clement S. J. Doire, Roland Badeau

SynthSOD: Developing an Heterogeneous Dataset for Orchestra Music Source Separation

Recent advancements in music source separation have significantly progressed, particularly in isolating vocals, drums, and bass elements from mixed tracks. These developments owe much to the creation and use of large-scale, multitrack…

Audio and Speech Processing · Electrical Eng. & Systems 2025-02-18 Jaime Garcia-Martinez, David Diaz-Guerra, Archontis Politis, Tuomas Virtanen, Julio J. Carabias-Orti, Pedro Vera-Candeas

Learning to Separate Voices by Spatial Regions

We consider the problem of audio voice separation for binaural applications, such as earphones and hearing aids. While today's neural networks perform remarkably well (separating $4+$ sources with 2 microphones) they assume a known or fixed…

Sound · Computer Science 2022-07-18 Zhongweiyang Xu, Romit Roy Choudhury

Disentangling speech from surroundings with neural embeddings

We present a method to separate speech signals from noisy environments in the embedding space of a neural audio codec. We introduce a new training procedure that allows our model to produce structured encodings of audio waveforms given by…

Sound · Computer Science 2023-06-06 Ahmed Omran, Neil Zeghidour, Zalán Borsos, Félix de Chaumont Quitry, Malcolm Slaney, Marco Tagliasacchi

Deep clustering: Discriminative embeddings for segmentation and separation

We address the problem of acoustic source separation in a deep learning framework we call "deep clustering." Rather than directly estimating signals or masking functions, we train a deep network to produce spectrogram embeddings that are…

Neural and Evolutionary Computing · Computer Science 2015-08-19 John R. Hershey, Zhuo Chen, Jonathan Le Roux, Shinji Watanabe

Sound Localization and Separation in Three-dimensional Space Using a Single Microphone with a Metamaterial Enclosure

Conventional approaches to sound localization and separation are based on microphone arrays in artificial systems. Inspired by the selective perception of human auditory system, we design a multi-source listening system which can separate…

Sound · Computer Science 2019-11-11 Xuecong Sun, Han Jia, Zhe Zhang, Yuzhen Yang, Zhaoyong Sun, Jun Yang

Audio-Visual Separation with Hierarchical Fusion and Representation Alignment

Self-supervised audio-visual source separation leverages natural correlations between audio and vision modalities to separate mixed audio signals. In this work, we first systematically analyse the performance of existing multimodal fusion…

Multimedia · Computer Science 2025-10-10 Han Hu, Dongheng Lin, Qiming Huang, Yuqi Hou, Hyung Jin Chang, Jianbo Jiao

Unsupervised Single-Channel Speech Separation with a Diffusion Prior under Speaker-Embedding Guidance

Speech separation is a fundamental task in audio processing, typically addressed with fully supervised systems trained on paired mixtures. While effective, such systems typically rely on synthetic data pipelines, which may not reflect…

Audio and Speech Processing · Electrical Eng. & Systems 2025-09-30 Runwu Shi, Kai Li, Chang Li, Jiang Wang, Sihan Tan, Kazuhiro Nakadai

Hyperbolic Vision Transformers: Combining Improvements in Metric Learning

Metric learning aims to learn a highly discriminative model encouraging the embeddings of similar classes to be close in the chosen metrics and pushed apart for dissimilar ones. The common recipe is to use an encoder to extract embeddings…

Computer Vision and Pattern Recognition · Computer Science 2022-03-23 Aleksandr Ermolov, Leyla Mirvakhabova, Valentin Khrulkov, Nicu Sebe, Ivan Oseledets

Hyperbolic Space with Hierarchical Margin Boosts Fine-Grained Learning from Coarse Labels

Learning fine-grained embeddings from coarse labels is a challenging task due to limited label granularity supervision, i.e., lacking the detailed distinctions required for fine-grained tasks. The task becomes even more demanding when…

Computer Vision and Pattern Recognition · Computer Science 2023-11-21 Shu-Lin Xu, Yifan Sun, Faen Zhang, Anqi Xu, Xiu-Shen Wei, Yi Yang

Class-conditional embeddings for music source separation

Isolating individual instruments in a musical mixture has a myriad of potential applications, and seems imminently achievable given the levels of performance reached by recent deep learning methods. While most musical source separation…

Sound · Computer Science 2018-11-08 Prem Seetharaman, Gordon Wichern, Shrikant Venkataramani, Jonathan Le Roux

A Recurrent Encoder-Decoder Approach with Skip-filtering Connections for Monaural Singing Voice Separation

The objective of deep learning methods based on encoder-decoder architectures for music source separation is to approximate either ideal time-frequency masks or spectral representations of the target music source(s). The spectral…

Sound · Computer Science 2018-04-25 Stylianos Ioannis Mimilakis, Konstantinos Drossos, Tuomas Virtanen, Gerald Schuller

Hyperbolic Embeddings for Order-Aware Classification of Audio Effect Chains

Audio effects (AFXs) are essential tools in music production, frequently applied in chains to shape timbre and dynamics. The order of AFXs in a chain plays a crucial role in determining the final sound, particularly when non-linear (e.g.,…

Sound · Computer Science 2025-07-30 Aogu Wada, Tomohiko Nakamura, Hiroshi Saruwatari

Acoustic Space Learning for Sound Source Separation and Localization on Binaural Manifolds

In this paper we address the problems of modeling the acoustic space generated by a full-spectrum sound source and of using the learned model for the localization and separation of multiple sources that simultaneously emit sparse-spectrum…

Sound · Computer Science 2015-02-06 Antoine Deleforge, Florence Forbes, Radu Horaud

Hyperbolic embedding of multilayer networks

Multilayer networks offer a powerful framework for modeling complex systems across diverse domains, effectively capturing multiple types of connections and interdependent subsystems commonly found in real world scenarios. To analyze these…

Social and Information Networks · Computer Science 2026-02-20 Martin Guillemaud, Vera Dinkelacker, Mario Chavez