Related papers: X-DC: Explainable Deep Clustering based on Learnab…

200 papers

The x-vector based deep neural network (DNN) embedding systems have demonstrated effectiveness for text-independent speaker verification. This paper presents a multi-task learning architecture for training the speaker embedding DNN with the…

Audio and Speech Processing · Electrical Eng. & Systems 2019-04-05 Lanhua You, Wu Guo, Lirong Dai, Jun Du

We address the problem of acoustic source separation in a deep learning framework we call "deep clustering." Rather than directly estimating signals or masking functions, we train a deep network to produce spectrogram embeddings that are…

Neural and Evolutionary Computing · Computer Science 2015-08-19 John R. Hershey, Zhuo Chen, Jonathan Le Roux, Shinji Watanabe

We explore why deep convolutional neural networks (CNNs) with small two-dimensional kernels, primarily used for modeling spatial relations in images, are also effective in speech recognition. We analyze the representations learned by deep…

Computation and Language · Computer Science 2018-11-13 Joanna Rownicka, Peter Bell, Steve Renals

Deep neural networks (DNNs) have been shown to outperform traditional machine learning algorithms in a broad variety of application domains due to their effectiveness in modeling complex problems and handling high-dimensional datasets. Many…

Deep clustering is the first method to handle general audio separation scenarios with multiple sources of the same type and an arbitrary number of sources, performing impressively in speaker-independent speech separation tasks. However,…

Machine Learning · Statistics 2017-11-30 Yi Luo, Zhuo Chen, John R. Hershey, Jonathan Le Roux, Nima Mesgarani

The so-called black-box deep learning (DL) models are increasingly used in classification tasks across many scientific disciplines, including the wireless communications domain. In this trend, supervised DL models appear as most commonly…

Machine Learning · Computer Science 2025-08-04 Ljupcho Milosheski, Gregor Cerar, Blaž Bertalanič, Carolina Fortuna, Mihael Mohorčič

Deep clustering (DC) and utterance-level permutation invariant training (uPIT) have shown promise for speaker-independent speech separation. DC is usually formulated as a two-step process: embedding learning and embedding…

Sound · Computer Science 2019-07-24 Cunhang Fan, Bin Liu, Jianhua Tao, Jiangyan Yi, Zhengqi Wen

We propose a novel method to explain trained deep neural networks (DNNs), by distilling them into surrogate models using unsupervised clustering. Our method can be applied flexibly to any subset of layers of a DNN architecture and can…

Computer Vision and Pattern Recognition · Computer Science 2020-07-17 Yu-han Liu, Sercan O. Arik

Traditional clustering methods often perform clustering with low-level indiscriminative representations and ignore relationships between patterns, resulting in slight achievements in the era of deep learning. To handle this problem, we…

Machine Learning · Computer Science 2019-05-07 Jianlong Chang, Yiwen Guo, Lingfeng Wang, Gaofeng Meng, Shiming Xiang, Chunhong Pan

Monaural speech dereverberation is a very challenging task because no spatial cues can be used. When additive noise is present, the task becomes even more challenging. In this paper, we propose a joint training method for simultaneous speech…

Audio and Speech Processing · Electrical Eng. & Systems 2020-04-07 Cunhang Fan, Jianhua Tao, Bin Liu, Jiangyan Yi, Zhengqi Wen

The ability of deep learning (DL) to improve the practice of medicine and its clinical outcomes faces a looming obstacle: model interpretation. Without description of how outputs are generated, a collaborating physician can neither resolve…

Machine Learning · Computer Science 2020-06-30 Christopher Snyder, Sriram Vishwanath

Modulation recognition is an important task in radio signal processing. Most current research focuses on supervised learning. However, in many real scenarios, it is difficult and costly to obtain signal labels. In this letter,…

Signal Processing · Electrical Eng. & Systems 2021-07-27 Qi Xuan, Xiaohui Li, Zhuangzhi Chen, Dongwei Xu, Shilian Zheng, Xiaoniu Yang

Deep neural networks (DNNs) have achieved impressive predictive performance due to their ability to learn complex, non-linear relationships between variables. However, the inability to effectively visualize these relationships has led to…

Machine Learning · Computer Science 2019-01-17 Chandan Singh, W. James Murdoch, Bin Yu

Deep neural networks (DNNs) can successfully process and classify speech utterances. However, understanding the reason behind a classification made by a DNN is difficult. One such debugging method used with image classification DNNs is…

Machine Learning · Computer Science 2019-07-09 Bilal Soomro, Anssi Kanervisto, Trung Ngo Trong, Ville Hautamäki

Learning robust speaker embeddings is a crucial step in speaker diarization. Deep neural networks can accurately capture speaker-discriminative characteristics, and popular deep embeddings such as x-vectors are nowadays a fundamental…

Audio and Speech Processing · Electrical Eng. & Systems 2021-09-14 Nauman Dawalatabad, Mirco Ravanelli, François Grondin, Jenthe Thienpondt, Brecht Desplanques, Hwidong Na

Recent advancements in machine learning and signal processing domains have resulted in an extensive surge of interest in Deep Neural Networks (DNNs) due to their unprecedented performance and high accuracy for different and challenging…

Machine Learning · Computer Science 2021-02-04 Atefeh Shahroudnejad

This paper presents an improved deep embedding learning method based on convolutional neural network (CNN) for text-independent speaker verification. Two improvements are proposed for x-vector embedding learning: (1) Multi-scale convolution…

Audio and Speech Processing · Electrical Eng. & Systems 2020-01-15 Bin Gu, Wu Guo

In this paper, we propose an iterative framework for self-supervised speaker representation learning based on a deep neural network (DNN). The framework starts with training a self-supervision speaker embedding network by maximizing…

Audio and Speech Processing · Electrical Eng. & Systems 2020-10-29 Danwei Cai, Weiqing Wang, Ming Li

Deep neural network (DNN) techniques have become pervasive in domains such as natural language processing and computer vision. They have achieved great success in these domains in tasks such as machine translation and image generation. Due…

Sound · Computer Science 2023-06-21 Peter Ochieng

Recent studies have been revisiting whole words as the basic modelling unit in speech recognition and query applications, instead of phonetic units. Such whole-word segmental systems rely on a function that maps a variable-length speech…

Computation and Language · Computer Science 2016-01-11 Herman Kamper, Weiran Wang, Karen Livescu