Related papers: MDCNN-SID: Multi-scale Dilated Convolution Network…

200 papers

Flamenco singing is characterized by pitch instability, micro-tonal ornamentations, large vibrato ranges, and a high degree of melodic variability. These musical features make the automatic identification of flamenco singers a difficult…

Audio and Speech Processing · Electrical Eng. & Systems 2020-08-04 Aitor Arronte Alvarez , Francisco Gomez-Martin

We present a multi-modal Deep Neural Network (DNN) approach for bird song identification. The presented approach takes both audio samples and metadata as input. The audio is fed into a Convolutional Neural Network (CNN) using four…

Sound · Computer Science 2018-11-13 Botond Fazekas , Alexander Schindler , Thomas Lidy , Andreas Rauber

Metaverse has stretched the real world into unlimited space. There will be more live concerts in Metaverse. The task of singer identification is to identify the song belongs to which singer. However, there has been a tough problem in singer…

Sound · Computer Science 2022-05-25 Xulong Zhang , Jianzong Wang , Ning Cheng , Jing Xiao

We propose a flexible framework that deals with both singer conversion and singers vocal technique conversion. The proposed model is trained on non-parallel corpora, accommodates many-to-many conversion, and leverages recent advances of…

Audio and Speech Processing · Electrical Eng. & Systems 2020-02-26 Yin-Jyun Luo , Chin-Chen Hsu , Kat Agres , Dorien Herremans

The present paper describes singing voice synthesis based on convolutional neural networks (CNNs). Singing voice synthesis systems based on deep neural networks (DNNs) are currently being proposed and are improving the naturalness of…

Audio and Speech Processing · Electrical Eng. & Systems 2020-04-23 Kazuhiro Nakamura , Shinji Takaki , Kei Hashimoto , Keiichiro Oura , Yoshihiko Nankaku , Keiichi Tokuda

Deep neural networks with convolutional layers usually process the entire spectrogram of an audio signal with the same time-frequency resolutions, number of filters, and dimensionality reduction scale. According to the constant-Q transform,…

Sound · Computer Science 2019-10-22 Emad M. Grais , Fei Zhao , Mark D. Plumbley

Music source separation involves a large input field to model a long-term dependence of an audio signal. Previous convolutional neural network (CNN)-based approaches address the large input field modeling using sequentially down- and…

Audio and Speech Processing · Electrical Eng. & Systems 2021-03-30 Naoya Takahashi , Yuki Mitsufuji

Recent approaches for music source separation are almost exclusively based on deep neural networks, mostly employing recurrent neural networks (RNNs). Although RNNs are in many cases superior than other types of deep neural networks for…

Audio and Speech Processing · Electrical Eng. & Systems 2020-07-08 Pyry Pyykkönen , Stylianos I. Mimilakis , Konstantinos Drossos , Tuomas Virtanen

Previous attempts at music artist classification use frame level audio features which summarize frequency content within short intervals of time. Comparatively, more recent music information retrieval tasks take advantage of temporal…

Sound · Computer Science 2019-03-18 Zain Nasrullah , Yue Zhao

The present paper describes a singing voice synthesis based on convolutional neural networks (CNNs). Singing voice synthesis systems based on deep neural networks (DNNs) are currently being proposed and are improving the naturalness of…

Audio and Speech Processing · Electrical Eng. & Systems 2019-06-26 Kazuhiro Nakamura , Kei Hashimoto , Keiichiro Oura , Yoshihiko Nankaku , Keiichi Tokuda

Professional vocalists modulate their voice timbre or pitch to make their vocal performance more expressive. Such fluctuations are called singing techniques. Automatic detection of singing techniques from audio tracks can be beneficial to…

Sound · Computer Science 2023-06-27 Yuya Yamamoto , Juhan Nam , Hiroko Terasawa

Cover song identification represents a challenging task in the field of Music Information Retrieval (MIR) due to complex musical variations between query tracks and cover versions. Previous works typically utilize hand-crafted features and…

Multimedia · Computer Science 2019-11-04 Zhesong Yu , Xiaoshuo Xu , Xiaoou Chen , Deshun Yang

Previous approaches in singer identification have used one of monophonic vocal tracks or mixed tracks containing multiple instruments, leaving a semantic gap between these two domains of audio. In this paper, we present a system to learn a…

Sound · Computer Science 2019-06-27 Kyungyun Lee , Juhan Nam

In this paper, we study the issue of automatic singer identification (SID) in popular music recordings, which aims to recognize who sang a given piece of song. The main challenge for this investigation lies in the fact that a singer's…

Sound · Computer Science 2021-02-23 Xulong Zhang , Jiale Qian , Yi Yu , Yifu Sun , Wei Li

A new musical instrument classification method using convolutional neural networks (CNNs) is presented in this paper. Unlike the traditional methods, we investigated a scheme for classifying musical instruments using the learned features…

Sound · Computer Science 2015-12-24 Taejin Park , Taejin Lee

Time Delay Neural Networks (TDNN)-based methods are widely used in dialect identification. However, in previous work with TDNN application, subtle variant is being neglected in different feature scales. To address this issue, we propose a…

Computation and Language · Computer Science 2021-08-18 Tianlong Kong , Shouyi Yin , Dawei Zhang , Wang Geng , Xin Wang , Dandan Song , Jinwen Huang , Huiyu Shi , Xiaorui Wang

Convolutional neural networks (CNN) recently gained notable attraction in a variety of machine learning tasks: including music classification and style tagging. In this work, we propose implementing intermediate connections to the CNN…

Sound · Computer Science 2019-06-18 Nima Hamidi , Mohsen Vahidzadeh , Stephen Baek

Deep learning is progressively gaining popularity as a viable alternative to i-vectors for speaker recognition. Promising results have been recently obtained with Convolutional Neural Networks (CNNs) when fed by raw speech samples directly.…

Audio and Speech Processing · Electrical Eng. & Systems 2019-08-12 Mirco Ravanelli , Yoshua Bengio

Significant strides have been made in creating voice identity representations using speech data. However, the same level of progress has not been achieved for singing voices. To bridge this gap, we suggest a framework for training singer…

Sound · Computer Science 2024-01-11 Bernardo Torres , Stefan Lattner , Gaël Richard

State-of-the-art text-independent speaker verification systems typically use cepstral features or filter bank energies as speech features. Recent studies attempted to extract speaker embeddings directly from raw waveforms and have shown…

Audio and Speech Processing · Electrical Eng. & Systems 2021-06-10 Ge Zhu , Fei Jiang , Zhiyao Duan