English
Related papers

Related papers: Rethinking CNN Models for Audio Classification

200 papers

This paper has proposed a new baseline deep learning model of more benefits for image classification. Different from the convolutional neural network(CNN) practice where filters are trained by back propagation to represent different…

Computer Vision and Pattern Recognition · Computer Science 2021-08-13 Yifei Li , Kuangyan Song , Yiming Sun , Liao Zhu

Convolutional Neural Networks (CNNs) have proven very effective in image classification and show promise for audio. We use various CNN architectures to classify the soundtracks of a dataset of 70M training videos (5.24 million hours) with…

This study assesses deep learning models for audio classification in a clinical setting with the constraint of small datasets reflecting real-world prospective data collection. We analyze CNNs, including DenseNet and ConvNeXt, alongside…

Pattern recognition from audio signals is an active research topic encompassing audio tagging, acoustic scene classification, music classification, and other areas. Spectrogram and mel-frequency cepstral coefficients (MFCC) are among the…

Audio and Speech Processing · Electrical Eng. & Systems 2022-11-18 Md. Istiaq Ansari , Taufiq Hasan

End-to-end neural network based approaches to audio modelling are generally outperformed by models trained on high-level data representations. In this paper we present preliminary work that shows the feasibility of training the first layers…

Sound · Computer Science 2017-12-04 Tycho Max Sylvester Tax , Jose Luis Diez Antich , Hendrik Purwins , Lars Maaløe

The ability of deep convolutional neural networks (CNN) to learn discriminative spectro-temporal patterns makes them well suited to environmental sound classification. However, the relative scarcity of labeled data has impeded the…

Sound · Computer Science 2017-04-05 Justin Salamon , Juan Pablo Bello

Transfer learning is commonly employed to leverage large, pre-trained models and perform fine-tuning for downstream tasks. The most prevalent pre-trained models are initially trained using ImageNet. However, their ability to generalize can…

The computer vision literature shows that randomly weighted neural networks perform reasonably as feature extractors. Following this idea, we study how non-trained (randomly weighted) convolutional neural networks perform as feature…

Sound · Computer Science 2019-02-18 Jordi Pons , Xavier Serra

Image classification has significantly improved using deep learning. This is mainly due to convolutional neural networks (CNNs) that are capable of learning rich feature extractors from large datasets. However, most deep learning…

Computer Vision and Pattern Recognition · Computer Science 2021-10-06 Xiaoyu Lin , Deblina Bhattacharjee , Majed El Helou , Sabine Süsstrunk

We explore why deep convolutional neural networks (CNNs) with small two-dimensional kernels, primarily used for modeling spatial relations in images, are also effective in speech recognition. We analyze the representations learned by deep…

Computation and Language · Computer Science 2018-11-13 Joanna Rownicka , Peter Bell , Steve Renals

Convolutional Neural Networks (CNNs) do not have a predictable recognition behavior with respect to the input resolution change. This prevents the feasibility of deployment on different input image resolutions for a specific model. To…

Computer Vision and Pattern Recognition · Computer Science 2020-07-14 Duo Li , Anbang Yao , Qifeng Chen

Because large, human-annotated datasets suffer from labeling errors, it is crucial to be able to train deep neural networks in the presence of label noise. While training image classification models with label noise have received much…

Machine Learning · Computer Science 2019-03-19 Ishan Jindal , Daniel Pressel , Brian Lester , Matthew Nokleby

Over the past few years, audio classification task on large-scale dataset such as AudioSet has been an important research area. Several deeper Convolution-based Neural networks have shown compelling performance notably Vggish, YAMNet, and…

Sound · Computer Science 2023-05-23 Shwetank Choudhary , CR Karthik , Punuru Sri Lakshmi , Sumit Kumar

This paper presents our latest investigation on Densely Connected Convolutional Networks (DenseNets) for acoustic modelling (AM) in automatic speech recognition. DenseN-ets are very deep, compact convolutional neural networks, which have…

Computation and Language · Computer Science 2018-08-13 Chia Yu Li , Ngoc Thang Vu

In computer vision, convolutional neural networks (CNN) such as ConvNeXt, have been able to surpass state-of-the-art transformers, partly thanks to depthwise separable convolutions (DSC). DSC, as an approximation of the regular convolution,…

Recent progress in network-based audio event classification has shown the benefit of pre-training models on visual data such as ImageNet. While this process allows knowledge transfer across different domains, training a model on large-scale…

Sound · Computer Science 2021-05-21 Sascha Hornauer , Ke Li , Stella X. Yu , Shabnam Ghaffarzadegan , Liu Ren

Deep learning Convolutional Neural Network (CNN) models are powerful classification models but require a large amount of training data. In niche domains such as bird acoustics, it is expensive and difficult to obtain a large number of…

Computer Vision and Pattern Recognition · Computer Science 2019-09-18 Dina B. Efremova , Mangalam Sankupellay , Dmitry A. Konovalov

Deep neural network (DNN)-based models for environmental sound classification are not robust against a domain to which training data do not belong, that is, out-of-distribution or unseen data. To utilize pretrained models for the unseen…

Acoustic scene classification is an intricate problem for a machine. As an emerging field of research, deep Convolutional Neural Networks (CNN) achieve convincing results. In this paper, we explore the use of multi-scale Dense connected…

Computer Vision and Pattern Recognition · Computer Science 2018-06-13 Dawei Feng , Kele Xu , Haibo Mi , Feifan Liao , Yan Zhou

This study explores the design and application of Complex-Valued Convolutional Neural Networks (CVCNNs) in audio signal processing, with a focus on preserving and utilizing phase information often neglected in real-valued networks. We begin…

Machine Learning · Computer Science 2025-10-14 Naman Agrawal
‹ Prev 1 2 3 10 Next ›