Related papers: LEAN: Light and Efficient Audio Classification Net…

200 papers

Audio pattern recognition is an important research topic in the machine learning area, and includes several tasks such as audio tagging, acoustic scene classification, music classification, speech emotion classification and sound event…

Sound · Computer Science 2020-08-25 Qiuqiang Kong, Yin Cao, Turab Iqbal, Yuxuan Wang, Wenwu Wang, Mark D. Plumbley

In this paper, we show that ImageNet-pretrained standard deep CNN models can be used as strong baseline networks for audio classification. Even though there is a significant difference between audio spectrogram and standard ImageNet image…

Computer Vision and Pattern Recognition · Computer Science 2020-11-17 Kamalesh Palanisamy, Dipika Singhania, Angela Yao

Speech enhancement is the task of improving the intelligibility and perceptual quality of degraded speech signals. Recently, neural network based methods have been applied to speech enhancement. However, many neural network based methods…

Sound · Computer Science 2021-02-22 Qiuqiang Kong, Haohe Liu, Xingjian Du, Li Chen, Rui Xia, Yuxuan Wang

In audio classification, differentiable auditory filterbanks with few parameters cover the middle ground between hard-coded spectrograms and raw audio. LEAF (arXiv:2101.08596), a Gabor-based filterbank combined with Per-Channel Energy…

Sound · Computer Science 2022-07-13 Jan Schlüter, Gerald Gutenbrunner

In computer vision, convolutional neural networks (CNNs) such as ConvNeXt have been able to surpass state-of-the-art transformers, partly thanks to depthwise separable convolutions (DSC). DSC, as an approximation of the regular convolution,…

Mel-filterbanks are fixed, engineered audio features which emulate human perception and have been used through the history of audio understanding up to today. However, their undeniable qualities are counterbalanced by the fundamental…

Sound · Computer Science 2021-01-22 Neil Zeghidour, Olivier Teboul, Félix de Chaumont Quitry, Marco Tagliasacchi

This study assesses deep learning models for audio classification in a clinical setting with the constraint of small datasets reflecting real-world prospective data collection. We analyze CNNs, including DenseNet and ConvNeXt, alongside…

Sounds carry an abundance of information about activities and events in our everyday environment, such as traffic noise, road works, music, or people talking. Recent machine learning methods, such as convolutional neural networks (CNNs),…

Sound · Computer Science 2023-05-31 Arshdeep Singh, Haohe Liu, Mark D. Plumbley

Audio tagging is the task of predicting the presence or absence of sound classes within an audio clip. Previous work in audio tagging focused on relatively small datasets limited to recognising a small number of sound classes. We…

Sound · Computer Science 2019-12-11 Qiuqiang Kong, Changsong Yu, Turab Iqbal, Yong Xu, Wenwu Wang, Mark D. Plumbley

This work addresses the need for enhanced accuracy and efficiency in speech command recognition systems, a critical component for improving user interaction in various smart applications. Leveraging the robust pretrained YAMNet model and…

Sound · Computer Science 2025-04-29 Sidahmed Lachenani, Hamza Kheddar, Mohamed Ouldzmirli

In this paper, we propose a multi-level attention model to solve the weakly labelled audio classification problem. The objective of audio classification is to predict the presence or absence of audio events in an audio clip. Recently,…

Audio and Speech Processing · Electrical Eng. & Systems 2018-03-08 Changsong Yu, Karim Said Barsim, Qiuqiang Kong, Bin Yang

Audio pattern recognition (APR) is an important research topic and can be applied to several fields related to our lives. Therefore, accurate and efficient APR systems need to be developed as they are useful in real applications. In this…

Sound · Computer Science 2022-07-21 Sergey Verbitskiy, Vladimir Berikov, Viacheslav Vyshegorodtsev

Recognizing sounds is a key aspect of computational audio scene analysis and machine perception. In this paper, we advocate that sound recognition is inherently a multi-modal audiovisual task in that it is easier to differentiate sounds…

Audio and Speech Processing · Electrical Eng. & Systems 2020-06-03 Haytham M. Fayek, Anurag Kumar

Audio classification is the task of identifying the sound categories that are associated with a given audio signal. This paper presents an investigation on large-scale audio classification based on the recently released AudioSet database.…

Sound · Computer Science 2018-10-31 Yuzhong Wu, Tan Lee

After its sweeping success in vision and language tasks, pure attention-based neural architectures (e.g. DeiT) are emerging to the top of audio tagging (AT) leaderboards, which seemingly obsoletes traditional convolutional neural networks…

Sound · Computer Science 2022-08-25 Juncheng B Li, Shuhui Qu, Po-Yao Huang, Florian Metze

Deep audio classification, traditionally cast as training a deep neural network on top of mel-filterbanks in a supervised fashion, has recently benefited from two independent lines of work. The first one explores "learnable frontends",…

Sound · Computer Science 2022-03-30 Sarthak Yadav, Neil Zeghidour

Audio classification aims at recognizing audio signals, including speech commands or sound events. However, current audio classifiers are susceptible to perturbations and adversarial attacks. In addition, real-world audio classification…

Sound · Computer Science 2024-03-28 Sayanton V. Dibbo, Juston S. Moore, Garrett T. Kenyon, Michael A. Teti

We propose a method that quantifies the importance, namely relevance, of audio segments for classification in weakly-labelled problems. It works by drawing information from a set of class-wise one-vs-all classifiers. By selecting the…

Audio and Speech Processing · Electrical Eng. & Systems 2019-11-13 Juliano Henrique Foleiss, Tiago Fernandes Tavares

Audio tagging is an active research area and has a wide range of applications. Since the release of AudioSet, great progress has been made in advancing model performance, which mostly comes from the development of novel model architectures…

Sound · Computer Science 2021-11-18 Yuan Gong, Yu-An Chung, James Glass

Convolutional Neural Networks (CNNs) have proven very effective in image classification and show promise for audio. We use various CNN architectures to classify the soundtracks of a dataset of 70M training videos (5.24 million hours) with…
