English
Related papers

Related papers: Dynamic Multi-scale Convolution for Dialect Identi…

200 papers

Convolutional neural networks (CNNs), such as the time-delay neural network (TDNN), have shown their remarkable capability in learning speaker embedding. However, they meanwhile bring a huge computational cost in storage size, processing,…

Audio and Speech Processing · Electrical Eng. & Systems 2022-06-22 Rui Wang , Zhihua Wei , Haoran Duan , Shouling Ji , Yang Long , Zhen Hong

Light-weight convolutional neural networks (CNNs) suffer performance degradation as their low computational budgets constrain both the depth (number of convolution layers) and the width (number of channels) of CNNs, resulting in limited…

Computer Vision and Pattern Recognition · Computer Science 2020-04-02 Yinpeng Chen , Xiyang Dai , Mengchen Liu , Dongdong Chen , Lu Yuan , Zicheng Liu

Dynamic convolution enhances model capacity by adaptively combining multiple kernels, yet faces critical trade-offs: prior works either (1) incur significant parameter overhead by scaling kernel numbers linearly, (2) compromise inference…

Computer Vision and Pattern Recognition · Computer Science 2025-11-18 Haiduo Huang , Yadong Zhang , Yinghui Xu , Pengju Ren

In image denoising networks, feature scaling is widely used to enlarge the receptive field size and reduce computational costs. This practice, however, also leads to the loss of high-frequency information and fails to consider within-scale…

Computer Vision and Pattern Recognition · Computer Science 2023-04-04 Hao Shen , Zhong-Qiu Zhao , Wandi Zhang

A number of recent studies have shown that a Deep Convolutional Neural Network (DCNN) pretrained on a large dataset can be adopted as a universal image description which leads to astounding performance in many visual classification tasks.…

Computer Vision and Pattern Recognition · Computer Science 2014-12-01 Lingqiao Liu , Chunhua Shen , Anton van den Hengel

State-of-the-art speaker verification frameworks have typically focused on developing models with increasingly deeper (more layers) and wider (number of channels) models to improve their verification performance. Instead, this paper…

Sound · Computer Science 2023-02-28 Anna Ollerenshaw , Md Asif Jalal , Thomas Hain

Deep Convolutional Neural Networks (DCNNs) commonly use generic `max-pooling' (MP) layers to extract deformation-invariant features, but we argue in favor of a more refined treatment. First, we introduce epitomic convolution as a building…

Computer Vision and Pattern Recognition · Computer Science 2014-12-02 George Papandreou , Iasonas Kokkinos , Pierre-André Savalle

Hierarchical transformers have achieved significant success in medical image segmentation due to their large receptive field and capabilities of effectively leveraging global long-range contextual information. Convolutional neural networks…

Image and Video Processing · Electrical Eng. & Systems 2024-10-18 Jin Yang , Peijie Qiu , Yichi Zhang , Daniel S. Marcus , Aristeidis Sotiras

This paper presents a novel Dialect Identification (DID) system developed for the Fifth Edition of the Multi-Genre Broadcast challenge, the task of Fine-grained Arabic Dialect Identification (MGB-5 ADI Challenge). The system improves upon…

Audio and Speech Processing · Electrical Eng. & Systems 2019-12-20 Xiaoxiao Miao , Ian McLoughlin

Tasks that involve high-resolution dense prediction require a modeling of both local and global patterns in a large input field. Although the local and global structures often depend on each other and their simultaneous modeling is…

Computer Vision and Pattern Recognition · Computer Science 2021-06-10 Naoya Takahashi , Yuki Mitsufuji

Both the Dictionary Learning (DL) and Convolutional Neural Networks (CNN) are powerful image representation learning systems based on different mechanisms and principles, however whether we can seamlessly integrate them to improve the…

Computer Vision and Pattern Recognition · Computer Science 2020-01-16 Zhao Zhang , Yulin Sun , Yang Wang , Zhengjun Zha , Shuicheng Yan , Meng Wang

Deep learning techniques have become prominent in modern fault diagnosis for complex processes. In particular, convolutional neural networks (CNNs) have shown an appealing capacity to deal with multivariate time-series data by converting…

Computer Vision and Pattern Recognition · Computer Science 2022-10-04 Saif S. S. Al-Wahaibi , Qiugang Lu

The ability to accurately represent sentences is central to language understanding. We describe a convolutional architecture dubbed the Dynamic Convolutional Neural Network (DCNN) that we adopt for the semantic modelling of sentences. The…

Computation and Language · Computer Science 2014-04-09 Nal Kalchbrenner , Edward Grefenstette , Phil Blunsom

In speaker verification, traditional models often emphasize modeling long-term contextual features to capture global speaker characteristics. However, this approach can neglect fine-grained voiceprint information, which contains highly…

Sound · Computer Science 2025-05-07 Ya Li , Bin Zhou , Bo Hu

Frequency dynamic convolution (FDY conv) has shown the state-of-the-art performance in sound event detection (SED) using frequency-adaptive kernels obtained by frequency-varying combination of basis kernels. However, FDY conv lacks an…

Audio and Speech Processing · Electrical Eng. & Systems 2024-06-11 Hyeonuk Nam , Seong-Hu Kim , Deokki Min , Junhyeok Lee , Yong-Hwa Park

The Deep Convolutional Neural Networks (CNNs) have obtained a great success for pattern recognition, such as recognizing the texts in images. But existing CNNs based frameworks still have several drawbacks: 1) the traditaional pooling…

Computer Vision and Pattern Recognition · Computer Science 2020-01-20 Zhao Zhang , Zemin Tang , Zheng Zhang , Yang Wang , Jie Qin , Meng Wang

Recent research in deep learning-based Sound Event Detection (SED) has primarily focused on Convolutional Recurrent Neural Networks (CRNNs) and Transformer models. However, conventional 2D convolution-based models assume shift invariance…

Audio and Speech Processing · Electrical Eng. & Systems 2025-06-17 Hyeonuk Nam

Motivated by the fact that characteristics of different sound classes are highly diverse in different temporal scales and hierarchical levels, a novel deep convolutional neural network (CNN) architecture is proposed for the environmental…

Sound · Computer Science 2018-06-15 Boqing Zhu , Kele Xu , Dezhi Wang , Lilun Zhang , Bo Li , Yuxing Peng

Recent studies have shown that a Deep Convolutional Neural Network (DCNN) pretrained on a large image dataset can be used as a universal image descriptor, and that doing so leads to impressive performance for a variety of image…

Computer Vision and Pattern Recognition · Computer Science 2016-12-23 Lingqiao Liu , Chunhua Shen , Anton van den Hengel

Deep learning-based speech enhancement methods have significantly improved speech quality and intelligibility. Convolutional neural networks (CNNs) have been proven to be essential components of many high-performance models. In this paper,…

Audio and Speech Processing · Electrical Eng. & Systems 2025-11-11 Dahan Wang , Xiaobin Rong , Shiruo Sun , Yuxiang Hu , Changbao Zhu , Jing Lu
‹ Prev 1 2 3 10 Next ›