Related papers: General-Purpose Speech Representation Learning thr…

200 papers

Representation learning plays a critical role in the analysis of time series data and has high practical value across a wide range of applications, including trend analysis, time series data retrieval, and forecasting. In practice, data…

Machine Learning · Computer Science 2023-12-13 Chengyang Ye, Qiang Ma

A lot of the recent success in natural language processing (NLP) has been driven by distributed vector representations of words trained on large amounts of text in an unsupervised manner. These representations are typically used as general…

Computation and Language · Computer Science 2018-04-03 Sandeep Subramanian, Adam Trischler, Yoshua Bengio, Christopher J Pal

This paper proposes a novel unsupervised autoregressive neural model for learning generic speech representations. In contrast to other speech representation learning methods that aim to remove noise or speaker variabilities, ours is…

Computation and Language · Computer Science 2019-06-20 Yu-An Chung, Wei-Ning Hsu, Hao Tang, James Glass

We propose a self-supervised learning method using multiple sampling strategies to obtain general-purpose audio representation. Multiple sampling strategies are used in the proposed method to construct contrastive losses from different…

Sound · Computer Science 2025-05-27 Ibuki Kuroyanagi, Tatsuya Komatsu

Single-channel speech enhancement is utilized in various tasks to mitigate the effect of interfering signals. Conventionally, to ensure the speech enhancement performs optimally, the speech enhancement has needed to be tuned for each task.…

Audio and Speech Processing · Electrical Eng. & Systems 2025-07-11 Hiroshi Sato, Tsubasa Ochiai, Marc Delcroix, Takafumi Moriya, Takanori Ashihara, Ryo Masumura

Speech foundation models trained with self-supervised learning produce generic speech representations that support a wide range of speech processing tasks. When further adapted with supervised learning, these models can achieve strong…

Computation and Language · Computer Science 2026-03-10 Maryem Bouziane, Salima Mdhaffar, Yannick Estève

Multilingual automatic speech recognition (ASR) systems have garnered attention for their potential to extend language coverage globally. While self-supervised learning (SSL) models, like MMS, have demonstrated their effectiveness in…

Computation and Language · Computer Science 2024-04-30 Hongfei Xue, Qijie Shao, Kaixun Huang, Peikun Chen, Jie Liu, Lei Xie

For enhancing noisy signals, machine-learning based single-channel speech enhancement schemes exploit prior knowledge about typical speech spectral structures. To ensure a good generalization and to meet requirements in terms of…

Sound · Computer Science 2018-01-17 Robert Rehr, Timo Gerkmann

Self-supervised learning enables the training of large neural models without the need for large, labeled datasets. It has been generating breakthroughs in several fields, including computer vision, natural language processing, biology, and…

Computation and Language · Computer Science 2023-12-19 Luis Lugo, Valentin Vielzeuf

Multiscale feature hierarchies have seen success in computer vision. This further motivates researchers to design multiscale Transformers for natural language processing, mostly based on the self-attention mechanism.…

Computation and Language · Computer Science 2022-06-22 Bei Li, Tong Zheng, Yi Jing, Chengbo Jiao, Tong Xiao, Jingbo Zhu

Although supervised deep learning has revolutionized speech and audio processing, it has necessitated the building of specialist models for individual tasks and application scenarios. It is likewise difficult to apply this to dialects and…

Self-supervised learning (SSL) models have achieved considerable improvements in automatic speech recognition (ASR). In addition, ASR performance could be further improved if the model is dedicated to audio content information learning…

Audio and Speech Processing · Electrical Eng. & Systems 2022-12-08 Genshun Wan, Tan Liu, Hang Chen, Jia Pan, Cong Liu, Zhongfu Ye

Universal speech enhancement aims at handling inputs with various speech distortions and recording conditions. In this work, we propose a novel hybrid architecture that synergizes the signal fidelity of discriminative modeling with the…

Sound · Computer Science 2026-01-28 Yinghao Liu, Chengwei Liu, Xiaotao Liang, Haoyin Yan, Shaofei Xue, Zheng Xue

Multi-task learning is a popular machine learning approach that enables simultaneous learning of multiple related tasks, improving algorithmic efficiency and effectiveness. In the hard parameter sharing approach, an encoder shared through…

Machine Learning · Statistics 2024-09-26 Seokwon Shin, Hyungrok Do, Youngdoo Son

Existing studies on self-supervised speech representation learning have focused on developing new training methods and applying pre-trained models for different applications. However, the quality of these models is often measured by the…

Audio and Speech Processing · Electrical Eng. & Systems 2024-01-18 Alexander H. Liu, Sung-Lin Yeh, James Glass

Recovering masked speech frames is widely applied in speech representation learning. However, most of these models use random masking in pre-training. In this work, we propose two kinds of masking approaches: (1) speech-level…

Sound · Computer Science 2022-10-26 Xulong Zhang, Jianzong Wang, Ning Cheng, Kexin Zhu, Jing Xiao

Sustainable artificial intelligence focuses on data, hardware, and algorithms to make machine learning models more environmentally responsible. In particular, machine learning models for speech representations are computationally expensive,…

Computation and Language · Computer Science 2024-06-13 Luis Lugo, Valentin Vielzeuf

Speech representation and modelling in high-dimensional spaces of acoustic waveforms, or a linear transformation thereof, is investigated with the aim of improving the robustness of automatic speech recognition to additive noise. The…

Computation and Language · Computer Science 2015-03-31 Matthew Ager, Zoran Cvetkovic, Peter Sollich

Generative modeling has recently shown great promise in computer vision, but it has mostly focused on synthesizing visually realistic images. In this paper, motivated by multi-task learning of shareable feature representations, we consider…

Computer Vision and Pattern Recognition · Computer Science 2021-06-28 Zhipeng Bao, Martial Hebert, Yu-Xiong Wang

Deep neural networks trained on Functional Connectivity (FC) networks extracted from functional Magnetic Resonance Imaging (fMRI) data have gained popularity due to the increasing availability of data and advances in model architectures,…

Machine Learning · Computer Science 2023-12-05 Jungwon Choi, Seongho Keum, EungGu Yun, Byung-Hoon Kim, Juho Lee