Related papers: Star Temporal Classification: Sequence Classificat…

200 papers

Semi-supervised learning has demonstrated promising results in automatic speech recognition (ASR) by self-training using a seed ASR model with pseudo-labels generated for unlabeled data. The effectiveness of this approach largely relies on…

Machine Learning · Computer Science 2021-02-17 Niko Moritz, Takaaki Hori, Jonathan Le Roux

Connectionist Temporal Classification has recently attracted a lot of interest as it offers an elegant approach to building acoustic models (AMs) for speech recognition. The CTC loss function maps an input sequence of observable feature…

Computation and Language · Computer Science 2017-08-16 Thomas Zenkel, Ramon Sanabria, Florian Metze, Jan Niehues, Matthias Sperber, Sebastian Stüker, Alex Waibel

Connectionist temporal classification (CTC) is commonly adopted for sequence modeling tasks like speech recognition, where it is necessary to preserve order between the input and target sequences. However, CTC is only applied to…

Machine Learning · Computer Science 2023-12-18 Zheng Nan, Ting Dang, Vidhyasaharan Sethu, Beena Ahmed

We report an extension of a Keras Model, called CTCModel, to perform the Connectionist Temporal Classification (CTC) in a transparent way. Combined with Recurrent Neural Networks, the Connectionist Temporal Classification is the reference…

Machine Learning · Computer Science 2019-01-24 Yann Soullard, Cyprien Ruffino, Thierry Paquet

Segmental conditional random fields (SCRFs) and connectionist temporal classification (CTC) are two sequence labeling methods used for end-to-end training of speech recognition models. Both models define a transcription probability by…

Computation and Language · Computer Science 2017-06-07 Liang Lu, Lingpeng Kong, Chris Dyer, Noah A. Smith

Semi-Supervised Text Classification (SSTC) methods mainly work in the spirit of self-training. They initialize the deep classifier by training over labeled texts, and then alternately predict unlabeled texts as their pseudo-labels and train…

Machine Learning · Computer Science 2026-03-24 Changchun Li, Ximing Li, Bingjie Zhang, Wenting Wang, Jihong Ouyang

Connectionist temporal classification (CTC) has matured as an alignment-free approach to sequence transduction and shows competitive performance for end-to-end speech recognition. In the CTC topology, the blank symbol occupies more than half of the state…

Computation and Language · Computer Science 2019-12-11 Taiyang Zhao

Text recognition methods are gaining rapid development. Some advanced techniques, e.g., powerful modules, language models, and un- and semi-supervised learning schemes, consecutively push the performance on public benchmarks forward.…

Computer Vision and Pattern Recognition · Computer Science 2024-01-01 Ziyin Zhang, Ning Lu, Minghui Liao, Yongshuai Huang, Cheng Li, Min Wang, Wei Peng

We present sparse topical coding (STC), a non-probabilistic formulation of topic models for discovering latent representations of large collections of data. Unlike probabilistic topic models, STC relaxes the normalization constraint of…

Machine Learning · Computer Science 2012-02-20 Jun Zhu, Eric P. Xing

Multi-label image recognition is a fundamental yet practical task because real-world images inherently possess multiple semantic labels. However, it is difficult to collect large-scale multi-label annotations due to the complexity of both…

Computer Vision and Pattern Recognition · Computer Science 2022-03-07 Tianshui Chen, Tao Pu, Hefeng Wu, Yuan Xie, Liang Lin

Connectionist temporal classification (CTC) is a popular sequence prediction approach for automatic speech recognition that is typically used with models based on recurrent neural networks (RNNs). We explore whether deep convolutional…

Computation and Language · Computer Science 2018-02-16 Kalpesh Krishna, Liang Lu, Kevin Gimpel, Karen Livescu

This paper presents a novel algorithm for building an automatic speech recognition (ASR) model with imperfect training data. Imperfectly transcribed speech is a prevalent issue in human-annotated speech corpora, which degrades the…

Computation and Language · Computer Science 2023-06-05 Dongji Gao, Matthew Wiesner, Hainan Xu, Leibny Paola Garcia, Daniel Povey, Sanjeev Khudanpur

Many machine learning problems such as speech recognition, gesture recognition, and handwriting recognition are concerned with simultaneous segmentation and labeling of sequence data. Latent-dynamic conditional random field (LDCRF) is a…

Machine Learning · Computer Science 2016-09-07 Amir Ahooye Atashin, Kamaledin Ghiasi-Shirazi, Ahad Harati

The success of self-attention in NLP has led to recent applications in end-to-end encoder-decoder architectures for speech recognition. Separately, connectionist temporal classification (CTC) has matured as an alignment-free,…

Audio and Speech Processing · Electrical Eng. & Systems 2019-07-02 Julian Salazar, Katrin Kirchhoff, Zhiheng Huang

Audio tagging aims to predict one or several labels in an audio clip. Many previous works use weakly labelled data (WLD) for audio tagging, where only presence or absence of sound events is known, but the order of sound events is unknown.…

Sound · Computer Science 2018-08-07 Yuanbo Hou, Qiuqiang Kong, Shengchen Li

Large language models (LLMs) have demonstrated remarkable capabilities across diverse tasks. However, the truthfulness of their outputs is not guaranteed, and their tendency toward overconfidence further limits reliability. Uncertainty…

Computation and Language · Computer Science 2026-03-23 Qi Cao, Andrew Gambardella, Takeshi Kojima, Yutaka Matsuo, Yusuke Iwasawa

Code-Switching (CS) remains a challenge for Automatic Speech Recognition (ASR), especially character-based models. With the combined choice of characters from multiple languages, the outcome from character-based models suffers from phoneme…

Audio and Speech Processing · Electrical Eng. & Systems 2021-06-24 Burin Naowarat, Thananchai Kongthaworn, Korrawe Karunratanakul, Sheng Hui Wu, Ekapol Chuangsuwanich

Stochastic finite automata arise naturally in many language and speech processing tasks. They include stochastic acceptors, which represent certain probability distributions over random strings. We consider the problem of efficient…

Computation and Language · Computer Science 2019-09-24 Martin Jansche, Alexander Gutkin

This research identifies a gap in weakly-labelled multivariate time-series classification (TSC), where state-of-the-art TSC models do not perform well. Weakly labelled time-series are time-series containing noise and significant…

Machine Learning · Computer Science 2021-09-20 Surayez Rahman, Chang Wei Tan

This paper presents novel Weighted Finite-State Transducer (WFST) topologies to implement Connectionist Temporal Classification (CTC)-like algorithms for automatic speech recognition. Three new CTC variants are proposed: (1) the…

Audio and Speech Processing · Electrical Eng. & Systems 2022-09-27 Aleksandr Laptev, Somshubra Majumdar, Boris Ginsburg