Related papers: Star Temporal Classification: Sequence Classificat…

Semi-Supervised Speech Recognition via Graph-based Temporal Classification

Semi-supervised learning has demonstrated promising results in automatic speech recognition (ASR) by self-training using a seed ASR model with pseudo-labels generated for unlabeled data. The effectiveness of this approach largely relies on…

Machine Learning · Computer Science · 2021-02-17 · Niko Moritz, Takaaki Hori, Jonathan Le Roux

Comparison of Decoding Strategies for CTC Acoustic Models

Connectionist Temporal Classification has recently attracted a lot of interest as it offers an elegant approach to building acoustic models (AMs) for speech recognition. The CTC loss function maps an input sequence of observable feature…

Computation and Language · Computer Science · 2017-08-16 · Thomas Zenkel, Ramon Sanabria, Florian Metze, Jan Niehues, Matthias Sperber, Sebastian Stüker, Alex Waibel

Variational Connectionist Temporal Classification for Order-Preserving Sequence Modeling

Connectionist temporal classification (CTC) is commonly adopted for sequence modeling tasks like speech recognition, where it is necessary to preserve order between the input and target sequences. However, CTC is only applied to…

Machine Learning · Computer Science · 2023-12-18 · Zheng Nan, Ting Dang, Vidhyasaharan Sethu, Beena Ahmed

CTCModel: a Keras Model for Connectionist Temporal Classification

We report an extension of a Keras Model, called CTCModel, to perform the Connectionist Temporal Classification (CTC) in a transparent way. Combined with Recurrent Neural Networks, the Connectionist Temporal Classification is the reference…

Machine Learning · Computer Science · 2019-01-24 · Yann Soullard, Cyprien Ruffino, Thierry Paquet

Multitask Learning with CTC and Segmental CRF for Speech Recognition

Segmental conditional random fields (SCRFs) and connectionist temporal classification (CTC) are two sequence labeling methods used for end-to-end training of speech recognition models. Both models define a transcription probability by…

Computation and Language · Computer Science · 2017-06-07 · Liang Lu, Lingpeng Kong, Chris Dyer, Noah A. Smith

Semi-Supervised Learning with Balanced Deep Representation Distributions

Semi-Supervised Text Classification (SSTC) methods mainly work in the spirit of self-training. They initialize the deep classifier by training over labeled texts, and then alternately predict pseudo-labels for unlabeled texts and train…

Machine Learning · Computer Science · 2026-03-24 · Changchun Li, Ximing Li, Bingjie Zhang, Wenting Wang, Jihong Ouyang

A Novel Topology for End-to-end Temporal Classification and Segmentation with Recurrent Neural Network

Connectionist temporal classification (CTC) has matured as an alignment-free approach to sequence transduction and shows competitive performance for end-to-end speech recognition. In the CTC topology, the blank symbol occupies more than half of the state…

Computation and Language · Computer Science · 2019-12-11 · Taiyang Zhao

Self-distillation Regularized Connectionist Temporal Classification Loss for Text Recognition: A Simple Yet Effective Approach

Text recognition methods are gaining rapid development. Some advanced techniques, e.g., powerful modules, language models, and un- and semi-supervised learning schemes, consecutively push the performance on public benchmarks forward.…

Computer Vision and Pattern Recognition · Computer Science · 2024-01-01 · Ziyin Zhang, Ning Lu, Minghui Liao, Yongshuai Huang, Cheng Li, Min Wang, Wei Peng

Sparse Topical Coding

We present sparse topical coding (STC), a non-probabilistic formulation of topic models for discovering latent representations of large collections of data. Unlike probabilistic topic models, STC relaxes the normalization constraint of…

Machine Learning · Computer Science · 2012-02-20 · Jun Zhu, Eric P. Xing

Structured Semantic Transfer for Multi-Label Recognition with Partial Labels

Multi-label image recognition is a fundamental yet practical task because real-world images inherently possess multiple semantic labels. However, it is difficult to collect large-scale multi-label annotations due to the complexity of both…

Computer Vision and Pattern Recognition · Computer Science · 2022-03-07 · Tianshui Chen, Tao Pu, Hefeng Wu, Yuan Xie, Liang Lin

A Study of All-Convolutional Encoders for Connectionist Temporal Classification

Connectionist temporal classification (CTC) is a popular sequence prediction approach for automatic speech recognition that is typically used with models based on recurrent neural networks (RNNs). We explore whether deep convolutional…

Computation and Language · Computer Science · 2018-02-16 · Kalpesh Krishna, Liang Lu, Kevin Gimpel, Karen Livescu

Bypass Temporal Classification: Weakly Supervised Automatic Speech Recognition with Imperfect Transcripts

This paper presents a novel algorithm for building an automatic speech recognition (ASR) model with imperfect training data. Imperfectly transcribed speech is a prevalent issue in human-annotated speech corpora, which degrades the…

Computation and Language · Computer Science · 2023-06-05 · Dongji Gao, Matthew Wiesner, Hainan Xu, Leibny Paola Garcia, Daniel Povey, Sanjeev Khudanpur

Training LDCRF model on unsegmented sequences using Connectionist Temporal Classification

Many machine learning problems such as speech recognition, gesture recognition, and handwriting recognition are concerned with simultaneous segmentation and labeling of sequence data. Latent-dynamic conditional random field (LDCRF) is a…

Machine Learning · Computer Science · 2016-09-07 · Amir Ahooye Atashin, Kamaledin Ghiasi-Shirazi, Ahad Harati

Self-Attention Networks for Connectionist Temporal Classification in Speech Recognition

The success of self-attention in NLP has led to recent applications in end-to-end encoder-decoder architectures for speech recognition. Separately, connectionist temporal classification (CTC) has matured as an alignment-free,…

Audio and Speech Processing · Electrical Eng. & Systems · 2019-07-02 · Julian Salazar, Katrin Kirchhoff, Zhiheng Huang

Audio Tagging With Connectionist Temporal Classification Model Using Sequential Labelled Data

Audio tagging aims to predict one or several labels in an audio clip. Many previous works use weakly labelled data (WLD) for audio tagging, where only presence or absence of sound events is known, but the order of sound events is unknown.…

Sound · Computer Science · 2018-08-07 · Yuanbo Hou, Qiuqiang Kong, Shengchen Li

Semantic Token Clustering for Efficient Uncertainty Quantification in Large Language Models

Large language models (LLMs) have demonstrated remarkable capabilities across diverse tasks. However, the truthfulness of their outputs is not guaranteed, and their tendency toward overconfidence further limits reliability. Uncertainty…

Computation and Language · Computer Science · 2026-03-23 · Qi Cao, Andrew Gambardella, Takeshi Kojima, Yutaka Matsuo, Yusuke Iwasawa

Reducing Spelling Inconsistencies in Code-Switching ASR using Contextualized CTC Loss

Code-Switching (CS) remains a challenge for Automatic Speech Recognition (ASR), especially character-based models. With the combined choice of characters from multiple languages, the outcome from character-based models suffers from phoneme…

Audio and Speech Processing · Electrical Eng. & Systems · 2021-06-24 · Burin Naowarat, Thananchai Kongthaworn, Korrawe Karunratanakul, Sheng Hui Wu, Ekapol Chuangsuwanich

Sampling from Stochastic Finite Automata with Applications to CTC Decoding

Stochastic finite automata arise naturally in many language and speech processing tasks. They include stochastic acceptors, which represent certain probability distributions over random strings. We consider the problem of efficient…

Computation and Language · Computer Science · 2019-09-24 · Martin Jansche, Alexander Gutkin

Classification of multivariate weakly-labelled time-series with attention

This research identifies a gap in weakly-labelled multivariate time-series classification (TSC), where state-of-the-art TSC models do not perform well. Weakly labelled time-series are time-series containing noise and significant…

Machine Learning · Computer Science · 2021-09-20 · Surayez Rahman, Chang Wei Tan

CTC Variations Through New WFST Topologies

This paper presents novel Weighted Finite-State Transducer (WFST) topologies to implement Connectionist Temporal Classification (CTC)-like algorithms for automatic speech recognition. Three new CTC variants are proposed: (1) the…

Audio and Speech Processing · Electrical Eng. & Systems · 2022-09-27 · Aleksandr Laptev, Somshubra Majumdar, Boris Ginsburg