Related papers: Multi-Graph Decoding for Code-Switching ASR

Code-Switching Detection with Data-Augmented Acoustic and Language Models

In this paper, we investigate the code-switching detection performance of a code-switching (CS) automatic speech recognition (ASR) system with data-augmented acoustic and language models. We focus on the recognition of Frisian-Dutch radio…

Computation and Language · Computer Science 2018-08-03 Emre Yılmaz, Henk van den Heuvel, David A. van Leeuwen

Acoustic and Textual Data Augmentation for Improved ASR of Code-Switching Speech

In this paper, we describe several techniques for improving the acoustic and language model of an automatic speech recognition (ASR) system operating on code-switching (CS) speech. We focus on the recognition of Frisian-Dutch radio…

Computation and Language · Computer Science 2018-07-31 Emre Yılmaz, Henk van den Heuvel, David A. van Leeuwen

Semi-supervised acoustic model training for speech with code-switching

In the FAME! project, we aim to develop an automatic speech recognition (ASR) system for Frisian-Dutch code-switching (CS) speech extracted from the archives of a local broadcaster with the ultimate goal of building a spoken document…

Computation and Language · Computer Science 2018-10-24 Emre Yılmaz, Mitchell McLaren, Henk van den Heuvel, David A. van Leeuwen

End-to-End Code-Switching ASR for Low-Resourced Language Pairs

Despite the significant progress in end-to-end (E2E) automatic speech recognition (ASR), E2E ASR for low resourced code-switching (CS) speech has not been well studied. In this work, we describe an E2E ASR pipeline for the recognition of CS…

Computation and Language · Computer Science 2019-10-01 Xianghu Yue, Grandee Lee, Emre Yılmaz, Fang Deng, Haizhou Li

Arabic Code-Switching Speech Recognition using Monolingual Data

Code-switching in automatic speech recognition (ASR) is an important challenge due to globalization. Recent research in multilingual ASR shows potential improvement over monolingual systems. We study key issues related to multilingual…

Computation and Language · Computer Science 2021-07-06 Ahmed Ali, Shammur Chowdhury, Amir Hussein, Yasser Hifny

Towards Zero-Shot Code-Switched Speech Recognition

In this work, we seek to build effective code-switched (CS) automatic speech recognition systems (ASR) under the zero-shot setting where no transcribed CS speech data is available for training. Previously proposed frameworks which…

Computation and Language · Computer Science 2022-11-10 Brian Yan, Matthew Wiesner, Ondrej Klejch, Preethi Jyothi, Shinji Watanabe

Unified model for code-switching speech recognition and language identification based on a concatenated tokenizer

Code-Switching (CS) multilingual Automatic Speech Recognition (ASR) models can transcribe speech containing two or more alternating languages during a conversation. This paper proposes (1) a new method for creating code-switching ASR…

Audio and Speech Processing · Electrical Eng. & Systems 2023-09-19 Kunal Dhawan, Dima Rekesh, Boris Ginsburg

Towards One Model to Rule All: Multilingual Strategy for Dialectal Code-Switching Arabic ASR

With the advent of globalization, there is an increasing demand for multilingual automatic speech recognition (ASR), handling language and dialectal variation of spoken content. Recent studies show its efficacy over monolingual systems. In…

Computation and Language · Computer Science 2021-07-06 Shammur Absar Chowdhury, Amir Hussein, Ahmed Abdelali, Ahmed Ali

Semi-supervised Learning for Code-Switching ASR with Large Language Model Filter

Code-switching (CS) phenomenon occurs when words or phrases from different languages are alternated in a single sentence. Due to data scarcity, building an effective CS Automatic Speech Recognition (ASR) system remains challenging. In this…

Audio and Speech Processing · Electrical Eng. & Systems 2024-09-23 Yu Xi, Wen Ding, Kai Yu, Junjie Lai

Multi-Encoder-Decoder Transformer for Code-Switching Speech Recognition

Code-switching (CS) occurs when a speaker alternates words of two or more languages within a single sentence or across sentences. Automatic speech recognition (ASR) of CS speech has to deal with two or more languages at the same time. In…

Audio and Speech Processing · Electrical Eng. & Systems 2020-06-19 Xinyuan Zhou, Emre Yılmaz, Yanhua Long, Yijie Li, Haizhou Li

Optimizing ASR for Catalan-Spanish Code-Switching: A Comparative Analysis of Methodologies

Code-switching (CS), the alternating use of two or more languages, challenges automatic speech recognition (ASR) due to scarce training data and linguistic similarities. The lack of dedicated CS datasets limits ASR performance, as most…

Computation and Language · Computer Science 2025-07-21 Carlos Mena, Pol Serra, Jacobo Romero, Abir Messaoudi, Jose Giraldo, Carme Armentano-Oller, Rodolfo Zevallos, Ivan Meza, Javier Hernando

Adapting Whisper for Code-Switching through Encoding Refining and Language-Aware Decoding

Code-switching (CS) automatic speech recognition (ASR) faces challenges due to the language confusion resulting from accents, auditory similarity, and seamless language switches. Adaptation on the pre-trained multi-lingual model has shown…

Computation and Language · Computer Science 2025-01-07 Jiahui Zhao, Hao Shi, Chenrui Cui, Tianrui Wang, Hexin Liu, Zhaoheng Ni, Lingxuan Ye, Longbiao Wang

Textual Data Augmentation for Arabic-English Code-Switching Speech Recognition

The pervasiveness of intra-utterance code-switching (CS) in spoken content requires that speech recognition (ASR) systems handle mixed language. Designing a CS-ASR system has many challenges, mainly due to data scarcity, grammatical…

Computation and Language · Computer Science 2023-01-12 Amir Hussein, Shammur Absar Chowdhury, Ahmed Abdelali, Najim Dehak, Ahmed Ali, Sanjeev Khudanpur

Code-Switching Detection Using ASR-Generated Language Posteriors

Code-switching (CS) detection refers to the automatic detection of language switches in code-mixed utterances. This task can be achieved by using a CS automatic speech recognition (ASR) system that can handle such language switches. In our…

Computation and Language · Computer Science 2019-06-20 Qinyi Wang, Emre Yılmaz, Adem Derinel, Haizhou Li

Evaluating Standard and Dialectal Frisian ASR: Multilingual Fine-tuning and Language Identification for Improved Low-resource Performance

Automatic Speech Recognition (ASR) performance for low-resource languages is still far behind that of higher-resource languages such as English, due to a lack of sufficient labeled data. State-of-the-art methods deploy self-supervised…

Computation and Language · Computer Science 2025-02-10 Reihaneh Amooie, Wietse de Vries, Yun Hao, Jelske Dijkstra, Matt Coler, Martijn Wieling

Multilingual and code-switching ASR challenges for low resource Indian languages

Recently, there is increasing interest in multilingual automatic speech recognition (ASR) where a speech recognition system caters to multiple low resource languages by taking advantage of low amounts of labeled corpora in multiple…

Computation and Language · Computer Science 2021-09-21 Anuj Diwan, Rakesh Vaideeswaran, Sanket Shah, Ankita Singh, Srinivasa Raghavan, Shreya Khare, Vinit Unni, Saurabh Vyas, Akash Rajpuria, Chiranjeevi Yarra, Ashish Mittal, Prasanta Kumar Ghosh, Preethi Jyothi, Kalika Bali, Vivek Seshadri, Sunayana Sitaram, Samarth Bharadwaj, Jai Nanavati, Raoul Nanavati, Karthik Sankaranarayanan, Tejaswi Seeram, Basil Abraham

Code-switching Speech Recognition Under the Lens: Model- and Data-Centric Perspectives

Code-switching automatic speech recognition (CS-ASR) presents unique challenges due to language confusion introduced by spontaneous intra-sentence switching and accent bias that blurs the phonetic boundaries. Although the constituent…

Audio and Speech Processing · Electrical Eng. & Systems 2026-03-18 Hexin Liu, Haoyang Zhang, Qiquan Zhang, Xiangyu Zhang, Dongyuan Shi, Eng Siong Chng, Haizhou Li

Code Switched and Code Mixed Speech Recognition for Indic languages

Training multilingual automatic speech recognition (ASR) systems is challenging because acoustic and lexical information is typically language specific. Training a multilingual system for Indic languages is even tougher due to lack of…

Computation and Language · Computer Science 2022-06-14 Harveen Singh Chadha, Priyanshi Shah, Ankur Dhuriya, Neeraj Chhimwal, Anirudh Gupta, Vivek Raghavan

Using heterogeneity in semi-supervised transcription hypotheses to improve code-switched speech recognition

Modeling code-switched speech is an important problem in automatic speech recognition (ASR). Labeled code-switched data are rare, so monolingual data are often used to model code-switched speech. These monolingual data may be more closely…

Computation and Language · Computer Science 2021-06-16 Andrew Slottje, Shannon Wotherspoon, William Hartmann, Matthew Snover, Owen Kimball

Building a Unified Code-Switching ASR System for South African Languages

We present our first efforts towards building a single multilingual automatic speech recognition (ASR) system that can process code-switching (CS) speech in five languages spoken within the same population. This contrasts with related prior…

Computation and Language · Computer Science 2018-07-31 Emre Yılmaz, Astik Biswas, Ewald van der Westhuizen, Febe de Wet, Thomas Niesler