Related papers: Integrated speech and morphological processing in …

Phoneme-level speech and natural language integration for agglutinative languages

A new tightly coupled speech and natural language integration model is presented for a TDNN-based large vocabulary continuous speech recognition system. Unlike the popular n-best techniques developed for integrating mainly HMM-based speech…

cmp-lg · Computer Science 2008-02-03 Geunbae Lee , Jong-Hyeok Lee , Kyunghee Kim

SKOPE: A connectionist/symbolic architecture of spoken Korean processing

Spoken language processing requires speech and natural language integration. Moreover, spoken Korean calls for unique processing methodology due to its linguistic characteristics. This paper presents SKOPE, a connectionist/symbolic spoken…

cmp-lg · Computer Science 2008-02-03 Geunbae Lee , Jong-Hyeok Lee

Chart-driven Connectionist Categorial Parsing of Spoken Korean

While most of the speech and natural language systems which were developed for English and other Indo-European languages neglect the morphological processing and integrate speech and natural language at the word level, for the agglutinative…

cmp-lg · Computer Science 2008-02-03 WonIl Lee , Geunbae Lee , Jong-Hyeok Lee

Phonological modeling for continuous speech recognition in Korean

A new scheme to represent phonological changes during continuous speech recognition is suggested. A phonological tag coupled with its morphological tag is designed to represent the conditions of Korean phonological changes. A pairwise…

cmp-lg · Computer Science 2008-02-03 WonIl Lee , Geunbae Lee , Jong-Hyeok Lee

A Syllable-based Technique for Word Embeddings of Korean Words

Word embedding has become a fundamental component to many NLP tasks such as named entity recognition and machine translation. However, popular models that learn such embeddings are unaware of the morphology of words, so it is not directly…

Computation and Language · Computer Science 2017-08-08 Sanghyuk Choi , Taeuk Kim , Jinseok Seol , Sang-goo Lee

Handling Korean Out-of-Vocabulary Words with Phoneme Representation Learning

In this study, we introduce KOPL, a novel framework for handling Korean OOV words with Phoneme representation Learning. Our work is based on the linguistic property of Korean as a phonemic script, the high correlation between phonemes and…

Computation and Language · Computer Science 2025-07-08 Nayeon Kim , Eojin Jeon , Jun-Hyung Park , SangKeun Lee

A Fast-Converged Acoustic Modeling for Korean Speech Recognition: A Preliminary Study on Time Delay Neural Network

In this paper, a time delay neural network (TDNN) based acoustic model is proposed to implement a fast-converged acoustic modeling for Korean speech recognition. The TDNN has an advantage in fast-convergence where the amount of training…

Computation and Language · Computer Science 2018-07-17 Hosung Park , Donghyun Lee , Minkyu Lim , Yoseb Kang , Juneseok Oh , Ji-Hwan Kim

Integrating HMM-Based Speech Recognition With Direct Manipulation In A Multimodal Korean Natural Language Interface

This paper presents an HMM-based speech recognition engine and its integration into direct manipulation interfaces for a Korean document editor. Speech recognition can reduce typical tedious and repetitive actions which are inevitable in…

cmp-lg · Computer Science 2016-08-31 Geunbae Lee , Jong-Hyeok Lee , Sangeok Kim

Improving Korean NLP Tasks with Linguistically Informed Subword Tokenization and Sub-character Decomposition

We introduce a morpheme-aware subword tokenization method that utilizes sub-character decomposition to address the challenges of applying Byte Pair Encoding (BPE) to Korean, a language characterized by its rich morphology and unique writing…

Computation and Language · Computer Science 2023-11-08 Taehee Jeon , Bongseok Yang , Changhwan Kim , Yoonseob Lim

DNN-based Speaker Embedding Using Subjective Inter-speaker Similarity for Multi-speaker Modeling in Speech Synthesis

This paper proposes novel algorithms for speaker embedding using subjective inter-speaker similarity based on deep neural networks (DNNs). Although conventional DNN-based speaker embedding such as a $d$-vector can be applied to…

Audio and Speech Processing · Electrical Eng. & Systems 2019-07-22 Yuki Saito , Shinnosuke Takamichi , Hiroshi Saruwatari

Learning pronunciation from a foreign language in speech synthesis networks

Although there are more than 6,500 languages in the world, the pronunciations of many phonemes sound similar across the languages. When people learn a foreign language, their pronunciation often reflects their native language's…

Computation and Language · Computer Science 2020-06-25 Younggun Lee , Suwon Shon , Taesu Kim

Rich Character-Level Information for Korean Morphological Analysis and Part-of-Speech Tagging

Due to the fact that Korean is a highly agglutinative, character-rich language, previous work on Korean morphological analysis typically employs the use of sub-character features known as graphemes or otherwise utilizes comprehensive prior…

Computation and Language · Computer Science 2018-06-29 Andrew Matteson , Chanhee Lee , Young-Bum Kim , Heuiseok Lim

Morphological annotation of Korean with Directly Maintainable Resources

This article describes an exclusively resource-based method of morphological annotation of written Korean text. Korean is an agglutinative language. Our annotator is designed to process text before the operation of a syntactic parser. In…

Computation and Language · Computer Science 2007-11-22 Ivan Berlocher , Hyun-Gue Huh , Eric Laporte , Jee-Sun Nam

Incorporating L2 Phonemes Using Articulatory Features for Robust Speech Recognition

The limited availability of non-native speech datasets presents a major challenge in automatic speech recognition (ASR) to narrow the performance gap between native and non-native speakers. To address this, the focus of this study is on the…

Computation and Language · Computer Science 2023-06-06 Jisung Wang , Haram Lee , Myungwoo Oh

Foundations and Evaluations in NLP

This memoir explores two fundamental aspects of Natural Language Processing (NLP): the creation of linguistic resources and the evaluation of NLP system performance. Over the past decade, my work has focused on developing a morpheme-based…

Computation and Language · Computer Science 2026-02-16 Jungyeul Park

Integrated Eojeol Embedding for Erroneous Sentence Classification in Korean Chatbots

This paper attempts to analyze the Korean sentence classification system for a chatbot. Sentence classification is the task of classifying an input sentence based on predefined categories. However, spelling or space error contained in the…

Computation and Language · Computer Science 2021-06-08 DongHyun Choi , IlNam Park , Myeong Cheol Shin , EungGyun Kim , Dong Ryeol Shin

Deep Speech: Scaling up end-to-end speech recognition

We present a state-of-the-art speech recognition system developed using end-to-end deep learning. Our architecture is significantly simpler than traditional speech systems, which rely on laboriously engineered processing pipelines; these…

Computation and Language · Computer Science 2014-12-23 Awni Hannun , Carl Case , Jared Casper , Bryan Catanzaro , Greg Diamos , Erich Elsen , Ryan Prenger , Sanjeev Satheesh , Shubho Sengupta , Adam Coates , Andrew Y. Ng

Efficient and Effective Vocabulary Expansion Towards Multilingual Large Language Models

This report introduces EEVE-Korean-v1.0, a Korean adaptation of large language models that exhibit remarkable capabilities across English and Korean text understanding. Building on recent highly capable but English-centric LLMs,…

Computation and Language · Computer Science 2024-02-23 Seungduk Kim , Seungtaek Choi , Myeongho Jeong

Korean Named Entity Recognition Based on Language-Specific Features

In the paper, we propose a novel way of improving named entity recognition in the Korean language using its language-specific features. While the field of named entity recognition has been studied extensively in recent years, the mechanism…

Computation and Language · Computer Science 2024-05-15 Yige Chen , KyungTae Lim , Jungyeul Park

Early Stage LM Integration Using Local and Global Log-Linear Combination

Sequence-to-sequence models with an implicit alignment mechanism (e.g. attention) are closing the performance gap towards traditional hybrid hidden Markov models (HMM) for the task of automatic speech recognition. One important factor to…

Audio and Speech Processing · Electrical Eng. & Systems 2020-05-21 Wilfried Michel , Ralf Schlüter , Hermann Ney