Related papers: A resource-based Korean morphological annotation s…

Morphological annotation of Korean with Directly Maintainable Resources

This article describes an exclusively resource-based method of morphological annotation of written Korean text. Korean is an agglutinative language. Our annotator is designed to process text before the operation of a syntactic parser. In…

Computation and Language · Computer Science 2007-11-22 Ivan Berlocher, Hyun-Gue Huh, Eric Laporte, Jee-Sun Nam

Extracting Arguments from Korean Question and Command: An Annotated Corpus for Structured Paraphrasing

Intention identification is a core issue in dialog management. However, due to the non-canonicality of the spoken language, it is difficult to extract the content automatically from the conversation-style utterances. This is much more…

Computation and Language · Computer Science 2019-07-10 Won Ik Cho, Young Ki Moon, Woo Hyun Kang, Nam Soo Kim

Rich Character-Level Information for Korean Morphological Analysis and Part-of-Speech Tagging

Due to the fact that Korean is a highly agglutinative, character-rich language, previous work on Korean morphological analysis typically employs the use of sub-character features known as graphemes or otherwise utilizes comprehensive prior…

Computation and Language · Computer Science 2018-06-29 Andrew Matteson, Chanhee Lee, Young-Bum Kim, Heuiseok Lim

Chart-driven Connectionist Categorial Parsing of Spoken Korean

While most of the speech and natural language systems which were developed for English and other Indo-European languages neglect the morphological processing and integrate speech and natural language at the word level, for the agglutinative…

cmp-lg · Computer Science 2008-02-03 WonIl Lee, Geunbae Lee, Jong-Hyeok Lee

Yet Another Format of Universal Dependencies for Korean

In this study, we propose a morpheme-based scheme for Korean dependency parsing and adopt the proposed scheme to Universal Dependencies. We present the linguistic rationale that illustrates the motivation and the necessity of adopting the…

Computation and Language · Computer Science 2022-09-21 Yige Chen, Eunkyul Leah Jo, Yundong Yao, KyungTae Lim, Miikka Silfverberg, Francis M. Tyers, Jungyeul Park

A Syllable-based Technique for Word Embeddings of Korean Words

Word embedding has become a fundamental component to many NLP tasks such as named entity recognition and machine translation. However, popular models that learn such embeddings are unaware of the morphology of words, so it is not directly…

Computation and Language · Computer Science 2017-08-08 Sanghyuk Choi, Taeuk Kim, Jinseok Seol, Sang-goo Lee

Morphological Processing of Low-Resource Languages: Where We Are and What's Next

Automatic morphological processing can aid downstream natural language processing applications, especially for low-resource languages, and assist language documentation efforts for endangered languages. Having long been multilingual, the…

Computation and Language · Computer Science 2022-03-18 Adam Wiemerslage, Miikka Silfverberg, Changbing Yang, Arya D. McCarthy, Garrett Nicolai, Eliana Colunga, Katharina Kann

Korean Named Entity Recognition Based on Language-Specific Features

In the paper, we propose a novel way of improving named entity recognition in the Korean language using its language-specific features. While the field of named entity recognition has been studied extensively in recent years, the mechanism…

Computation and Language · Computer Science 2024-05-15 Yige Chen, KyungTae Lim, Jungyeul Park

An Annotation Scheme for Free Word Order Languages

We describe an annotation scheme and a tool developed for creating linguistically annotated corpora for non-configurational languages. Since the requirements for such a formalism differ from those posited for configurational languages,…

cmp-lg · Computer Science 2008-02-03 Wojciech Skut, Brigitte Krenn, Thorsten Brants, Hans Uszkoreit

K-UniMorph: Korean Universal Morphology and its Feature Schema

We present in this work a new Universal Morphology dataset for Korean. Previously, the Korean language has been underrepresented in the field of morphological paradigms amongst hundreds of diverse world languages. Hence, we propose this…

Computation and Language · Computer Science 2023-05-18 Eunkyul Leah Jo, Kyuwon Kim, Xihan Wu, KyungTae Lim, Jungyeul Park, Chulwoo Park

Enhancing Korean Dependency Parsing with Morphosyntactic Features

This paper introduces UniDive for Korean, an integrated framework that bridges Universal Dependencies (UD) and Universal Morphology (UniMorph) to enhance the representation and processing of Korean morphosyntax. Korean's rich inflectional…

Computation and Language · Computer Science 2025-03-28 Jungyeul Park, Yige Chen, Kyuwon Kim, KyungTae Lim, Chulwoo Park

Phonological modeling for continuous speech recognition in Korean

A new scheme to represent phonological changes during continuous speech recognition is suggested. A phonological tag coupled with its morphological tag is designed to represent the conditions of Korean phonological changes. A pairwise…

cmp-lg · Computer Science 2008-02-03 WonIl Lee, Geunbae Lee, Jong-Hyeok Lee

A Straightforward Approach to Morphological Analysis and Synthesis

In this paper we present a lexicon-based approach to the problem of morphological processing. Full-form words, lemmas and grammatical tags are interconnected in a DAWG. Thus, the process of analysis/synthesis is reduced to a search in the…

Computation and Language · Computer Science 2007-05-23 Kyriakos N. Sgarbas, Nikos D. Fakotakis, George K. Kokkinakis

Word segmentation granularity in Korean

This paper describes word segmentation granularity in Korean language processing. From a word separated by blank space, which is termed an eojeol, to a sequence of morphemes in Korean, there are multiple possible levels of word…

Computation and Language · Computer Science 2023-09-08 Jungyeul Park, Mija Kim

Foundations and Evaluations in NLP

This memoir explores two fundamental aspects of Natural Language Processing (NLP): the creation of linguistic resources and the evaluation of NLP system performance. Over the past decade, my work has focused on developing a morpheme-based…

Computation and Language · Computer Science 2026-02-16 Jungyeul Park

Integrated speech and morphological processing in a connectionist continuous speech understanding for Korean

A new tightly coupled speech and natural language integration model is presented for a TDNN-based continuous possibly large vocabulary speech recognition system for Korean. Unlike popular n-best techniques developed for integrating mainly…

cmp-lg · Computer Science 2008-02-03 Geunbae Lee, Jong-Hyeok Lee

Giving Space to Your Message: Assistive Word Segmentation for the Electronic Typing of Digital Minorities

For readability and disambiguation of the written text, appropriate word segmentation is recommended for documentation, and it also holds for the digitized texts. If the language is agglutinative while far from scriptio continua, for…

Computation and Language · Computer Science 2021-05-05 Won Ik Cho, Sung Jun Cheon, Woo Hyun Kang, Ji Won Kim, Nam Soo Kim

Reference and Document Aware Semantic Evaluation Methods for Korean Language Summarization

Text summarization refers to the process that generates a shorter form of text from the source document preserving salient information. Many existing works for text summarization are generally evaluated by using recall-oriented understudy…

Computation and Language · Computer Science 2020-11-03 Dongyub Lee, Myeongcheol Shin, Taesun Whang, Seungwoo Cho, Byeongil Ko, Daniel Lee, Eunggyun Kim, Jaechoon Jo

Knowledge Graph-Augmented Korean Generative Commonsense Reasoning

Generative commonsense reasoning refers to the task of generating acceptable and logical assumptions about everyday situations based on commonsense understanding. By utilizing an existing dataset such as Korean CommonGen, language…

Computation and Language · Computer Science 2023-06-27 Dahyun Jung, Jaehyung Seo, Jaewook Lee, Chanjun Park, Heuiseok Lim

Open Korean Corpora: A Practical Report

Korean is often referred to as a low-resource language in the research community. While this claim is partially true, it is also because the availability of resources is inadequately advertised and curated. This work curates and reviews a…

Computation and Language · Computer Science 2023-05-17 Won Ik Cho, Sangwhan Moon, Youngsook Song