Related papers: Towards standardizing Korean Grammatical Error Cor…

KoGEC: Korean Grammatical Error Correction with Pre-trained Translation Models

This research introduces KoGEC, a Korean Grammatical Error Correction system using pre-trained translation models. We fine-tuned NLLB (No Language Left Behind) models for Korean GEC, comparing their performance against large language…

Computation and Language · Computer Science 2025-06-16 Taeeun Kim, Semin Jeong, Youngsook Song

A Simple Recipe for Multilingual Grammatical Error Correction

This paper presents a simple recipe to train state-of-the-art multilingual Grammatical Error Correction (GEC) models. We achieve this by first proposing a language-agnostic method to generate a large number of synthetic examples. The second…

Computation and Language · Computer Science 2022-08-10 Sascha Rothe, Jonathan Mallinson, Eric Malmi, Sebastian Krause, Aliaksei Severyn

ErAConD: Error Annotated Conversational Dialog Dataset for Grammatical Error Correction

Currently available grammatical error correction (GEC) datasets are compiled using well-formed written text, limiting the applicability of these datasets to other domains such as informal writing and dialog. In this paper, we present a…

Computation and Language · Computer Science 2025-08-27 Xun Yuan, Derek Pham, Sam Davidson, Zhou Yu

FlaCGEC: A Chinese Grammatical Error Correction Dataset with Fine-grained Linguistic Annotation

Chinese Grammatical Error Correction (CGEC) has been attracting growing attention from researchers recently. In spite of the fact that multiple CGEC datasets have been developed to support the research, these datasets lack the ability to…

Computation and Language · Computer Science 2023-11-10 Hanyue Du, Yike Zhao, Qingyuan Tian, Jiani Wang, Lei Wang, Yunshi Lan, Xuesong Lu

ChatLang-8: An LLM-Based Synthetic Data Generation Framework for Grammatical Error Correction

We explore and improve the capabilities of LLMs to generate data for grammatical error correction (GEC). When merely producing parallel sentences, their patterns are too simplistic to be valuable as a corpus. To address this issue, we…

Computation and Language · Computer Science 2024-06-12 Jeiyoon Park, Chanjun Park, Heuiseok Lim

Construction of a Quality Estimation Dataset for Automatic Evaluation of Japanese Grammatical Error Correction

In grammatical error correction (GEC), automatic evaluation is an important factor for research and development of GEC systems. Previous studies on automatic evaluation have demonstrated that quality estimation models built from datasets…

Computation and Language · Computer Science 2022-01-21 Daisuke Suzuki, Yujin Takahashi, Ikumi Yamashita, Taichi Aida, Tosho Hirasawa, Michitaka Nakatsuji, Masato Mita, Mamoru Komachi

Grammatical Error Correction in Low-Resource Scenarios

Grammatical error correction in English is a long-studied problem with many existing systems and datasets. However, there has been only limited research on error correction of other languages. In this paper, we present a new dataset…

Computation and Language · Computer Science 2019-10-17 Jakub Náplava, Milan Straka

Corrections Meet Explanations: A Unified Framework for Explainable Grammatical Error Correction

Grammatical Error Correction (GEC) faces a critical challenge concerning explainability, notably when GEC systems are designed for language learners. Existing research predominantly focuses on explaining grammatical errors extracted in…

Computation and Language · Computer Science 2025-02-24 Jingheng Ye, Shang Qin, Yinghui Li, Hai-Tao Zheng, Shen Wang, Qingsong Wen

Chinese Grammatical Error Correction: A Survey

Chinese Grammatical Error Correction (CGEC) is a critical task in Natural Language Processing, addressing the growing demand for automated writing assistance in both second-language (L2) and native (L1) Chinese writing. While L2 learners…

Computation and Language · Computer Science 2025-04-02 Mengyang Qiu, Qingyu Gao, Linxuan Yang, Yang Gu, Tran Minh Nguyen, Zihao Huang, Jungyeul Park

Efficient Grammatical Error Correction Via Multi-Task Training and Optimized Training Schedule

Progress in neural grammatical error correction (GEC) is hindered by the lack of annotated training data. Sufficient amounts of high-quality manually annotated data are not available, so recent research has relied on generating synthetic…

Computation and Language · Computer Science 2023-11-21 Andrey Bout, Alexander Podolskiy, Sergey Nikolenko, Irina Piontkovskaya

Synthetic Data Generation for Grammatical Error Correction with Tagged Corruption Models

Synthetic data generation is widely known to boost the accuracy of neural grammatical error correction (GEC) systems, but existing methods often lack diversity or are too simplistic to generate the broad range of grammatical errors made by…

Computation and Language · Computer Science 2021-05-28 Felix Stahlberg, Shankar Kumar

Enriching the Korean Learner Corpus with Multi-reference Annotations and Rubric-Based Scoring

Despite growing global interest in Korean language education, there remains a significant lack of learner corpora tailored to Korean L2 writing. To address this gap, we enhance the KoLLA Korean learner corpus by adding multiple grammatical…

Computation and Language · Computer Science 2025-05-02 Jayoung Song, KyungTae Lim, Jungyeul Park

Multi-head Sequence Tagging Model for Grammatical Error Correction

To solve the Grammatical Error Correction (GEC) problem, a mapping between a source sequence and a target one is needed, where the two differ only on few spans. For this reason, the attention has been shifted to the non-autoregressive or…

Computation and Language · Computer Science 2024-10-23 Kamal Al-Sabahi, Kang Yang, Wangwang Liu, Guanyu Jiang, Xian Li, Ming Yang

Grammatical Error Correction: A Survey of the State of the Art

Grammatical Error Correction (GEC) is the task of automatically detecting and correcting errors in text. The task not only includes the correction of grammatical errors, such as missing prepositions and mismatched subject-verb agreement,…

Computation and Language · Computer Science 2023-12-05 Christopher Bryant, Zheng Yuan, Muhammad Reza Qorib, Hannan Cao, Hwee Tou Ng, Ted Briscoe

FCGEC: Fine-Grained Corpus for Chinese Grammatical Error Correction

Grammatical Error Correction (GEC) has been broadly applied in automatic correction and proofreading system recently. However, it is still immature in Chinese GEC due to limited high-quality data from native speakers in terms of category…

Computation and Language · Computer Science 2023-08-08 Lvxiaowei Xu, Jianwang Wu, Jiawei Peng, Jiayu Fu, Ming Cai

Grammatical Error Correction via Mixed-Grained Weighted Training

The task of Grammatical Error Correction (GEC) aims to automatically correct grammatical errors in natural texts. Almost all previous works treat annotated training data equally, but inherent discrepancies in data are neglected. In this…

Computation and Language · Computer Science 2023-11-27 Jiahao Li, Quan Wang, Chiwei Zhu, Zhendong Mao, Yongdong Zhang

A Comprehensive Survey of Grammar Error Correction

Grammar error correction (GEC) is an important application aspect of natural language processing techniques. The past decade has witnessed significant progress achieved in GEC for the sake of increasing popularity of machine learning and…

Computation and Language · Computer Science 2020-05-15 Yu Wang, Yuelin Wang, Jie Liu, Zhuo Liu

Towards the Development of Balanced Synthetic Data for Correcting Grammatical Errors in Arabic: An Approach Based on Error Tagging Model and Synthetic Data Generating Model

Synthetic data generation is widely recognized as a way to enhance the quality of neural grammatical error correction (GEC) systems. However, current approaches often lack diversity or are too simplistic to generate the wide range of…

Computation and Language · Computer Science 2025-02-11 Ahlam Alrehili, Areej Alhothali

Evaluation of large-scale synthetic data for Grammar Error Correction

Grammar Error Correction (GEC) mainly relies on the availability of high quality of large amount of synthetic parallel data of grammatically correct and erroneous sentence pairs. The quality of the synthetic data is evaluated on how well the…

Computation and Language · Computer Science 2022-11-01 Vanya Bannihatti Kumar

MuCGEC: a Multi-Reference Multi-Source Evaluation Dataset for Chinese Grammatical Error Correction

This paper presents MuCGEC, a multi-reference multi-source evaluation dataset for Chinese Grammatical Error Correction (CGEC), consisting of 7,063 sentences collected from three Chinese-as-a-Second-Language (CSL) learner sources. Each…

Computation and Language · Computer Science 2022-05-05 Yue Zhang, Zhenghua Li, Zuyi Bao, Jiacheng Li, Bo Zhang, Chen Li, Fei Huang, Min Zhang