Related papers: Numerically Grounded Language Models for Semantic …
Assisted text input techniques can save time and effort and improve text quality. In this paper, we investigate how grounded and conditional extensions to standard neural language models can bring improvements in the tasks of word…
The paper presents an approach to semantic grounding of language models (LMs) that conceptualizes the LM as a conditional model generating text given a desired semantic message formalized as a set of entity-relationship triples. It embeds…
We perform neural machine translation of sentence fragments in order to create large amounts of training data for English grammatical error correction. Our method aims at simulating mistakes made by second language learners, and produces a…
Representational Similarity Analysis is a method from cognitive neuroscience, which helps in comparing representations from two different sources of data. In this paper, we propose using Representational Similarity Analysis to probe the…
Phrase-based statistical machine translation (SMT) systems have previously been used for the task of grammatical error correction (GEC) to achieve state-of-the-art accuracy. The superiority of SMT systems comes from their ability to learn…
Grammatical Error Correction (GEC) should not focus only on high accuracy of corrections but also on interpretability for language learning. However, existing neural-based GEC models mainly aim at improving accuracy, and their…
The vocabulary mismatch problem is a long-standing problem in information retrieval. Semantic matching holds the promise of solving the problem. Recent advances in language technology have given rise to unsupervised neural models for…
Automatic spelling and grammatical correction systems are one of the most widely used tools within natural language applications. In this thesis, we assume the task of error correction as a type of monolingual machine translation where the…
Recent work on Grammatical Error Correction (GEC) has highlighted the importance of language modeling in that it is certainly possible to achieve good performance by comparing the probabilities of the proposed edits. At the same time,…
We propose a nested recurrent neural network (nested RNN) model for English spelling error correction and generate pseudo data based on phonetic similarity to train it. The model fuses orthographic information and context as a whole and is…
Spelling irregularities, known now as spelling mistakes, have been found for several centuries. As humans, we are able to understand most of the misspelled words based on their location in the sentence, perceived pronunciation, and context.…
In this paper, we explore the capacity of a language model-based method for grammatical error detection in detail. We first show that 5 to 10% of training data are enough for a BERT-based error detection method to achieve performance…
In this paper, we introduce a contextual grounding approach that captures the context in corresponding text entities and image regions to improve the grounding accuracy. Specifically, the proposed architecture accepts pre-trained text token…
We treat grammatical error correction (GEC) as a classification problem in this study, where for different types of errors, a target word is identified, and the classifier predicts the correct word form from a set of possible choices. We…
Language models have demonstrated remarkable performance in solving reasoning tasks; however, even the strongest models still occasionally make reasoning mistakes. Recently, there has been active research aimed at improving reasoning…
Large Vision-Language Models (LVLMs) offer remarkable benefits for a variety of vision-language tasks. However, a challenge hindering their application in real-world scenarios, particularly regarding safety, robustness, and reliability, is…
Grounded claim factuality checking is important for large language model (LLM) applications such as retrieval-augmented generation, as it helps users assess the correctness of generated outputs. Existing metrics using entailment classifiers…
Language Models (LMs) have become widely used in software engineering, especially for tasks such as code generation, where they are referred to as code LMs. These models have proven effective in generating code, making it easier for…
Identifying and correcting grammatical errors in the text written by non-native writers has received increasing attention in recent years. Although a number of annotated corpora have been established to facilitate data-driven grammatical…
Off-the-shelf pre-trained language models have become the de facto standard in NLP pipelines for a multitude of downstream tasks. However, the inability of these models to properly encode numerals limits their performance on tasks requiring…