Related papers: Context-Sensitive Malicious Spelling Error Correct…

Context-sensitive Spelling Correction Using Google Web 1T 5-Gram Information

In computing, spell checking is the process of detecting and sometimes providing spelling suggestions for incorrectly spelled words in a text. Basically, a spell checker is a computer program that uses a dictionary of words to perform spell…

Computation and Language · Computer Science 2012-04-27 Youssef Bassil, Mohammad Alwani

Misspelling Correction with Pre-trained Contextual Language Model

Spelling irregularities, known now as spelling mistakes, have been found for several centuries. As humans, we are able to understand most of the misspelled words based on their location in the sentence, perceived pronunciation, and context.…

Computation and Language · Computer Science 2021-01-12 Yifei Hu, Xiaonan Jing, Youlim Ko, Julia Taylor Rayz

CATBERT: Context-Aware Tiny BERT for Detecting Social Engineering Emails

Targeted phishing emails are on the rise and facilitate the theft of billions of dollars from organizations a year. While malicious signals from attached files or malicious URLs in emails can be detected by conventional malware signatures…

Cryptography and Security · Computer Science 2020-10-08 Younghoo Lee, Joshua Saxe, Richard Harang

Context-aware Stand-alone Neural Spelling Correction

Existing natural language processing systems are vulnerable to noisy inputs resulting from misspellings. On the contrary, humans can easily infer the corresponding correct words from their misspellings and surrounding context. Inspired by…

Computation and Language · Computer Science 2025-05-19 Xiangci Li, Hairong Liu, Liang Huang

Misspellings in Natural Language Processing: A survey

This survey provides an overview of the challenges of misspellings in natural language processing (NLP). While often unintentional, misspellings have become ubiquitous in digital communication, especially with the proliferation of Web 2.0,…

Computation and Language · Computer Science 2025-10-27 Gianluca Sperduti, Alejandro Moreo

Contextual Multilingual Spellchecker for User Queries

Spellchecking is one of the most fundamental and widely used search features. Correcting incorrectly spelled user queries not only enhances the user experience but is expected by the user. However, most widely available spellchecking…

Computation and Language · Computer Science 2024-04-16 Sanat Sharma, Josep Valls-Vargas, Tracy Holloway King, Francois Guerin, Chirag Arora

Abuse is Contextual, What about NLP? The Role of Context in Abusive Language Annotation and Detection

The datasets most widely used for abusive language detection contain lists of messages, usually tweets, that have been manually judged as abusive or not by one or more annotators, with the annotation performed at message level. In this…

Computation and Language · Computer Science 2021-03-30 Stefano Menini, Alessio Palmero Aprosio, Sara Tonelli

Domain specificity and data efficiency in typo tolerant spell checkers: the case of search in online marketplaces

Typographical errors are a major source of frustration for visitors of online marketplaces. Because of the domain-specific nature of these marketplaces and the very short queries users tend to search for, traditional spell checking solutions…

Machine Learning · Computer Science 2023-08-07 Dayananda Ubrangala, Juhi Sharma, Ravi Prasad Kondapalli, Kiran R, Amit Agarwala, Laurent Boué

A context sensitive real-time Spell Checker with language adaptability

We present a novel language adaptable spell checking system which detects spelling errors and suggests context sensitive corrections in real-time. We show that our system can be extended to new languages with minimal language-specific…

Computation and Language · Computer Science 2019-10-25 Prabhakar Gupta

Machine Learning Driven Smishing Detection Framework for Mobile Security

The increasing reliance on smartphones for communication, financial transactions, and personal data management has made them prime targets for cyberattacks, particularly smishing, a sophisticated variant of phishing conducted via SMS.…

Cryptography and Security · Computer Science 2024-12-16 Diksha Goel, Hussain Ahmad, Ankit Kumar Jain, Nikhil Kumar Goel

Misspelling Oblivious Word Embeddings

In this paper we present a method to learn word embeddings that are resilient to misspellings. Existing word embeddings have limited applicability to malformed texts, which contain a non-negligible amount of out-of-vocabulary words. We…

Computation and Language · Computer Science 2019-05-24 Bora Edizel, Aleksandra Piktus, Piotr Bojanowski, Rui Ferreira, Edouard Grave, Fabrizio Silvestri

Using Lexical Features for Malicious URL Detection -- A Machine Learning Approach

Malicious websites are responsible for a majority of the cyber-attacks and scams today. Malicious URLs are delivered to unsuspecting users via email, text messages, pop-ups or advertisements. Clicking on or crawling such URLs can result in…

Cryptography and Security · Computer Science 2019-10-15 Apoorva Joshi, Levi Lloyd, Paul Westin, Srini Seethapathy

Enriching Abusive Language Detection with Community Context

Uses of pejorative expressions can be benign or actively empowering. When models for abuse detection misclassify these expressions as derogatory, they inadvertently censor productive conversations held by marginalized groups. One way to…

Computation and Language · Computer Science 2022-06-20 Jana Kurrek, Haji Mohammad Saleem, Derek Ruths

AMSI-Based Detection of Malicious PowerShell Code Using Contextual Embeddings

PowerShell is a command-line shell, supporting a scripting language. It is widely used in organizations for configuration management and task automation but is also increasingly used by cybercriminals for launching cyberattacks against…

Cryptography and Security · Computer Science 2019-09-20 Amir Rubin, Shay Kels, Danny Hendler

Context Biasing for Pronunciation-Orthography Mismatch in Automatic Speech Recognition

Neural sequence-to-sequence systems deliver state-of-the-art performance for automatic speech recognition. When using appropriate modeling units, e.g., byte-pair encoding, these systems are in principle open vocabulary systems. In practice,…

Computation and Language · Computer Science 2026-03-05 Christian Huber, Alexander Waibel

Cross-Lingual Contextual Word Embeddings Mapping With Multi-Sense Words In Mind

Recent work in cross-lingual contextual word embedding learning cannot handle multi-sense words well. In this work, we explore the characteristics of contextual word embeddings and show the link between contextual word embeddings and word…

Computation and Language · Computer Science 2019-09-20 Zheng Zhang, Ruiqing Yin, Jun Zhu, Pierre Zweigenbaum

A Bayesian hybrid method for context-sensitive spelling correction

Two classes of methods have been shown to be useful for resolving lexical ambiguity. The first relies on the presence of particular words within some distance of the ambiguous target word; the second uses the pattern of words and…

cmp-lg · Computer Science 2008-02-03 Andrew R. Golding

Applying Winnow to Context-Sensitive Spelling Correction

Multiplicative weight-updating algorithms such as Winnow have been studied extensively in the COLT literature, but only recently have people started to use them in applications. In this paper, we apply a Winnow-based algorithm to a task in…

cmp-lg · Computer Science 2008-02-03 Andrew R. Golding, Dan Roth

Selective Demonstration Retrieval for Improved Implicit Hate Speech Detection

Hate speech detection is a crucial area of research in natural language processing, essential for ensuring online community safety. However, detecting implicit hate speech, where harmful intent is conveyed in subtle or indirect ways,…

Computation and Language · Computer Science 2025-04-17 Yumin Kim , Hwanhee Lee

Combining Trigram-based and Feature-based Methods for Context-Sensitive Spelling Correction

This paper addresses the problem of correcting spelling errors that result in valid, though unintended words (such as "peace" and "piece", or "quiet" and "quite") and also the problem of correcting particular word usage errors (such…

cmp-lg · Computer Science 2008-02-03 Andrew R. Golding, Yves Schabes