Related papers: Automatic Textual Normalization for Hate Speech De…

Hate Speech Detection on Vietnamese Social Media Text using the Bi-GRU-LSTM-CNN Model

In recent years, Hate Speech Detection has become one of the interesting fields in natural language processing or computational linguistics. In this paper, we present the description of our system to solve this problem at the VLSP shared…

Computation and Language · Computer Science · 2019-12-24 · Tin Van Huynh, Vu Duc Nguyen, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen, Anh Gia-Tuan Nguyen

Non-Standard Vietnamese Word Detection and Normalization for Text-to-Speech

Converting written texts into their spoken forms is an essential problem in any text-to-speech (TTS) system. However, building an effective text normalization solution for a real-world TTS system faces two main challenges: (1) the semantic…

Computation and Language · Computer Science · 2022-09-08 · Huu-Tien Dang, Thi-Hai-Yen Vuong, Xuan-Hieu Phan

ViTHSD: Exploiting Hatred by Targets for Hate Speech Detection on Vietnamese Social Media Texts

The growth of social networks makes toxic content spread rapidly. Hate speech detection is a task to help decrease the number of harmful comments. With the diversity in the hate speech created by users, it is necessary to interpret the hate…

Computation and Language · Computer Science · 2025-02-11 · Cuong Nhat Vo, Khanh Bao Huynh, Son T. Luu, Trong-Hop Do

Hate Speech Detection on Vietnamese Social Media Text using the Bidirectional-LSTM Model

In this paper, we describe our system which participates in the shared task of Hate Speech Detection on Social Networks of VLSP 2019 evaluation campaign. We are provided with the pre-labeled dataset and an unlabeled dataset for social media…

Computation and Language · Computer Science · 2019-11-12 · Hang Thi-Thuy Do, Huy Duc Huynh, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen, Anh Gia-Tuan Nguyen

Comparison Between Traditional Machine Learning Models And Neural Network Models For Vietnamese Hate Speech Detection

Hate-speech detection on social network language has recently become one of the main research fields due to the spread of social networks like Facebook and Twitter. In Vietnam, the threat of offensive language and harassment causes bad impacts…

Computation and Language · Computer Science · 2020-09-29 · Son T. Luu, Hung P. Nguyen, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen

A Weakly Supervised Data Labeling Framework for Machine Lexical Normalization in Vietnamese Social Media

This study introduces an innovative automatic labeling framework to address the challenges of lexical normalization in social media texts for low-resource languages like Vietnamese. Social media data is rich and diverse, but the evolving…

Computation and Language · Computer Science · 2024-10-01 · Dung Ha Nguyen, Anh Thi Hoang Nguyen, Kiet Van Nguyen

Leveraging Intra-User and Inter-User Representation Learning for Automated Hate Speech Detection

Hate speech detection is a critical, yet challenging problem in Natural Language Processing (NLP). Despite the existence of numerous studies dedicated to the development of NLP hate speech detection approaches, the accuracy is still poor.…

Computation and Language · Computer Science · 2018-09-17 · Jing Qian, Mai ElSherief, Elizabeth M. Belding, William Yang Wang

Adapting Sequence to Sequence models for Text Normalization in Social Media

Social media offer an abundant source of valuable raw data; however, informal writing can quickly become a bottleneck for many natural language processing (NLP) tasks. Off-the-shelf tools are usually trained on formal text and cannot…

Computation and Language · Computer Science · 2019-04-15 · Ismini Lourentzou, Kabir Manghnani, ChengXiang Zhai

Sequence-to-Sequence Lexical Normalization with Multilingual Transformers

Current benchmark tasks for natural language processing contain text that is qualitatively different from the text used in informal day to day digital communication. This discrepancy has led to severe performance degradation of…

Computation and Language · Computer Science · 2021-10-13 · Ana-Maria Bucur, Adrian Cosma, Liviu P. Dinu

ViLexNorm: A Lexical Normalization Corpus for Vietnamese Social Media Text

Lexical normalization, a fundamental task in Natural Language Processing (NLP), involves the transformation of words into their canonical forms. This process has been proven to benefit various downstream NLP tasks greatly. In this work, we…

Computation and Language · Computer Science · 2024-02-01 · Thanh-Nhi Nguyen, Thanh-Phong Le, Kiet Van Nguyen

VAIS Hate Speech Detection System: A Deep Learning based Approach for System Combination

Nowadays, social network sites (SNSs) such as Facebook and Twitter are common places where people show their opinions and sentiments and share information with others. However, some people use SNSs to post abuse and harassment threats in order to…

Computation and Language · Computer Science · 2019-10-15 · Thai Binh Nguyen, Quang Minh Nguyen, Thu Hien Nguyen, Ngoc Phuong Pham, The Loc Nguyen, Quoc Truong Do

Stereotypical Bias Removal for Hate Speech Detection Task using Knowledge-based Generalizations

With the ever-increasing cases of hate spread on social media platforms, it is critical to design abuse detection mechanisms to proactively avoid and control such incidents. While there exist methods for hate speech detection, they…

Computation and Language · Computer Science · 2020-01-17 · Pinkesh Badjatiya, Manish Gupta, Vasudeva Varma

A Large-scale Dataset for Hate Speech Detection on Vietnamese Social Media Texts

In recent years, Vietnam has witnessed massive growth in social network users on different social platforms such as Facebook, YouTube, Instagram, and TikTok. On social media, hate speech has become a critical problem for social network…

Computation and Language · Computer Science · 2021-07-21 · Son T. Luu, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen

SWE2: SubWord Enriched and Significant Word Emphasized Framework for Hate Speech Detection

Hate speech detection on online social networks has become one of the emerging hot topics in recent years. With the broad spread and fast propagation speed across online social networks, hate speech makes significant impacts on society by…

Computation and Language · Computer Science · 2024-09-26 · Guanyi Mou, Pengyi Ye, Kyumin Lee

ViHateT5: Enhancing Hate Speech Detection in Vietnamese With A Unified Text-to-Text Transformer Model

Recent advancements in hate speech detection (HSD) in Vietnamese have made significant progress, primarily attributed to the emergence of transformer-based pre-trained language models, particularly those built on the BERT architecture.…

Computation and Language · Computer Science · 2024-06-05 · Luan Thanh Nguyen

Enhancing Hate Speech Detection on Social Media: A Comparative Analysis of Machine Learning Models and Text Transformation Approaches

The proliferation of hate speech on social media platforms has necessitated the development of effective detection and moderation tools. This study evaluates the efficacy of various machine learning models in identifying hate speech and…

Computation and Language · Computer Science · 2026-02-25 · Saurabh Mishra, Shivani Thakur, Radhika Mamidi

Normalization of Transliterated Words in Code-Mixed Data Using Seq2Seq Model & Levenshtein Distance

Building tools for code-mixed data is rapidly gaining popularity in the NLP research community as such data is rising exponentially on social media. Working with code-mixed data poses several challenges, especially due to grammatical…

Computation and Language · Computer Science · 2018-05-23 · Soumil Mandal, Karthick Nanmaran

Robust Hate Speech Detection in Social Media: A Cross-Dataset Empirical Evaluation

The automatic detection of hate speech online is an active research area in NLP. Most of the studies to date are based on social media datasets that contribute to the creation of hate speech detection models trained on them. However, data…

Computation and Language · Computer Science · 2023-07-06 · Dimosthenis Antypas, Jose Camacho-Collados

A Target-Aware Analysis of Data Augmentation for Hate Speech Detection

Hate speech is one of the main threats posed by the widespread use of social networks, despite efforts to limit it. Although attention has been devoted to this issue, the lack of datasets and case studies centered around scarcely…

Computation and Language · Computer Science · 2024-10-11 · Camilla Casula, Sara Tonelli

Analysis and Detection of Multilingual Hate Speech Using Transformer Based Deep Learning

Hate speech is harmful content that directly attacks or promotes hatred against members of groups or individuals based on actual or perceived aspects of identity, such as racism, religion, or sexual orientation. This can affect social life…

Computation and Language · Computer Science · 2024-03-19 · Arijit Das, Somashree Nandy, Rupam Saha, Srijan Das, Diganta Saha