Related papers: Toxicity Classification in Ukrainian

200 papers

Existing studies have investigated the tendency of autoregressive language models to generate contexts that exhibit undesired biases and toxicity. Various debiasing approaches have been proposed, which are primarily categorized into…

Computation and Language · Computer Science 2022-05-03 Yoon A Park, Frank Rudzicz

We introduce the first study of automatic detoxification of Russian texts to combat offensive language. This kind of textual style transfer can be used, for instance, to process toxic content in social media. While much work has been…

Computation and Language · Computer Science 2021-05-20 Daryna Dementieva, Daniil Moskovskiy, Varvara Logacheva, David Dale, Olga Kozlova, Nikita Semenov, Alexander Panchenko

Recent NLP literature pays little attention to the robustness of toxicity language predictors, while these systems are most likely to be used in adversarial contexts. This paper presents a novel adversarial attack, ToxicTrap,…

Computation and Language · Computer Science 2024-04-16 Dmitriy Bespalov, Sourav Bhabesh, Yi Xiang, Liutong Zhou, Yanjun Qi

Pretrained neural language models (LMs) are prone to generating racist, sexist, or otherwise toxic language which hinders their safe deployment. We investigate the extent to which pretrained LMs can be prompted to generate toxic language,…

Computation and Language · Computer Science 2020-09-29 Samuel Gehman, Suchin Gururangan, Maarten Sap, Yejin Choi, Noah A. Smith

The reliability of multilingual Large Language Model (LLM) evaluation is currently compromised by the inconsistent quality of translated benchmarks. Existing resources often suffer from semantic drift and context loss, which can lead to…

Computation and Language · Computer Science 2026-02-26 Hanna Yukhymenko, Anton Alexandrov, Martin Vechev

Large Language Models (LLMs) are powerful text generators, yet they can produce toxic or harmful content even when given seemingly harmless prompts. This presents a serious safety challenge and can cause real-world harm. Toxicity is often…

Computation and Language · Computer Science 2026-02-09 Himanshu Singh, Ziwei Xu, A. V. Subramanyam, Mohan Kankanhalli

Background: The existence of toxic conversations in open-source platforms can degrade relationships among software developers and may negatively impact software product quality. To help mitigate this, some initial work has been done to…

Software Engineering · Computer Science 2023-07-10 Jaydeb Sarker, Sayma Sultana, Steven R. Wilson, Amiangshu Bosu

Document image classification is different from plain-text document classification and consists of classifying a document by understanding the content and structure of documents such as forms, emails, and other such documents. We show that…

Computation and Language · Computer Science 2023-10-26 Yoshinari Fujinuma, Siddharth Varia, Nishant Sankaran, Srikar Appalaraju, Bonan Min, Yogarshi Vyas

Classification is an essential and fundamental task in machine learning, playing a cardinal role in the field of natural language processing (NLP) and computer vision (CV). In a supervised learning setting, labels are always needed for the…

Computation and Language · Computer Science 2021-02-04 Irene Li

Text clustering serves as a fundamental technique for organizing and interpreting unstructured textual data, particularly in contexts where manual annotation is prohibitively costly. With the rapid advancement of Large Language Models…

Computation and Language · Computer Science 2025-10-08 Chen Huang, Guoxiu He

Data contamination has garnered increased attention in the era of large language models (LLMs) due to the reliance on extensive internet-derived training corpora. The issue of training corpus overlap with evaluation benchmarks--referred to…

Computation and Language · Computer Science 2024-06-24 Chunyuan Deng, Yilun Zhao, Yuzhao Heng, Yitong Li, Jiannan Cao, Xiangru Tang, Arman Cohan

Ordinal Classification (OC) is a widely encountered challenge in Natural Language Processing (NLP), with applications in various domains such as sentiment analysis, rating prediction, and more. Previous approaches to tackle OC have…

Computation and Language · Computer Science 2024-05-21 Siva Rajesh Kasa, Aniket Goel, Karan Gupta, Sumegh Roychowdhury, Anish Bhanushali, Nikhil Pattisapu, Prasanna Srinivasa Murthy

Text classification is an important task in Natural Language Processing (NLP), where the goal is to categorize text data into predefined classes. In this study, we analyse the dataset creation steps and evaluation techniques of multi-label…

Computation and Language · Computer Science 2023-03-01 Elmurod Kuriyozov, Ulugbek Salaev, Sanatbek Matlatipov, Gayrat Matlatipov

Large Language Models (LLMs) are increasingly being integrated into various medical fields, including mental health support systems. However, there is a gap in research regarding the effectiveness of LLMs in non-English mental health…

Computation and Language · Computer Science 2026-02-10 Konstantinos Skianis, John Pavlopoulos, A. Seza Doğruöz

The full-scale conflict between the Russian Federation and Ukraine generated an unprecedented amount of news articles and social media data reflecting opposing ideologies and narratives. These polarized campaigns have led to mutual…

Computation and Language · Computer Science 2023-01-26 Veronika Solopova, Oana-Iuliana Popescu, Christoph Benzmüller, Tim Landgraf

Studies have shown that toxic behavior can cause contributors to leave, and hinder newcomers' (especially from underrepresented communities) participation in Open Source Software (OSS) projects. Thus, detection of toxic language plays a…

Software Engineering · Computer Science 2025-01-28 Ramtin Ehsani, Rezvaneh Rezapour, Preetha Chatterjee

Evaluating cross-lingual knowledge transfer in large language models is challenging, as correct answers in a target language may arise either from genuine transfer or from prior exposure during pre-training. We present LiveCLKTBench, an…

Computation and Language · Computer Science 2026-04-21 Pei-Fu Guo, Yun-Da Tsai, Chun-Chia Hsu, Kai-Xin Chen, Ya-An Tsai, Kai-Wei Chang, Nanyun Peng, Mi-Yen Yeh, Shou-De Lin

Analyzing texts such as open-ended responses, headlines, or social media posts is a time- and labor-intensive process highly susceptible to bias. LLMs are promising tools for text analysis, using either a predefined (top-down) or a…

Machine Learning (ML) is increasingly applied in real-life scenarios, raising concerns about bias in automatic decision making. We focus on bias as a notion of opinion exclusion, that stems from the direct application of traditional ML…

Machine Learning · Computer Science 2019-11-07 Agathe Balayn, Alessandro Bozzon

The spread of election misinformation and harmful political content conveys misleading narratives and poses a serious threat to democratic integrity. Detecting harmful content at early stages is essential for understanding and potentially…

Human-Computer Interaction · Computer Science 2026-02-24 Qile Wang, Prerana Khatiwada, Carolina Coimbra Vieira, Benjamin E. Bagozzi, Kenneth E. Barner, Matthew Louis Mauriello