Related papers: Automated Harmfulness Testing for Code Large Langu…

200 papers

The prevalence of harmful content on social media platforms poses significant risks to users and society, necessitating more effective and scalable content moderation strategies. Current approaches rely on human moderators, supervised…

Computation and Language · Computer Science 2025-01-27 Akash Bonagiri, Lucen Li, Rajvardhan Oak, Zeerak Babar, Magdalena Wojcieszak, Anshuman Chhabra

Large Language Models (LLMs) have revolutionized content creation across digital platforms, offering unprecedented capabilities in natural language generation and understanding. These models enable beneficial applications such as content…

Computation and Language · Computer Science 2025-08-14 Chi Zhang, Changjia Zhu, Junjie Xiong, Xiaoran Xu, Lingyao Li, Yao Liu, Zhuo Lu

Recent research has focused on using large language models (LLMs) to generate explanations for hate speech through fine-tuning or prompting. Despite the growing interest in this area, these generated explanations' effectiveness and…

Computation and Language · Computer Science 2023-08-31 Han Wang, Ming Shan Hee, Md Rabiul Awal, Kenny Tsu Wei Choo, Roy Ka-Wei Lee

The widespread use of generative artificial intelligence has heightened concerns about the potential harms posed by AI-generated texts, primarily stemming from factoid, unfair, and toxic content. Previous researchers have invested much effort…

Computation and Language · Computer Science 2024-12-24 Shiyao Cui, Zhenyu Zhang, Yilong Chen, Wenyuan Zhang, Tianyun Liu, Siqi Wang, Tingwen Liu

Large language models (LLMs) have become integral to various real-world applications, leveraging massive, web-sourced datasets like Common Crawl, C4, and FineWeb for pretraining. While these datasets provide linguistic data essential for…

Computation and Language · Computer Science 2025-08-14 Sai Krishna Mendu, Harish Yenala, Aditi Gulati, Shanu Kumar, Parag Agrawal

Large language models (LLMs) have become ubiquitous, thus it is important to understand their risks and limitations. Smaller LLMs can be deployed where compute resources are constrained, such as edge devices, but with different propensity…

Computation and Language · Computer Science 2025-04-22 Berk Atil, Vipul Gupta, Sarkar Snigdha Sarathi Das, Rebecca J. Passonneau

Social media platforms utilize Machine Learning (ML) and Artificial Intelligence (AI) powered recommendation algorithms to maximize user engagement, which can result in inadvertent exposure to harmful content. Current moderation efforts,…

Computation and Language · Computer Science 2025-05-30 Rajvardhan Oak, Muhammad Haroon, Claire Jo, Magdalena Wojcieszak, Anshuman Chhabra

Code metamorphism refers to a computer programming exercise wherein the program modifies its own code (partial or entire) consistently and automatically while retaining its core functionality. This technique is often used for online…

Cryptography and Security · Computer Science 2024-11-05 Pooria Madani

Large Language Models (LLMs) have become powerful tools for automated code generation. However, these models often overlook critical security practices, which can result in the generation of insecure code that contains…

Software Engineering · Computer Science 2025-07-01 Hao Yan, Swapneel Suhas Vaidya, Xiaokuan Zhang, Ziyu Yao

In this paper, we explore the feasibility of leveraging large language models (LLMs) to automate or otherwise assist human raters with identifying harmful content including hate speech, harassment, violent extremism, and election…

Large language models (LLMs) are increasingly being used for emotional support. They are also being developed for formal therapy purposes. However, LLMs like ChatGPT or Llama are often developed with content moderation guardrails that…

Human-Computer Interaction · Computer Science 2026-05-26 Jiwon Kim, Claire Wang, Taeung Yoon, Sabelle Huang, Koustuv Saha

The latest paradigm shift in software development brings in the innovation and automation afforded by Large Language Models (LLMs), showcased by Generative Pre-trained Transformer (GPT), which has shown remarkable capacity to generate code…

Software Engineering · Computer Science 2024-06-12 Xiaoyin Wang, Dakai Zhu

Sensitive information detection is crucial in content moderation to maintain safe online communities. Assisting in this traditionally manual process could relieve human moderators from overwhelming and tedious tasks, allowing them to focus…

The volume of machine-generated content online has grown dramatically due to the widespread use of Large Language Models (LLMs), leading to new challenges for content moderation systems. Conventional content moderation classifiers, which…

Computation and Language · Computer Science 2026-05-26 Shaz Furniturewala, Arkaitz Zubiaga

Large Language Models (LLMs) have been shown to generate harmful content. However, the underlying causes of such behavior remain underexplored. We propose a causal mediation analysis-based approach to identify the causal factors…

Artificial Intelligence · Computer Science 2026-04-14 Rajesh Ganguli, Raha Moraffah

Large language models (LLMs) have become ubiquitous, interfacing with humans in numerous safety-critical applications. This necessitates improving capabilities, but importantly coupled with greater safety measures to align these models with…

Computation and Language · Computer Science 2025-06-26 Ujwal Narayan, Shreyas Chaudhari, Ashwin Kalyan, Tanmay Rajpurohit, Karthik Narasimhan, Ameet Deshpande, Vishvak Murahari

The exponential growth of social media platforms such as Twitter and Facebook has revolutionized textual communication and textual content publication in human society. However, they have been increasingly exploited to propagate toxic…

Computation and Language · Computer Science 2023-02-14 Wenxuan Wang, Jen-tse Huang, Weibin Wu, Jianping Zhang, Yizhan Huang, Shuqing Li, Pinjia He, Michael Lyu

Recent advances in large language models (LLMs) have demonstrated strong performance on simple text classification tasks, frequently under zero-shot settings. However, their efficacy declines when tackling complex social media challenges…

Computation and Language · Computer Science 2025-04-23 Elyas Meguellati, Assaad Zeghina, Shazia Sadiq, Gianluca Demartini

Large language models (LLMs) have shown great potential as general-purpose AI assistants in various domains. To meet the requirements of different applications, LLMs are often customized by further fine-tuning. However, the powerful…

Machine Learning · Computer Science 2023-11-07 Xin Zhou, Yi Lu, Ruotian Ma, Tao Gui, Qi Zhang, Xuanjing Huang

The growth of online platforms and user content requires strong content moderation systems that can handle complex inputs from various media types. While large language models (LLMs) are effective, their high computational cost and latency…

Computation and Language · Computer Science 2026-04-09 Shutong Zhang, Dylan Zhou, Yinxiao Liu, Yang Yang, Huiwen Luo, Wenfei Zou