Related papers: Interpretable Predictability-Based AI Text Detecti…

Overview of AuTexTification at IberLEF 2023: Detection and Attribution of Machine-Generated Text in Multiple Domains

This paper presents the overview of the AuTexTification shared task as part of the IberLEF 2023 Workshop in Iberian Languages Evaluation Forum, within the framework of the SEPLN 2023 conference. AuTexTification consists of two subtasks: for…

Computation and Language · Computer Science 2024-01-10 Areg Mikael Sarvazyan, José Ángel González, Marc Franco-Salvador, Francisco Rangel, Berta Chulvi, Paolo Rosso

Classification of Human- and AI-Generated Texts for English, French, German, and Spanish

In this paper we analyze features to classify human- and AI-generated text for English, French, German and Spanish and compare them across languages. We investigate two scenarios: (1) The detection of text generated by AI from scratch, and…

Computation and Language · Computer Science 2024-01-31 Kristina Schaaff, Tim Schlippe, Lorenz Mindner

UPB at IberLEF-2023 AuTexTification: Detection of Machine-Generated Text using Transformer Ensembles

This paper describes the solutions submitted by the UPB team to the AuTexTification shared task, featured as part of IberLEF-2023. Our team participated in the first subtask, identifying text documents produced by large language models…

Computation and Language · Computer Science 2023-08-04 Andrei-Alexandru Preda, Dumitru-Clementin Cercel, Traian Rebedea, Costin-Gabriel Chiru

A Comprehensive Framework for Semantic Similarity Analysis of Human and AI-Generated Text Using Transformer Architectures and Ensemble Techniques

The rapid advancement of large language models (LLMs) has made detecting AI-generated text an increasingly critical challenge. Traditional methods often fail to capture the nuanced semantic differences between human and machine-generated…

Computation and Language · Computer Science 2025-02-03 Lifu Gao, Ziwei Liu, Qi Zhang

Advancing LLM detection in the ALTA 2024 Shared Task: Techniques and Analysis

The recent proliferation of AI-generated content has prompted significant interest in developing reliable detection methods. This study explores techniques for identifying AI-generated text through sentence-level evaluation within hybrid…

Computation and Language · Computer Science 2024-12-30 Dima Galat

Sharif-MGTD at SemEval-2024 Task 8: A Transformer-Based Approach to Detect Machine Generated Text

Detecting Machine-Generated Text (MGT) has emerged as a significant area of study within Natural Language Processing. While language models generate text, they often leave discernible traces, which can be scrutinized using either…

Computation and Language · Computer Science 2025-02-19 Seyedeh Fatemeh Ebrahimi, Karim Akhavan Azari, Amirmasoud Iravani, Arian Qazvini, Pouya Sadeghi, Zeinab Sadat Taghavi, Hossein Sameti

On the Generalization and Adaptation Ability of Machine-Generated Text Detectors in Academic Writing

The rising popularity of large language models (LLMs) has raised concerns about machine-generated text (MGT), particularly in academic settings, where issues like plagiarism and misinformation are prevalent. As a result, developing a highly…

Artificial Intelligence · Computer Science 2025-08-05 Yule Liu, Zhiyuan Zhong, Yifan Liao, Zhen Sun, Jingyi Zheng, Jiaheng Wei, Qingyuan Gong, Fenghua Tong, Yang Chen, Yang Zhang, Xinlei He

Classification of Human- and AI-Generated Texts: Investigating Features for ChatGPT

Recently, generative AIs like ChatGPT have become available to the wide public. These tools can for instance be used by students to generate essays or whole theses. But how does a teacher know whether a text is written by a student or an…

Computation and Language · Computer Science 2023-11-14 Lorenz Mindner, Tim Schlippe, Kristina Schaaff

Imitate Before Detect: Aligning Machine Stylistic Preference for Machine-Revised Text Detection

Large Language Models (LLMs) have revolutionized text generation, making detecting machine-generated text increasingly challenging. Although past methods have achieved good performance on detecting pure machine-generated text, those…

Computation and Language · Computer Science 2024-12-24 Jiaqi Chen, Xiaoye Zhu, Tianyang Liu, Ying Chen, Xinhui Chen, Yiwen Yuan, Chak Tou Leong, Zuchao Li, Tang Long, Lei Zhang, Chenyu Yan, Guanghao Mei, Jie Zhang, Lefei Zhang

HC3 Plus: A Semantic-Invariant Human ChatGPT Comparison Corpus

ChatGPT has garnered significant interest due to its impressive performance; however, there is growing concern about its potential risks, particularly in the detection of AI-generated content (AIGC), which is often challenging for untrained…

Computation and Language · Computer Science 2024-10-10 Zhenpeng Su, Xing Wu, Wei Zhou, Guangyuan Ma, Songlin Hu

Why AI-Generated Text Detection Fails: Evidence from Explainable AI Beyond Benchmark Accuracy

The widespread adoption of Large Language Models (LLMs) has made the detection of AI-Generated text a pressing and complex challenge. Although many detection systems report high benchmark accuracy, their reliability in real-world settings…

Computation and Language · Computer Science 2026-04-23 Shushanta Pudasaini, Luis Miralles-Pechuán, David Lillis, Marisa Llorens Salvador

Spotlights and Blindspots: Evaluating Machine-Generated Text Detection

With the rise of generative language models, machine-generated text detection has become a critical challenge. A wide variety of models is available, but inconsistent datasets, evaluation metrics, and assessment strategies obscure…

Computation and Language · Computer Science 2026-04-23 Kevin Stowe, Kailash Patil

A Multiplicative Model for Learning Distributed Text-Based Attribute Representations

In this paper we propose a general framework for learning distributed representations of attributes: characteristics of text whose representations can be jointly learned with word embeddings. Attributes can correspond to document indicators…

Machine Learning · Computer Science 2014-06-12 Ryan Kiros, Richard S. Zemel, Ruslan Salakhutdinov

Learning Easily Updated General Purpose Text Representations with Adaptable Task-Specific Prefixes

Many real-world applications require making multiple predictions from the same text. Fine-tuning a large pre-trained language model for each downstream task causes computational burdens in the inference time due to several times of forward…

Computation and Language · Computer Science 2023-10-17 Kuan-Hao Huang, Liang Tan, Rui Hou, Sinong Wang, Amjad Almahairi, Ruty Rinott

RFBES at SemEval-2024 Task 8: Investigating Syntactic and Semantic Features for Distinguishing AI-Generated and Human-Written Texts

Nowadays, the usage of Large Language Models (LLMs) has increased, and LLMs have been used to generate texts in different languages and for different tasks. Additionally, due to the participation of remarkable companies such as Google and…

Computation and Language · Computer Science 2024-02-26 Mohammad Heydari Rad, Farhan Farsi, Shayan Bali, Romina Etezadi, Mehrnoush Shamsfard

Boosting the Performance of Transformer Architectures for Semantic Textual Similarity

Semantic textual similarity is the task of estimating the similarity between the meaning of two texts. In this paper, we fine-tune transformer architectures for semantic textual similarity on the Semantic Textual Similarity Benchmark by…

Computation and Language · Computer Science 2023-06-02 Ivan Rep, Vladimir Čeperić

Leveraging Explainable AI for LLM Text Attribution: Differentiating Human-Written and Multiple LLMs-Generated Text

The development of Generative AI Large Language Models (LLMs) raised the alarm regarding identifying content produced through generative AI or humans. In one case, issues arise when students heavily rely on such tools in a manner that can…

Computation and Language · Computer Science 2025-01-07 Ayat Najjar, Huthaifa I. Ashqar, Omar Darwish, Eman Hammad

Feature-Augmented Transformers for Robust AI-Text Detection Across Domains and Generators

AI-generated text is nowadays produced at scale across domains and heterogeneous generation pipelines, making robustness to distribution shift a central requirement for supervised binary detectors. We train transformer-based detectors on…

Computation and Language · Computer Science 2026-05-06 Mohamed Mady, Johannes Reschke, Björn Schuller

Reframing Human-AI Collaboration for Generating Free-Text Explanations

Large language models are increasingly capable of generating fluent-appearing text with relatively little task-specific supervision. But can these models accurately explain classification decisions? We consider the task of generating…

Computation and Language · Computer Science 2022-05-06 Sarah Wiegreffe, Jack Hessel, Swabha Swayamdipta, Mark Riedl, Yejin Choi

AI-generated Text Detection: A Multifaceted Approach to Binary and Multiclass Classification

Large Language Models (LLMs) have demonstrated remarkable capabilities in generating text that closely resembles human writing across a wide range of styles and genres. However, such capabilities are prone to potential misuse, such as fake…

Computation and Language · Computer Science 2025-05-20 Harika Abburi, Sanmitra Bhattacharya, Edward Bowen, Nirmala Pudota