Related papers

Related papers: Fine-Tuning Pre-Trained Code Models for AI-Generat…

200 papers

With the rapid growth of large language models for code generation, distinguishing between human-written and AI-generated code has become increasingly critical for academic integrity, hiring evaluations, and software security. We present…

Software Engineering · Computer Science 2026-05-01 Kargi Chauhan, Sadiba Nusrat Nur

Multi-domain detection of machine-generated code snippets in various programming languages is a challenging task. SemEval-2026 Task 13 approaches this challenge from several angles, as a binary detection problem as well as attribution of…

Machine Learning · Computer Science 2026-04-24 Adam Skurla, Dominik Macko, Jakub Simko

Large Language Models (LLMs) have demonstrated remarkable capabilities in generating text that closely resembles human writing across a wide range of styles and genres. However, such capabilities are prone to potential misuse, such as fake…

Computation and Language · Computer Science 2025-05-20 Harika Abburi, Sanmitra Bhattacharya, Edward Bowen, Nirmala Pudota

SemEval-2026 Task 13 investigates machine-generated code detection across multiple programming languages and application scenarios, asking participating systems to generalize to unseen languages and domains. This paper describes our…

Computation and Language · Computer Science 2026-05-07 Elitsa Yotkova, Violeta Kastreva, Dimitar Dimitrov, Ivan Koychev, Preslav Nakov

SemEval-2024 Task 8 provides a challenge to detect human-written and machine-generated text. There are 3 subtasks for different detection scenarios. This paper proposes a system that mainly deals with Subtask B. It aims to detect if given…

Computation and Language · Computer Science 2024-04-02 Renhua Gu, Xiangfeng Meng

The paper describes a system designed by Advacheck team to recognise machine-generated and human-written texts in the monolingual subtask of GenAI Detection Task 1 competition. Our developed system is a multi-task architecture with shared…

Computation and Language · Computer Science 2024-11-19 German Gritsai, Anastasia Voznyuk, Ildar Khabutdinov, Andrey Grabovoy

SemEval-2024 Task 8 introduces the challenge of identifying machine-generated texts from diverse Large Language Models (LLMs) in various languages and domains. The task comprises three subtasks: binary classification in monolingual and…

Computation and Language · Computer Science 2024-01-24 Feng Xiong, Thanet Markchom, Ziwei Zheng, Subin Jung, Varun Ojha, Huizhi Liang

We present the results and the main findings of SemEval-2024 Task 8: Multigenerator, Multidomain, and Multilingual Machine-Generated Text Detection. The task featured three subtasks. Subtask A is a binary classification task determining…

The task of generating code solutions for a given programming problem can benefit from the use of pre-trained language models such as Codex, which can produce multiple diverse samples. However, a major challenge for this task is to select…

Computation and Language · Computer Science 2022-11-24 Bei Chen, Fengji Zhang, Anh Nguyen, Daoguang Zan, Zeqi Lin, Jian-Guang Lou, Weizhu Chen

The growing capability of large language models to produce fluent, contextually coherent text has created mounting pressure on the systems and institutions responsible for ensuring the authenticity of digital content. Advanced generative…

SemEval-2024 Task 8 is focused on multigenerator, multidomain, and multilingual black-box machine-generated text detection. Such a detection is important for preventing a potential misuse of large language models (LLMs), the newest of which…

Computation and Language · Computer Science 2024-06-18 Michal Spiegel, Dominik Macko

The growing collaboration between humans and AI models in generative tasks has introduced new challenges in distinguishing between human-written, LLM-generated, and human-LLM collaborative texts. In this work, we collect a multilingual,…

Computation and Language · Computer Science 2026-02-10 Minh Ngoc Ta, Dong Cao Van, Duc-Anh Hoang, Minh Le-Anh, Truong Nguyen, My Anh Tran Nguyen, Yuxia Wang, Preslav Nakov, Sang Dinh

Generation of Artificial Intelligence (AI) texts in important works has become a common practice that can be used to misuse and abuse AI at various levels. Traditional AI detectors often rely on document-level classification, which…

Computation and Language · Computer Science 2025-09-24 Lekkala Sai Teja, Annepaka Yadagiri, Partha Pakray, Chukhu Chunka, Mangadoddi Srikar Vardhan

Misogyny and sexism are growing problems in social media. Advances have been made in online sexism detection but the systems are often uninterpretable. SemEval-2023 Task 10 on Explainable Detection of Online Sexism aims at increasing…

Computation and Language · Computer Science 2023-06-09 Konstantin Chernyshev, Ekaterina Garanina, Duygu Bayram, Qiankun Zheng, Lukas Edman

Recent advancements in natural language processing (e.g., GPT-2 and BERT) have led to near-human performance in multiple natural language tasks. In this paper, we seek to understand whether similar techniques can be applied to a highly…

Computation and Language · Computer Science 2021-02-23 Luis Perez, Lizi Ottens, Sudharshan Viswanathan

In this paper, we have worked on interpretability, trust, and understanding of the decisions made by models in the form of classification tasks. The task is divided into 3 subtasks. The first task consists of determining Binary Sexism…

Computation and Language · Computer Science 2023-04-11 Debashish Roy, Manish Shrivastava

Code generation aims to automatically generate code snippets of specific programming language according to natural language descriptions. The continuous advancements in deep learning, particularly pre-trained models, have empowered the code…

Software Engineering · Computer Science 2025-01-24 Zezhou Yang, Sirong Chen, Cuiyun Gao, Zhenhao Li, Xing Hu, Kui Liu, Xin Xia

With the great success of pre-trained models, the pretrain-then-finetune paradigm has been widely adopted on downstream tasks for source code understanding. However, compared to costly training a large-scale model from scratch, how to…

Software Engineering · Computer Science 2022-03-16 Deze Wang, Zhouyang Jia, Shanshan Li, Yue Yu, Yun Xiong, Wei Dong, Xiangke Liao

Nowadays, the usage of Large Language Models (LLMs) has increased, and LLMs have been used to generate texts in different languages and for different tasks. Additionally, due to the participation of remarkable companies such as Google and…

Computation and Language · Computer Science 2024-02-26 Mohammad Heydari Rad, Farhan Farsi, Shayan Bali, Romina Etezadi, Mehrnoush Shamsfard

Large pre-trained code generation models, such as OpenAI Codex, can generate syntax- and function-correct code, making the coding of programmers more productive and our pursuit of artificial general intelligence closer. In this paper, we…

Machine Learning · Computer Science 2024-07-11 Qinkai Zheng, Xiao Xia, Xu Zou, Yuxiao Dong, Shan Wang, Yufei Xue, Zihan Wang, Lei Shen, Andi Wang, Yang Li, Teng Su, Zhilin Yang, Jie Tang