Related papers: scicode-lint: Detecting Methodology Bugs in Scient…

sciwrite-lint: Verification Infrastructure for the Age of Science Vibe-Writing

Scientific papers make claims about prior work backed by citations. Verifying those citations at scale (that each cited paper exists, says what the citation claims, and is itself reliable) is structurally beyond what human review can…

Digital Libraries · Computer Science 2026-05-26 Sergey V Samsonau

SciCoQA: Quality Assurance for Scientific Paper–Code Alignment

Discrepancies between scientific papers and their code undermine reproducibility, a concern that grows as automated research agents scale scientific output beyond human review capacity. Whether LLMs can reliably detect such discrepancies…

Computation and Language · Computer Science 2026-04-23 Tim Baumgärtner, Iryna Gurevych

Checkstyle+: Reducing Technical Debt Through The Use of Linters with LLMs

Good code style improves program readability, maintainability, and collaboration, and is an integral component of software quality. Developers, however, often cut corners when following style rules, leading to the wide adoption of tools…

Software Engineering · Computer Science 2025-10-28 Ella Dodor, Cristina V. Lopes

Is This You, LLM? Recognizing AI-written Programs with Multilingual Code Stylometry

With the increasing popularity of LLM-based code completers, like GitHub Copilot, the interest in automatically detecting AI-generated code is also increasing, in particular in contexts where the use of LLMs to program is forbidden by policy…

Software Engineering · Computer Science 2024-12-20 Andrea Gurioli, Maurizio Gabbrielli, Stefano Zacchiroli

LintLLM: An Open-Source Verilog Linting Framework Based on Large Language Models

Code Linting tools are vital for detecting potential defects in Verilog code. However, the limitations of traditional Linting tools are evident in frequent false positives and redundant defect reports. Recent advancements in large language…

Hardware Architecture · Computer Science 2025-02-18 Zhigang Fang, Renzhi Chen, Zhijie Yang, Yang Guo, Huadong Dai, Lei Wang

Code Linting using Language Models

Code linters play a crucial role in developing high-quality software systems by detecting potential problems (e.g., memory leaks) in the source code of systems. Despite their benefits, code linters are often language-specific, focused on…

Software Engineering · Computer Science 2024-07-24 Darren Holden, Nafiseh Kahani

LLM-Powered Test Case Generation for Detecting Bugs in Plausible Programs

Detecting tricky bugs in plausible programs, those that pass existing test suites yet still contain bugs, remains a significant challenge in software testing. To address this problem, we propose TrickCatcher, an LLM-powered approach to…

Software Engineering · Computer Science 2025-06-03 Kaibo Liu, Zhenpeng Chen, Yiyang Liu, Jie M. Zhang, Mark Harman, Yudong Han, Yun Ma, Yihong Dong, Ge Li, Gang Huang

Bugs in Large Language Models Generated Code: An Empirical Study

Large Language Models (LLMs) for code have gained significant attention recently. They can generate code in different programming languages based on provided prompts, fulfilling a long-lasting dream in Software Engineering (SE), i.e.,…

Software Engineering · Computer Science 2024-03-19 Florian Tambon, Arghavan Moradi Dakhel, Amin Nikanjam, Foutse Khomh, Michel C. Desmarais, Giuliano Antoniol

LLM-Based Detection of Tangled Code Changes for Higher-Quality Method-Level Bug Datasets

Tangled code changes, commits that conflate unrelated modifications such as bug fixes, refactorings, and enhancements, introduce significant noise into bug datasets and adversely affect the performance of bug prediction models. Addressing…

Software Engineering · Computer Science 2025-10-28 Md Nahidul Islam Opu, Shaowei Wang, Shaiful Chowdhury

LLMs in Coding and their Impact on the Commercial Software Engineering Landscape

Large-language-model coding tools are now mainstream in software engineering. But as these same tools move human effort up the development stack, they present fresh dangers: 10% of real prompts leak private data, 42% of generated snippets…

Software Engineering · Computer Science 2025-06-23 Vladislav Belozerov, Peter J Barclay, Askhan Sami

Understanding Defects in Generated Codes by Language Models

This study investigates the reliability of code generation by Large Language Models (LLMs), focusing on identifying and analyzing defects in the generated code. Despite the advanced capabilities of LLMs in automating code generation,…

Software Engineering · Computer Science 2024-08-27 Ali Mohammadi Esfahani, Nafiseh Kahani, Samuel A. Ajila

Assessing AI Detectors in Identifying AI-Generated Code: Implications for Education

Educators are increasingly concerned about the usage of Large Language Models (LLMs) such as ChatGPT in programming education, particularly regarding the potential exploitation of imperfections in Artificial Intelligence Generated Content…

Software Engineering · Computer Science 2024-01-09 Wei Hung Pan, Ming Jie Chok, Jonathan Leong Shan Wong, Yung Xin Shin, Yeong Shian Poon, Zhou Yang, Chun Yong Chong, David Lo, Mei Kuan Lim

The Prevalence of Code Smells in Machine Learning projects

Artificial Intelligence (AI) and Machine Learning (ML) are pervasive in the current computer science landscape. Yet, there still exists a lack of software engineering experience and best practices in this field. One such best practice,…

Software Engineering · Computer Science 2021-03-09 Bart van Oort, Luís Cruz, Maurício Aniche, Arie van Deursen

Span-level Detection of AI-generated Scientific Text via Contrastive Learning and Structural Calibration

The rapid adoption of large language models (LLMs) in scientific writing raises serious concerns regarding authorship integrity and the reliability of scholarly publications. Existing detection approaches mainly rely on document-level…

Computation and Language · Computer Science 2025-10-02 Zhen Yin, Shenghua Wang

Toward Automated and Trustworthy Scientific Analysis and Visualization with LLM-Generated Code

As modern science becomes increasingly data-intensive, the ability to analyze and visualize large-scale, complex datasets is critical to accelerating discovery. However, many domain scientists lack the programming expertise required to…

Software Engineering · Computer Science 2025-12-01 Apu Kumar Chakroborti, Yi Ding, Lipeng Wan

ChatGPT Code Detection: Techniques for Uncovering the Source of Code

In recent times, large language models (LLMs) have made significant strides in generating computer code, blurring the lines between code created by humans and code produced by artificial intelligence (AI). As these technologies evolve…

Machine Learning · Computer Science 2024-07-04 Marc Oedingen, Raphael C. Engelhardt, Robin Denz, Maximilian Hammer, Wolfgang Konen

Is LLM-Generated Code More Maintainable & Reliable than Human-Written Code?

Background: The rise of Large Language Models (LLMs) in software development has opened new possibilities for code generation. Despite the widespread use of this technology, it remains unclear how well LLMs generate code solutions in terms…

Software Engineering · Computer Science 2025-08-04 Alfred Santa Molison, Marcia Moraes, Glaucia Melo, Fabio Santos, Wesley K. G. Assuncao

LineBreaker: Finding Token-Inconsistency Bugs with Large Language Models

Token-inconsistency bugs (TIBs) involve the misuse of syntactically valid yet incorrect code tokens, such as misused variables and erroneous function invocations, which can often lead to software bugs. Unlike simple syntactic bugs, TIBs…

Cryptography and Security · Computer Science 2025-10-14 Hongbo Chen, Yifan Zhang, Xing Han, Tianhao Mao, Huanyao Rong, Yuheng Zhang, XiaoFeng Wang, Luyi Xing, Xun Chen, Hang Zhang

Towards Automated Quality Assurance of Patent Specifications: A Multi-Dimensional LLM Framework

Although AI drafting tools have gained prominence in patent writing, the systematic evaluation of AI-generated patent content quality represents a significant research gap. To address this gap, we propose to evaluate patents using…

Information Retrieval · Computer Science 2025-10-31 Yuqian Chai, Chaochao Wang, Weilei Wang

Will It Break in Production? Metric-Driven Prediction of Residual Defects in Python Systems

Python's dynamic nature complicates testing and increases the possibility that some defects evade detection, so an effective fault prediction becomes essential. We examine whether post-release faults can be predicted using modern ML and DL.…

Software Engineering · Computer Science 2026-04-30 Giuseppe De Rosa, Pietro Liguori