Related papers: Programming Language Confusion: When Code LLMs Can…

Exploring Multi-Lingual Bias of Large Code Models in Code Generation

Code generation aims to synthesize code and fulfill functional requirements based on natural language (NL) specifications, which can greatly improve development efficiency. In the era of large language models (LLMs), large code models…

Software Engineering · Computer Science 2024-05-01 Chaozheng Wang, Zongjie Li, Cuiyun Gao, Wenxuan Wang, Ting Peng, Hailiang Huang, Yuetang Deng, Shuai Wang, Michael R. Lyu

Uncovering Systematic Failures of LLMs in Verifying Code Against Natural Language Specifications

Large language models (LLMs) have become essential tools in software development, widely used for requirements engineering, code generation and review tasks. Software engineers often rely on LLMs to assess whether system code implementation…

Software Engineering · Computer Science 2025-08-19 Haolin Jin, Huaming Chen

Do Large Language Models Pay Similar Attention Like Human Programmers When Generating Code?

Large Language Models (LLMs) have recently been widely used for code generation. Due to the complexity and opacity of LLMs, little is known about how these models generate code. We made the first attempt to bridge this knowledge gap by…

Software Engineering · Computer Science 2024-05-24 Bonan Kou, Shengmai Chen, Zhijie Wang, Lei Ma, Tianyi Zhang

Understanding and Mitigating Language Confusion in LLMs

We investigate a surprising limitation of LLMs: their inability to consistently generate text in a user's desired language. We create the Language Confusion Benchmark (LCB) to evaluate such failures, covering 15 typologically diverse…

Computation and Language · Computer Science 2025-04-07 Kelly Marchisio, Wei-Yin Ko, Alexandre Bérard, Théo Dehaze, Sebastian Ruder

From Effectiveness to Efficiency: Uncovering Linguistic Bias in Large Language Model-based Code Generation

Large Language Models (LLMs) have demonstrated promising capabilities for code generation. While existing benchmarks evaluate the correctness and efficiency of LLM-generated code, the potential linguistic bias - where code quality varies…

Software Engineering · Computer Science 2025-05-02 Weipeng Jiang, Xuanqi Gao, Juan Zhai, Shiqing Ma, Xiaoyu Zhang, Ziyan Lei, Chao Shen

Lost in Translation: A Study of Bugs Introduced by Large Language Models while Translating Code

Code translation aims to convert source code from one programming language (PL) to another. Given the promising abilities of large language models (LLMs) in code synthesis, researchers are exploring their potential to automate code…

Software Engineering · Computer Science 2024-01-17 Rangeet Pan, Ali Reza Ibrahimzada, Rahul Krishna, Divya Sankar, Lambert Pouguem Wassi, Michele Merler, Boris Sobolev, Raju Pavuluri, Saurabh Sinha, Reyhaneh Jabbarvand

Large Language Models are Easily Confused: A Quantitative Metric, Security Implications and Typological Analysis

Language Confusion is a phenomenon where Large Language Models (LLMs) generate text that is neither in the desired language, nor in a contextually appropriate language. This phenomenon presents a critical challenge in text generation by…

Computation and Language · Computer Science 2025-02-11 Yiyi Chen, Qiongxiu Li, Russa Biswas, Johannes Bjerva

Perplexed: Understanding When Large Language Models are Confused

Large Language Models (LLMs) have become dominant in the Natural Language Processing (NLP) field causing a huge surge in progress in a short amount of time. However, their limitations are still a mystery and have primarily been explored…

Software Engineering · Computer Science 2024-04-11 Nathan Cooper, Torsten Scholak

Uncertainty Awareness of Large Language Models Under Code Distribution Shifts: A Benchmark Study

Large Language Models (LLMs) have been widely employed in programming language analysis to enhance human productivity. Yet, their reliability can be compromised by various code distribution shifts, leading to inconsistent outputs. While…

Software Engineering · Computer Science 2024-02-12 Yufei Li, Simin Chen, Yanghong Guo, Wei Yang, Yue Dong, Cong Liu

Mechanistic Understanding and Mitigation of Language Confusion in English-Centric Large Language Models

Language confusion -- where large language models (LLMs) generate unintended languages against the user's need -- remains a critical challenge, especially for English-centric models. We present the first mechanistic interpretability (MI)…

Computation and Language · Computer Science 2025-09-19 Ercong Nie, Helmut Schmid, Hinrich Schütze

When Prompts Go Wrong: Evaluating Code Model Robustness to Ambiguous, Contradictory, and Incomplete Task Descriptions

Large Language Models (LLMs) have demonstrated impressive performance in code generation tasks under idealized conditions, where task descriptions are clear and precise. However, in practice, task descriptions frequently exhibit ambiguity,…

Software Engineering · Computer Science 2025-07-29 Maya Larbi, Amal Akli, Mike Papadakis, Rihab Bouyousfi, Maxime Cordy, Federica Sarro, Yves Le Traon

Assessing the Impact of Code Changes on the Fault Localizability of Large Language Models

Generative Large Language Models (LLMs) are increasingly used in non-generative software maintenance tasks, such as fault localization (FL). Success in FL depends on a model's ability to reason about program semantics beyond surface-level…

Software Engineering · Computer Science 2026-03-06 Sabaat Haroon, Ahmad Faraz Khan, Ahmad Humayun, Waris Gill, Abdul Haddi Amjad, Ali R. Butt, Mohammad Taha Khan, Muhammad Ali Gulzar

Are LLMs Reliable Code Reviewers? Systematic Overcorrection in Requirement Conformance Judgement

Large language models (LLMs) have become essential tools in software development, widely used for requirements engineering, code generation and review tasks. Software engineers often rely on LLMs to verify if code implementation satisfy…

Software Engineering · Computer Science 2026-03-03 Haolin Jin, Huaming Chen

Towards Understanding the Characteristics of Code Generation Errors Made by Large Language Models

Large Language Models (LLMs) have demonstrated unprecedented capabilities in code generation. However, there remains a limited understanding of code generation errors that LLMs can produce. To bridge the gap, we conducted an in-depth…

Software Engineering · Computer Science 2025-02-14 Zhijie Wang, Zijie Zhou, Da Song, Yuheng Huang, Shengmai Chen, Lei Ma, Tianyi Zhang

Beyond Translation Accuracy: Addressing False Failures in LLM-Based Code Translation

Large Language Models (LLMs) have achieved remarkable success in automated code translation. While prior work has focused on improving translation accuracy through advanced prompting and iterative repair, the reliability of the underlying…

Software Engineering · Computer Science 2026-05-11 Fazle Rabbi, Soumit Kanti Saha, Jinqiu Yang

When the Code Autopilot Breaks: Why LLMs Falter in Embedded Machine Learning

Large Language Models (LLMs) are increasingly used to automate software generation in embedded machine learning workflows, yet their outputs often fail silently or behave unpredictably. This article presents an empirical investigation of…

Software Engineering · Computer Science 2025-09-16 Roberto Morabito, Guanghan Wu

Understanding Defects in Generated Codes by Language Models

This study investigates the reliability of code generation by Large Language Models (LLMs), focusing on identifying and analyzing defects in the generated code. Despite the advanced capabilities of LLMs in automating code generation,…

Software Engineering · Computer Science 2024-08-27 Ali Mohammadi Esfahani, Nafiseh Kahani, Samuel A. Ajila

A Deep Dive Into Large Language Model Code Generation Mistakes: What and Why?

Recent advancements in Large Language Models (LLMs) have led to their widespread application in automated code generation. However, these models can still generate defective code that deviates from the specification. Previous research has…

Software Engineering · Computer Science 2025-03-21 QiHong Chen, Jiachen Yu, Jiawei Li, Jiecheng Deng, Justin Tian Jin Chen, Iftekhar Ahmed

Bugs in Large Language Models Generated Code: An Empirical Study

Large Language Models (LLMs) for code have gained significant attention recently. They can generate code in different programming languages based on provided prompts, fulfilling a long-lasting dream in Software Engineering (SE), i.e.,…

Software Engineering · Computer Science 2024-03-19 Florian Tambon, Arghavan Moradi Dakhel, Amin Nikanjam, Foutse Khomh, Michel C. Desmarais, Giuliano Antoniol

CrossPL: Evaluating Large Language Models on Cross Programming Language Code Generation

As large language models (LLMs) become increasingly embedded in software engineering workflows, a critical capability remains underexplored: generating correct code that enables cross-programming-language (CPL) interoperability. This skill…

Software Engineering · Computer Science 2025-07-29 Zhanhang Xiong, Dongxia Wang, Yuekang Li, Xinyuan An, Wenhai Wang