Related papers: Do Large Code Models Understand Programming Concep…

Strengthening Programming Comprehension in Large Language Models through Code Generation

Large language models (LLMs) have recently shown impressive results on diverse code-related tasks, benefiting from large-scale training and instruction tuning. However, studies reveal that their grasp of fundamental programming concepts,…

Software Engineering · Computer Science 2025-08-19 Xiaoning Ren, Qiang Hu, Wei Ma, Yan Li, Yao Zhang, Lingxiao Jiang, Yinxing Xue

Do Code LLMs Understand Design Patterns?

Code Large Language Models (LLMs) demonstrate great versatility in adapting to various downstream tasks, including code generation and completion, as well as bug detection and fixing. However, Code LLMs often fail to capture existing coding…

Software Engineering · Computer Science 2025-01-10 Zhenyu Pan, Xuefeng Song, Yunkun Wang, Rongyu Cao, Binhua Li, Yongbin Li, Han Liu

CodeApex: A Bilingual Programming Evaluation Benchmark for Large Language Models

With the emergence of Large Language Models (LLMs), there has been a significant improvement in the programming capabilities of models, attracting growing attention from researchers. Evaluating the programming capabilities of LLMs is…

Computation and Language · Computer Science 2024-03-12 Lingyue Fu, Huacan Chai, Shuang Luo, Kounianhua Du, Weiming Zhang, Longteng Fan, Jiayi Lei, Renting Rui, Jianghao Lin, Yuchen Fang, Yifan Liu, Jingkuan Wang, Siyuan Qi, Kangning Zhang, Weinan Zhang, Yong Yu

The Counterfeit Conundrum: Can Code Language Models Grasp the Nuances of Their Incorrect Generations?

While language models are increasingly more proficient at code generation, they still frequently generate incorrect programs. Many of these programs are obviously wrong, but others are more subtle and pass weaker correctness checks such as…

Software Engineering · Computer Science 2024-03-01 Alex Gu, Wen-Ding Li, Naman Jain, Theo X. Olausson, Celine Lee, Koushik Sen, Armando Solar-Lezama

Benchmarking Large Language Models for ABAP Code Generation: An Empirical Study on Iterative Improvement by Compiler Feedback

This work investigates the performance of Large Language Models (LLMs) in generating ABAP code. Despite successful applications of generative AI in many programming languages, there are hardly any systematic analyses of ABAP code generation…

Software Engineering · Computer Science 2026-01-22 Stephan Wallraven, Tim Köhne, Hartmut Westenberger, Andreas Moser

Coding Triangle: How Does Large Language Model Understand Code?

Large language models (LLMs) have achieved remarkable progress in code generation, yet their true programming competence remains underexplored. We introduce the Code Triangle framework, which systematically evaluates LLMs across three…

Computation and Language · Computer Science 2025-07-09 Taolin Zhang, Zihan Ma, Maosong Cao, Junnan Liu, Songyang Zhang, Kai Chen

Pop Quiz! Can a Large Language Model Help With Reverse Engineering?

Large language models (such as OpenAI's Codex) have demonstrated impressive zero-shot multi-task capabilities in the software domain, including code explanation. In this work, we examine if this ability can be used to help with reverse…

Software Engineering · Computer Science 2022-02-03 Hammond Pearce, Benjamin Tan, Prashanth Krishnamurthy, Farshad Khorrami, Ramesh Karri, Brendan Dolan-Gavitt

On the Reliability and Explainability of Language Models for Program Generation

Recent studies have adopted pre-trained language models, such as CodeT5 and CodeGPT, for automated program generation tasks like code generation, repair, and translation. Numerous language model-based approaches have been proposed and…

Software Engineering · Computer Science 2024-01-09 Yue Liu, Chakkrit Tantithamthavorn, Yonghui Liu, Li Li

Counterfactual Explanation Generation with s(CASP)

Machine learning models that automate decision-making are increasingly being used in consequential areas such as loan approvals, pretrial bail, hiring, and many more. Unfortunately, most of these models are black-boxes, i.e., they are…

Artificial Intelligence · Computer Science 2023-10-24 Sopam Dasgupta, Farhad Shakerin, Joaquín Arias, Elmer Salazar, Gopal Gupta

Counterfactual Explanations for Models of Code

Machine learning (ML) models play an increasingly prevalent role in many software engineering tasks. However, because most models are now powered by opaque deep neural networks, it can be difficult for developers to understand why the model…

Software Engineering · Computer Science 2021-11-11 Jürgen Cito, Isil Dillig, Vijayaraghavan Murali, Satish Chandra

When Prompts Go Wrong: Evaluating Code Model Robustness to Ambiguous, Contradictory, and Incomplete Task Descriptions

Large Language Models (LLMs) have demonstrated impressive performance in code generation tasks under idealized conditions, where task descriptions are clear and precise. However, in practice, task descriptions frequently exhibit ambiguity,…

Software Engineering · Computer Science 2025-07-29 Maya Larbi, Amal Akli, Mike Papadakis, Rihab Bouyousfi, Maxime Cordy, Federica Sarro, Yves Le Traon

Can Large Language Models Follow Concept Annotation Guidelines? A Case Study on Scientific and Financial Domains

Although large language models (LLMs) exhibit remarkable capacity to leverage in-context demonstrations, it is still unclear to what extent they can learn new concepts or facts from ground-truth labels. To address this question, we examine…

Computation and Language · Computer Science 2024-06-28 Marcio Fonseca, Shay B. Cohen

Large Language Models are Algorithmically Blind

Large language models (LLMs) demonstrate remarkable breadth of knowledge, yet their ability to reason about computational processes remains poorly understood. Closing this gap matters for practitioners who rely on LLMs to guide algorithm…

Computation and Language · Computer Science 2026-04-07 Sohan Venkatesh, Ashish Mahendran Kurapath, Tejas Melkote

Programming Language Confusion: When Code LLMs Can't Keep their Languages Straight

Large Language Models (LLMs) have achieved state-of-the-art performance across software engineering tasks, from code generation to translation. However, we identify and systematically evaluate a critical failure mode: Programming Language…

Software Engineering · Computer Science 2026-02-03 Micheline Bénédicte Moumoula, Serge Lionel Nikiema, Abdoul Kader Kabore, Jacques Klein, Tegawendé F. Bissyande

Counterfactual Generation with Answer Set Programming

Machine learning models that automate decision-making are increasingly being used in consequential areas such as loan approvals, pretrial bail approval, hiring, and many more. Unfortunately, most of these models are black-boxes, i.e., they…

Artificial Intelligence · Computer Science 2024-02-08 Sopam Dasgupta, Farhad Shakerin, Joaquín Arias, Elmer Salazar, Gopal Gupta

Empowering Language Understanding with Counterfactual Reasoning

Present language understanding methods have demonstrated extraordinary ability of recognizing patterns in texts via machine learning. However, existing methods indiscriminately use the recognized patterns in the testing phase that is…

Computation and Language · Computer Science 2021-06-08 Fuli Feng, Jizhi Zhang, Xiangnan He, Hanwang Zhang, Tat-Seng Chua

Unlocking Reasoning Potential in Large Language Models by Scaling Code-form Planning

Despite the remarkable success of large language models (LLMs) on traditional natural language processing tasks, their planning ability remains a critical bottleneck in tackling complex multi-step reasoning tasks. Existing approaches mainly…

Computation and Language · Computer Science 2024-10-07 Jiaxin Wen, Jian Guan, Hongning Wang, Wei Wu, Minlie Huang

Large Language Models for Code Analysis: Do LLMs Really Do Their Job?

Large language models (LLMs) have demonstrated significant potential in the realm of natural language understanding and programming code processing tasks. Their capacity to comprehend and generate human-like code has spurred research into…

Software Engineering · Computer Science 2024-03-07 Chongzhou Fang, Ning Miao, Shaurya Srivastav, Jialin Liu, Ruoyu Zhang, Ruijie Fang, Asmita, Ryan Tsang, Najmeh Nazari, Han Wang, Houman Homayoun

The Code Barrier: What LLMs Actually Understand?

Understanding code represents a core ability needed for automating software development tasks. While foundation models like LLMs show impressive results across many software engineering challenges, the extent of their true semantic…

Software Engineering · Computer Science 2025-04-16 Serge Lionel Nikiema, Jordan Samhi, Abdoul Kader Kaboré, Jacques Klein, Tegawendé F. Bissyandé

Do Large Language Models Understand Performance Optimization?

Large Language Models (LLMs) have emerged as powerful tools for software development tasks such as code completion, translation, and optimization. However, their ability to generate efficient and correct code, particularly in complex…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-03-19 Bowen Cui, Tejas Ramesh, Oscar Hernandez, Keren Zhou