Related papers: Do Large Code Models Understand Programming Concep…
Large language models (LLMs) have recently shown impressive results on diverse code-related tasks, benefiting from large-scale training and instruction tuning. However, studies reveal that their grasp of fundamental programming concepts,…
Code Large Language Models (LLMs) demonstrate great versatility in adapting to various downstream tasks, including code generation and completion, as well as bug detection and fixing. However, Code LLMs often fail to capture existing coding…
With the emergence of Large Language Models (LLMs), there has been a significant improvement in the programming capabilities of models, attracting growing attention from researchers. Evaluating the programming capabilities of LLMs is…
While language models are increasingly more proficient at code generation, they still frequently generate incorrect programs. Many of these programs are obviously wrong, but others are more subtle and pass weaker correctness checks such as…
This work investigates the performance of Large Language Models (LLMs) in generating ABAP code. Despite successful applications of generative AI in many programming languages, there are hardly any systematic analyses of ABAP code generation…
Large language models (LLMs) have achieved remarkable progress in code generation, yet their true programming competence remains underexplored. We introduce the Code Triangle framework, which systematically evaluates LLMs across three…
Large language models (such as OpenAI's Codex) have demonstrated impressive zero-shot multi-task capabilities in the software domain, including code explanation. In this work, we examine if this ability can be used to help with reverse…
Recent studies have adopted pre-trained language models, such as CodeT5 and CodeGPT, for automated program generation tasks like code generation, repair, and translation. Numerous language model-based approaches have been proposed and…
Machine learning models that automate decision-making are increasingly being used in consequential areas such as loan approvals, pretrial bail, hiring, and many more. Unfortunately, most of these models are black-boxes, i.e., they are…
Machine learning (ML) models play an increasingly prevalent role in many software engineering tasks. However, because most models are now powered by opaque deep neural networks, it can be difficult for developers to understand why the model…
Large Language Models (LLMs) have demonstrated impressive performance in code generation tasks under idealized conditions, where task descriptions are clear and precise. However, in practice, task descriptions frequently exhibit ambiguity,…
Although large language models (LLMs) exhibit remarkable capacity to leverage in-context demonstrations, it is still unclear to what extent they can learn new concepts or facts from ground-truth labels. To address this question, we examine…
Large language models (LLMs) demonstrate remarkable breadth of knowledge, yet their ability to reason about computational processes remains poorly understood. Closing this gap matters for practitioners who rely on LLMs to guide algorithm…
Large Language Models (LLMs) have achieved state-of-the-art performance across software engineering tasks, from code generation to translation. However, we identify and systematically evaluate a critical failure mode: Programming Language…
Machine learning models that automate decision-making are increasingly being used in consequential areas such as loan approvals, pretrial bail approval, hiring, and many more. Unfortunately, most of these models are black-boxes, i.e., they…
Present language understanding methods have demonstrated extraordinary ability of recognizing patterns in texts via machine learning. However, existing methods indiscriminately use the recognized patterns in the testing phase that is…
Despite the remarkable success of large language models (LLMs) on traditional natural language processing tasks, their planning ability remains a critical bottleneck in tackling complex multi-step reasoning tasks. Existing approaches mainly…
Large language models (LLMs) have demonstrated significant potential in the realm of natural language understanding and programming code processing tasks. Their capacity to comprehend and generate human-like code has spurred research into…
Understanding code represents a core ability needed for automating software development tasks. While foundation models like LLMs show impressive results across many software engineering challenges, the extent of their true semantic…
Large Language Models (LLMs) have emerged as powerful tools for software development tasks such as code completion, translation, and optimization. However, their ability to generate efficient and correct code, particularly in complex…