Related papers: LLMDFA: Analyzing Dataflow in Code with Large Lang…

200 papers

Large Language Models (LLMs) have made significant progress in code generation, offering developers groundbreaking automated programming support. However, LLMs often generate code that is syntactically correct and even semantically…

Computation and Language · Computer Science 2025-01-22 Yuchen Tian, Weixiang Yan, Qian Yang, Xuandong Zhao, Qian Chen, Wen Wang, Ziyang Luo, Lei Ma, Dawn Song

Privacy policies help inform people about organisations' personal data processing practices, covering different aspects such as data collection, data storage, and sharing of personal data with third parties. Privacy policies are often…

Artificial Intelligence · Computer Science 2026-01-16 Haiyue Yuan, Nikolay Matyunin, Ali Raza, Shujun Li

The rapidly growing demand for high-quality data in Large Language Models (LLMs) has intensified the need for scalable, reliable, and semantically rich data preparation pipelines. However, current practices remain dominated by ad-hoc…

Deep learning-based vulnerability detection has shown great performance and, in some studies, outperformed static analysis tools. However, the highest-performing approaches use token-based transformer models, which are not the most…

Software Engineering · Computer Science 2023-10-03 Benjamin Steenhoek, Hongyang Gao, Wei Le

Conducting data analysis typically involves authoring code to transform, visualize, analyze, and interpret data. Large language models (LLMs) are now capable of generating such code for simple, routine analyses. LLMs promise to democratize…

Human-Computer Interaction · Computer Science 2025-04-22 Stephen N. Freund, Brooke Simon, Emery D. Berger, Eunice Jun

Android apps rely heavily on Data Manipulation Functionalities (DMFs) for handling app-specific data through CRUDS operations, making their correctness vital for reliability. However, detecting Data Manipulation Errors (DMEs) is challenging…

Software Engineering · Computer Science 2026-05-12 Xiangyang Xiao, Huaxun Huang, Rongxin Wu

Large Language Models (LLMs) have become increasingly important in natural language processing, enabling advanced data analytics through natural language queries. However, these models often generate "hallucinations": inaccurate or…

Computation and Language · Computer Science 2024-10-29 Mikhail Rumiantsau, Aliaksei Vertsel, Ilya Hrytsuk, Isaiah Ballah

Despite recent advancements in large language models (LLMs), their performance on complex reasoning problems requiring multi-step thinking and combining various skills is still limited. To address this, we propose a novel framework HDFlow…

Computation and Language · Computer Science 2024-09-27 Wenlin Yao, Haitao Mi, Dong Yu

Fault Localization (FL) aims to automatically localize buggy lines of code, a key first step in many manual and automatic debugging tasks. Previous FL techniques assume the provision of input tests, and often require extensive program…

Software Engineering · Computer Science 2023-10-04 Aidan Z. H. Yang, Ruben Martins, Claire Le Goues, Vincent J. Hellendoorn

To mitigate hallucinations in large language models (LLMs), we propose a framework that focuses on errors induced by prompts. Our method extends a chain-style knowledge distillation approach by incorporating a programmable module that…

Computation and Language · Computer Science 2026-01-08 Jinbo Hao, Kai Yang, Qingzhen Su, Yifan Li, Chao Jiang

Fault Localization (FL), in which a developer seeks to identify which part of the code is malfunctioning and needs to be fixed, is a recurring challenge in debugging. To reduce developer burden, many automated FL techniques have been…

Software Engineering · Computer Science 2024-07-03 Sungmin Kang, Gabin An, Shin Yoo

Large Language Models (LLMs) are increasingly deployed in automated software engineering for tasks such as API migration. While LLMs are able to identify migration patterns, they often make mistakes and fail to produce correct glue code to…

Software Engineering · Computer Science 2026-04-23 Marcos Tileria, Santanu Kumar Dash, Profir-Petru Pârţachi, Earl T. Barr

The reasoning abilities are one of the most enigmatic and captivating aspects of large language models (LLMs). Numerous studies are dedicated to exploring and expanding the boundaries of this reasoning capability. However, tasks that embody…

Artificial Intelligence · Computer Science 2025-02-27 Yuze Zhao, Tianyun Ji, Wenjun Feng, Zhenya Huang, Qi Liu, Zhiding Liu, Yixiao Ma, Kai Zhang, Enhong Chen

Large language models (LLMs) trained on datasets of publicly available source code have established a new state of the art in code generation tasks. However, these models are mostly unaware of the code that exists within a specific project,…

Software Engineering · Computer Science 2024-06-21 Aryaz Eghbali, Michael Pradel

Model hallucination is one of the most critical challenges faced by Large Language Models (LLMs), especially in high-stakes code intelligence tasks. As LLMs become increasingly integrated into software engineering tasks, understanding and…

Software Engineering · Computer Science 2025-11-04 Cuiyun Gao, Guodong Fan, Chun Yong Chong, Shizhan Chen, Chao Liu, David Lo, Zibin Zheng, Qing Liao

Large language models (LLMs) are leading significant progress in code generation. Beyond one-pass code generation, recent works further integrate unit tests and program verifiers into LLMs to iteratively refine the generated programs.…

Software Engineering · Computer Science 2024-06-12 Li Zhong, Zilong Wang, Jingbo Shang

Large language models (LLMs) have revolutionized natural language processing, yet their propensity for hallucination, generating plausible but factually incorrect or fabricated content, remains a critical challenge. This report provides a…

Computation and Language · Computer Science 2025-08-05 Manuel Cossio

Code completion entails the task of providing missing tokens given a surrounding context. It can boost developer productivity while providing a powerful code discovery tool. Following the Large Language Model (LLM) wave, code completion has…

Software Engineering · Computer Science 2026-04-30 Zoe Kotti, Konstantina Dritsa, Diomidis Spinellis, Panos Louridas

We introduce MPLSandbox, an out-of-the-box multi-programming language sandbox designed to provide unified and comprehensive feedback from compiler and analysis tools for Large Language Models (LLMs). It can automatically identify the…

Data profiling is critical in machine learning for generating descriptive statistics, supporting both deeper understanding and downstream tasks like data valuation and curation. This work addresses profiling specifically in the context of…

Software Engineering · Computer Science 2025-03-21 Pankaj Thorat, Adnan Qidwai, Adrija Dhar, Aishwariya Chakraborty, Anand Eswaran, Hima Patel, Praveen Jayachandran