Related papers: Enabling BLV Developers with LLM-driven Code Debug…

200 papers

Large language models (LLMs) are leading significant progress in code generation. Beyond one-pass code generation, recent works further integrate unit tests and program verifiers into LLMs to iteratively refine the generated programs.…

Software Engineering · Computer Science 2024-06-12 Li Zhong, Zilong Wang, Jingbo Shang

Large Language Models (LLMs) have demonstrated exceptional coding capability. However, as another critical component of programming proficiency, the debugging capability of LLMs remains relatively unexplored. Previous evaluations of LLMs'…

Software Engineering · Computer Science 2024-06-07 Runchu Tian, Yining Ye, Yujia Qin, Xin Cong, Yankai Lin, Yinxu Pan, Yesai Wu, Haotian Hui, Weichuan Liu, Zhiyuan Liu, Maosong Sun

Debugging is a critical but challenging task for programmers. This paper proposes ChatDBG, an AI-powered debugging assistant. ChatDBG integrates large language models (LLMs) to significantly enhance the capabilities and user-friendliness of…

Software Engineering · Computer Science 2025-06-23 Kyla H. Levin, Nicolas van Kempen, Emery D. Berger, Stephen N. Freund

Large language models have shown good potential in supporting software development tasks. This is why more and more developers turn to LLMs (e.g. ChatGPT) to support them in fixing their buggy code. While this can save time and effort, many…

Software Engineering · Computer Science 2024-09-06 Yacine Majdoub, Eya Ben Charrada

Automating hardware design could obviate a significant amount of human error from the engineering process and lead to fewer errors. Verilog is a popular hardware description language to model and design digital systems, thus generating…

Programming Languages · Computer Science 2022-12-22 Shailja Thakur, Baleegh Ahmad, Zhenxing Fan, Hammond Pearce, Benjamin Tan, Ramesh Karri, Brendan Dolan-Gavitt, Siddharth Garg

Background: Bug reports are essential to the software development life cycle. They help developers track and resolve issues, but are often difficult to process due to their complexity, which can delay resolution and affect software quality.…

Software Engineering · Computer Science 2025-04-30 Zhiyuan Chen, Vanessa Nava-Camal, Ahmad Suleiman, Yiming Tang, Daqing Hou, Weiyi Shang

Code large language models (LLMs) have made significant progress in code debugging by directly generating the correct code based on the buggy code snippet. Programming benchmarks, typically consisting of buggy code snippet and their…

The rapid development of large language models (LLMs), such as ChatGPT, has revolutionized the efficiency of creating programming tutorials. LLMs can be instructed with text prompts to generate comprehensive text descriptions of code…

Human-Computer Interaction · Computer Science 2024-10-29 Yihan Liu, Zhen Wen, Luoxuan Weng, Ollie Woodman, Yi Yang, Wei Chen

We introduce SimulBench, a benchmark designed to evaluate large language models (LLMs) across a diverse collection of creative simulation scenarios, such as acting as a Linux terminal or playing text games with users. While these simulation…

Computation and Language · Computer Science 2024-09-13 Qi Jia, Xiang Yue, Tianyu Zheng, Jie Huang, Bill Yuchen Lin

DevBench is a telemetry-driven benchmark designed to evaluate Large Language Models (LLMs) on realistic code completion tasks. It includes 1,800 evaluation instances across six programming languages and six task categories derived from real…

Machine Learning · Computer Science 2026-05-19 Adarsh Kumarappan, Pareesa Ameneh Golnari, Wen Wen, Xiaoyu Liu, Gabriel Ryan, Yuting Sun, Shengyu Fu, Elsie Nallipogu

Evaluating Large Language Models (LLMs) is one of the most critical aspects of building a performant compound AI system. Since the output from LLMs propagate to downstream steps, identifying LLM errors is crucial to system performance. A…

AI-enabled features built on LLMs and agentic workflows are difficult to test, debug, and reproduce, especially for product-focused software engineers without a machine learning background. We present the AI Toolkit plugin for JetBrains…

Software Engineering · Computer Science 2026-05-15 Yaroslav Sokolov, Yury Khudyakov, Lenar Sharipov, Andrei Gasparian, Parth Tiwary, Artem Trofimov

LLM-based assistants, such as GitHub Copilot and ChatGPT, have the potential to generate code that fulfills a programming task described in a natural language description, referred to as a prompt. The widespread accessibility of these…

Software Engineering · Computer Science 2024-05-24 Sylvain Kouemo Ngassom, Arghavan Moradi Dakhel, Florian Tambon, Foutse Khomh

Bug reports contain the information developers need to triage and fix software bugs. However, unclear, incomplete, or ambiguous information may lead to delays and excessive manual effort spent on bug triage and resolution. In this paper, we…

Software Engineering · Computer Science 2025-04-29 Jagrit Acharya, Gouri Ginde

Training large language models (LLMs) on Python execution traces grounds them in code execution and enables the line-by-line execution prediction of whole Python programs, effectively turning them into neural interpreters (FAIR CodeGen Team…

Machine Learning · Computer Science 2026-03-11 Maximilian Beck, Jonas Gehring, Jannik Kossen, Gabriel Synnaeve

Large Language Models for Code (or code LLMs) are increasingly gaining popularity and capabilities, offering a wide array of functionalities such as code completion, code generation, code summarization, test generation, code translation,…

Software Engineering · Computer Science 2024-10-18 Rahul Krishna, Rangeet Pan, Raju Pavuluri, Srikanth Tamilselvam, Maja Vukovic, Saurabh Sinha

In the development and maintenance of Android apps, the quick and accurate reproduction of user-reported bugs is crucial to ensure application quality and improve user satisfaction. However, this process is often time-consuming and complex.…

Software Engineering · Computer Science 2026-04-01 Xiangyang Xiao, Huaxun Huang, Rongxin Wu

Empirical research on code review processes is increasingly central to understanding software quality and collaboration. However, collecting and analyzing review data remains a time-consuming and technically intensive task. Most researchers…

Software Engineering · Computer Science 2025-10-07 Samah Kansab, Francis Bordeleau, Ali Tizghadam

Large language models (LLMs) show promise in medical diagnosis, but real-world deployment remains challenging due to high-stakes clinical decisions and imperfect reasoning reliability. As a result, careful inspection of model behavior is…

Computation and Language · Computer Science 2026-04-28 Yurui Xiang, Xingyi Mao, Rui Sheng, Zixin Chen, Zelin Zang, Yuyang Wu, Haipeng Zeng, Huamin Qu, Yushi Sun, Yanna Lin

This paper presents libRoadRunner, an extensible, high-performance, cross-platform, open-source software library for the simulation and analysis of models expressed using Systems Biology Markup Language (SBML). SBML is the most widely…

Subcellular Processes · Quantitative Biology 2015-03-04 Endre T. Somogyi, Jean-Marie Bouteiller, James A. Glazier, Matthias König, Kyle Medley, Maciej H. Swat, Herbert M. Sauro