Related papers

Related papers: RECODE-H: A Benchmark for Research Code Developmen…

200 papers

Large language models (LLMs) have proven invaluable for code generation, particularly in interactive settings. However, existing code generation benchmarks fail to capture the diverse feedback encountered in multi-turn interactions,…

Software Engineering · Computer Science 2025-02-28 Hojae Han, Seung-won Hwang, Rajhans Samdani, Yuxiong He

Effective feedback is essential for fostering students' success in scientific inquiry. With advancements in artificial intelligence, large language models (LLMs) offer new possibilities for delivering instant and adaptive feedback. However,…

Artificial Intelligence · Computer Science 2025-02-19 Kathrin Seßler, Arne Bewersdorff, Claudia Nerdel, Enkelejda Kasneci

Code repair is a fundamental task in software development, facilitating efficient bug resolution and software maintenance. Although large language models (LLMs) have demonstrated considerable potential in automated code repair, their…

Software Engineering · Computer Science 2026-02-27 Dekun Dai, MingWei Liu, Anji Li, Jialun Cao, Yanlin Wang, Chong Wang, Xin Peng, Zibin Zheng

Large language model (LLM) agents have demonstrated remarkable potential in advancing scientific discovery. However, their capability in the fundamental yet crucial task of reproducing code from research papers, especially in the NLP…

Code reproduction is a cornerstone of scientific validity, yet it remains a formidable challenge in computer networking research due to the scarcity of open-source implementations and the complexity of heterogeneous system architectures.…

Networking and Internet Architecture · Computer Science 2026-02-17 Yining Jiang, Yunxin Xu, Wenyun Xu, Yufan Zhu, Tangtang He, Haiying Huang, Letian Zhu, Qingyu Song, Qiang Su, Lizhao You, Lu Tang, Wanjin Feng, Yuchao Zhang, Linghe Kong, Qiao Xiang, Jiwu Shu

Pre-trained on massive amounts of code and text data, large language models (LLMs) have demonstrated remarkable achievements in performing code generation tasks. With additional execution-based feedback, these models can act as agents with…

Computation and Language · Computer Science 2024-11-14 Jierui Li, Hung Le, Yingbo Zhou, Caiming Xiong, Silvio Savarese, Doyen Sahoo

Large language models (LLMs) struggle to consistently generate UI code that compiles and produces visually relevant designs. Existing approaches to improve generation rely on expensive human feedback or distilling a proprietary model. In…

Computation and Language · Computer Science 2024-06-13 Jason Wu, Eldon Schoop, Alan Leung, Titus Barik, Jeffrey P. Bigham, Jeffrey Nichols

Expert feedback lays the foundation of rigorous research. However, the rapid growth of scholarly production and intricate knowledge specialization challenge the conventional scientific feedback mechanisms. High-quality peer reviews are…

Large Language Models (LLMs) are increasingly used for software development, yet existing benchmarks for LLM-based coding assistance do not reflect the constraints of High Energy Physics (HEP) and High Performance Computing (HPC) software.…

Large language models (LLMs) serve as an active and promising field of generative artificial intelligence and have demonstrated abilities to perform complex tasks in multiple domains, including mathematical and scientific reasoning. In this…

Artificial Intelligence · Computer Science 2026-03-03 Ao Cheng, Lei Zhang, Guowei He

Large Language Models (LLMs) are widely used to support software developers in tasks such as code generation, optimization, and documentation. However, their ability to improve existing programming answers in a human-like manner remains…

Software Engineering · Computer Science 2026-01-27 Suborno Deb Bappon, Saikat Mondal, Chanchal K. Roy, Kevin Schneider

Automatic code generation has gained significant momentum with the advent of Large Language Models (LLMs) such as GPT-4. Although many studies focus on improving the effectiveness of LLMs for code generation, very limited work tries to…

Software Engineering · Computer Science 2025-06-02 Melika Sepidband, Hamed Taherkhani, Song Wang, Hadi Hemmati

Large Language Models (LLMs) exhibit remarkable code generation capabilities but falter when adapting to frequent updates in external library APIs. This critical limitation, stemming from reliance on outdated API knowledge from their…

Computation and Language · Computer Science 2025-11-25 Haoze Wu, Yunzhi Yao, Wenhao Yu, Ningyu Zhang

Collaboration is the defining mode of modern science, yet its core mechanism -- feedback -- remains hard to observe, difficult to scale, and unequally distributed. Here we test whether large language models (LLMs) can contribute to this…

Physics and Society · Physics 2026-05-26 Binglu Wang, Weixin Liang, Jiahui Xue, Yuhui Zhang, Hancheng Cao, Dashun Wang, Yian Yin

Recently, a number of repository-level code generation benchmarks, such as CoderEval, DevEval, RepoEval, RepoBench, and LongCodeArena, have emerged to evaluate the capabilities of large language models (LLMs) beyond standalone benchmarks like…

Software Engineering · Computer Science 2025-06-26 Shanchao Liang, Yiran Hu, Nan Jiang, Lin Tan

Large Language Models (LLMs) have shown promise in automated code generation but typically excel only in simpler tasks such as generating standalone code units. Real-world software development, however, often involves complex code…

Software Engineering · Computer Science 2024-08-12 Kechi Zhang, Jia Li, Ge Li, Xianjie Shi, Zhi Jin

Large Language Models (LLMs) have demonstrated their remarkable capabilities in numerous fields. This survey focuses on how LLMs empower users, regardless of their technical background, to use human languages to automatically generate…

Software Engineering · Computer Science 2025-04-03 Nam Huynh, Beiyu Lin

Large language models (LLMs) have increasingly been applied to automatic programming code generation. This task can be viewed as a language generation task that bridges natural language, human knowledge, and programming logic. However, it…

This study evaluates large language models (LLMs) in generating code from algorithm descriptions in recent NLP papers. The task requires two key competencies: (1) algorithm comprehension: synthesizing information from papers and academic…

Computation and Language · Computer Science 2025-08-08 Yanzheng Xiang, Hanqi Yan, Shuyin Ouyang, Lin Gui, Yulan He

Large language models (LLMs) deployed as agents solve user-specified tasks over multiple steps while keeping the required manual engagement to a minimum. Crucially, such LLMs need to ground their generations in any feedback obtained to…

Computation and Language · Computer Science 2025-02-19 Jonas Gehring, Kunhao Zheng, Jade Copet, Vegard Mella, Quentin Carbonneaux, Taco Cohen, Gabriel Synnaeve