English
Related papers

Related papers: CL4SE: Benchmarking Context Learning on Software E…

200 papers

Code review is a cornerstone of software quality assurance, and recent advances in Large Language Models (LLMs) have shown promise in its automation. However, existing benchmarks for LLM-based code review face three major limitations. Lack…

Software Engineering · Computer Science 2026-01-01 Ruida Hu , Xinchen Wang , Xin-Cheng Wen , Zhao Zhang , Bo Jiang , Pengfei Gao , Chao Peng , Cuiyun Gao

Large language models are increasingly used as coding agents for software engineering tasks. Current benchmarks mainly evaluate whether the agent can correctly solve the request or fix the bugs. They largely treat tasks as independent and…

Software Engineering · Computer Science 2026-05-07 Jiayuan Zhu , Junde Wu , Minhao Hu , Shengda Zhu , Jiazhen Pan , Weixiang Shen , Yijun Yang , Fenglin Liu , Jianye Hao , Yueming Jin , Qirong Ho , Min Xu

Large language models (LLMs) are gaining increasing popularity in software engineering (SE) due to their unprecedented performance across various applications. These models are increasingly being utilized for a range of SE tasks, including…

Software Engineering · Computer Science 2025-11-05 Xing Hu , Feifei Niu , Junkai Chen , Xin Zhou , Junwei Zhang , Junda He , Xin Xia , David Lo

Large Language Models (LLMs) have significantly impacted numerous domains, including Software Engineering (SE). Many recent publications have explored LLMs applied to various SE tasks. Nevertheless, a comprehensive understanding of the…

Software Engineering · Computer Science 2024-04-11 Xinyi Hou , Yanjie Zhao , Yue Liu , Zhou Yang , Kailong Wang , Li Li , Xiapu Luo , David Lo , John Grundy , Haoyu Wang

Software Engineering (SE) is the systematic design, development, maintenance, and management of software applications underpinning the digital infrastructure of our modern world. Very recently, the SE community has seen a rapidly increasing…

Software Engineering · Computer Science 2024-09-10 Quanjun Zhang , Chunrong Fang , Yang Xie , Yaxin Zhang , Yun Yang , Weisong Sun , Shengcheng Yu , Zhenyu Chen

Understanding context is key to understanding human language, an ability which Large Language Models (LLMs) have been increasingly seen to demonstrate to an impressive extent. However, though the evaluation of LLMs encompasses various…

Computation and Language · Computer Science 2024-02-02 Yilun Zhu , Joel Ruben Antony Moniz , Shruti Bhargava , Jiarui Lu , Dhivya Piraviperumal , Site Li , Yuan Zhang , Hong Yu , Bo-Hsiang Tseng

Large Language Models (LLMs) have demonstrated remarkable capabilities in code understanding and generation. However, their effectiveness on non-code Software Engineering (SE) tasks remains underexplored. We present 'Software Engineering…

Software Engineering · Computer Science 2026-02-12 Fabian C. Peña , Steffen Herbold

Large Language Models (LLMs) are highly sensitive to their input contexts, motivating the development of automated context engineering. However, existing methods predominantly treat this as a global search problem, seeking a single context…

Computation and Language · Computer Science 2026-05-18 Jiachen Zhu , Zhuoying Ou , Congmin Zheng , Yuxiang Chen , Zeyu Zheng , Rong Shan , Lingyu Yang , Lionel Z. Wang , Weiwen Liu , Yong Yu , Weinan Zhang , Jianghao Lin

The performance of Large Language Models (LLMs) is fundamentally determined by the contextual information provided during inference. This survey introduces Context Engineering, a formal discipline that transcends simple prompt design to…

LLM-based coding agents have shown strong performance on automated issue resolution benchmarks, yet existing evaluations largely focus on final task success, providing limited insight into how agents retrieve and use code context during…

Machine Learning · Computer Science 2026-02-12 Han Li , Letian Zhu , Bohan Zhang , Rili Feng , Jiaming Wang , Yue Pan , Earl T. Barr , Federica Sarro , Zhaoyang Chu , He Ye

The application of Natural Language Processing (NLP) has achieved a high level of relevance in several areas. In the field of software engineering (SE), NLP applications are based on the classification of similar texts (e.g. software…

Software Engineering · Computer Science 2021-12-02 Eliane Maria De Bortoli Fávero , Dalcimar Casanova

Aligning large language models (LLMs) with human values is essential for their safe deployment and widespread adoption. Current LLM safety benchmarks often focus solely on the refusal of individual problematic queries, which overlooks the…

Computation and Language · Computer Science 2025-02-10 Guangzhi Sun , Xiao Zhan , Shutong Feng , Philip C. Woodland , Jose Such

With the advent of large language models (LLMs) in the artificial intelligence (AI) area, the field of software engineering (SE) has also witnessed a paradigm shift. These models, by leveraging the power of deep learning and massive amounts…

Software Engineering · Computer Science 2024-12-30 Cuiyun Gao , Xing Hu , Shan Gao , Xin Xia , Zhi Jin

The emergence of long-context language models with context windows extending to millions of tokens has created new opportunities for sophisticated code understanding and software development evaluation. We propose LoCoBench, a comprehensive…

The large language model (LLM)-as-judge paradigm has been used to meet the demand for a cheap, reliable, and fast evaluation of model outputs during AI system development and post-deployment monitoring. While judge models -- LLMs finetuned…

Computation and Language · Computer Science 2025-03-21 Austin Xu , Srijan Bansal , Yifei Ming , Semih Yavuz , Shafiq Joty

Although large language models (LLMs) demonstrate impressive performance for many language tasks, most of them can only handle texts a few thousand tokens long, limiting their applications on longer sequence inputs, such as books, reports,…

Computation and Language · Computer Science 2024-06-21 Yushi Bai , Xin Lv , Jiajie Zhang , Hongchang Lyu , Jiankai Tang , Zhidian Huang , Zhengxiao Du , Xiao Liu , Aohan Zeng , Lei Hou , Yuxiao Dong , Jie Tang , Juanzi Li

Recent advances in large language models (LLMs) have significantly impacted data science workflows, giving rise to specialized data science agents designed to automate analytical tasks. Despite rapid adoption, systematic benchmarks…

Artificial Intelligence · Computer Science 2025-08-08 Ram Mohan Rao Kadiyala , Siddhant Gupta , Jebish Purbey , Giulio Martini , Ali Shafique , Suman Debnath , Hamza Farooq

Software testing is a crucial phase in the software life cycle, helping identify potential risks and reduce maintenance costs. With the advancement of Large Language Models (LLMs), researchers have proposed an increasing number of LLM-based…

Software Engineering · Computer Science 2024-09-27 Quanjun Zhang , Ye Shang , Chunrong Fang , Siqi Gu , Jianyi Zhou , Zhenyu Chen

Large language model (LLM) applications such as agents and domain-specific reasoning increasingly rely on context adaptation: modifying inputs with instructions, strategies, or evidence, rather than weight updates. Prior approaches improve…

Vulnerability detection is a critical aspect of software security. Accurate detection is essential to prevent potential security breaches and protect software systems from malicious attacks. Recently, vulnerability detection methods…

Software Engineering · Computer Science 2025-04-24 Yixin Yang , Bowen Xu , Xiang Gao , Hailong Sun
‹ Prev 1 2 3 10 Next ›