Related papers: R2Code: A Self-Reflective LLM Framework for Requir…

ReqToCode: Embedding Requirements Traceability as a Structural Property of the Codebase

Requirements traceability in safety-critical software development remains largely dependent on external documentation maintained separately from the systems it describes. This separation introduces structural fragility: traces degrade…

Software Engineering · Computer Science 2026-03-17 Thorsten Schlathölter

Evaluating the Use of LLMs for Documentation to Code Traceability

Large Language Models (LLMs) offer new potential for automating documentation-to-code traceability, yet their capabilities remain underexplored. We present a comprehensive evaluation of LLMs (Claude 3.5 Sonnet, GPT-4o, and o3-mini) in…

Software Engineering · Computer Science 2025-08-08 Ebube Alor, SayedHassan Khatoonabadi, Emad Shihab

Synergistic Enhancement of Requirement-to-Code Traceability: A Framework Combining Large Language Model based Data Augmentation and an Advanced Encoder

Automated requirement-to-code traceability link recovery, essential for industrial system quality and safety, is critically hindered by the scarcity of labeled data. To address this bottleneck, this paper proposes and validates a…

Software Engineering · Computer Science 2025-10-21 Jianzhang Zhang, Jialong Zhou, Nan Niu, Jinping Hua, Chuang Liu

Enhancing Automated Software Traceability by Transfer Learning from Open-World Data

Software requirements traceability is a critical component of the software engineering process, enabling activities such as requirements validation, compliance verification, and safety assurance. However, the cost and effort of manually…

Software Engineering · Computer Science 2022-07-05 Jinfeng Lin, Amrit Poudel, Wenhao Yu, Qingkai Zeng, Meng Jiang, Jane Cleland-Huang

TraceCoder: A Trace-Driven Multi-Agent Framework for Automated Debugging of LLM-Generated Code

Large Language Models (LLMs) often generate code with subtle but critical bugs, especially for complex tasks. Existing automated repair methods typically rely on superficial pass/fail signals, offering limited visibility into program…

Software Engineering · Computer Science 2026-02-09 Jiangping Huang, Wenguang Ye, Weisong Sun, Jian Zhang, Mingyue Zhang, Yang Liu

Semantic Source Code Segmentation using Small and Large Language Models

Source code segmentation, dividing code into functionally coherent segments, is crucial for knowledge retrieval and maintenance in software development. While enabling efficient navigation and comprehension of large codebases, manual and…

Software Engineering · Computer Science 2025-07-15 Abdelhalim Dahou, Ansgar Scherp, Sebastian Kurten, Brigitte Mathiak, Madhu Chauhan

LoRACode: LoRA Adapters for Code Embeddings

Code embeddings are essential for semantic code search; however, current approaches often struggle to capture the precise syntactic and contextual nuances inherent in code. Open-source models such as CodeBERT and UniXcoder exhibit…

Machine Learning · Computer Science 2025-06-03 Saumya Chaturvedi, Aman Chadha, Laurent Bindschaedler

iCoRe: An Iterative Correlation-Aware Retriever for Bug Reproduction Test Generation

Automatically generating bug reproduction tests (BRT) from issue descriptions is crucial for software maintenance. LLM-based approaches have shown great potential for this task. Their effectiveness heavily relies on retrieving high-quality…

Software Engineering · Computer Science 2026-04-22 Junyi Wang, Jialun Cao, Zhongxin Liu

TraceLLM: Leveraging Large Language Models with Prompt Engineering for Enhanced Requirements Traceability

Requirements traceability, the process of establishing and maintaining relationships between requirements and various software development artifacts, is paramount for ensuring system integrity and fulfilling requirements throughout the…

Software Engineering · Computer Science 2026-05-25 Nouf Alturayeif, Irfan Ahmad, Jameleddine Hassine

ReCode: Improving LLM-based Code Repair with Fine-Grained Retrieval-Augmented Generation

Recent advances in large language models (LLMs) have demonstrated impressive capabilities in code-related tasks, such as code generation and automated program repair. Despite their promising performance, most existing approaches for code…

Software Engineering · Computer Science 2025-09-03 Yicong Zhao, Shisong Chen, Jiacheng Zhang, Zhixu Li

AlignCoder: Aligning Retrieval with Target Intent for Repository-Level Code Completion

Repository-level code completion remains a challenging task for existing code large language models (code LLMs) due to their limited understanding of repository-specific context and domain knowledge. While retrieval-augmented generation…

Software Engineering · Computer Science 2026-01-28 Tianyue Jiang, Yanli Wang, Yanlin Wang, Daya Guo, Ensheng Shi, Yuchi Ma, Jiachi Chen, Zibin Zheng

R2ComSync: Improving Code-Comment Synchronization with In-Context Learning and Reranking

Code-Comment Synchronization (CCS) aims to synchronize the comments with code changes in an automated fashion, thereby significantly reducing the workload of developers during software maintenance and evolution. While previous studies have…

Software Engineering · Computer Science 2025-10-27 Zhen Yang, Hongyi Lin, Xiao Yu, Jacky Wai Keung, Shuo Liu, Pak Yuen Patrick Chan, Yicheng Sun, Fengji Zhang

Retrieval-augmented code completion for local projects using large language models

The use of large language models (LLMs) is becoming increasingly widespread among software developers. However, privacy and computational requirements are problematic with commercial solutions and the use of LLMs. In this work, we focus on…

Software Engineering · Computer Science 2025-06-17 Marko Hostnik, Marko Robnik-Šikonja

ReCode: Updating Code API Knowledge with Reinforcement Learning

Large Language Models (LLMs) exhibit remarkable code generation capabilities but falter when adapting to frequent updates in external library APIs. This critical limitation, stemming from reliance on outdated API knowledge from their…

Computation and Language · Computer Science 2025-11-25 Haoze Wu, Yunzhi Yao, Wenhao Yu, Ningyu Zhang

SpecMap: Hierarchical LLM Agent for Datasheet-to-Code Traceability Link Recovery in Systems Engineering

Establishing precise traceability between embedded systems datasheets and their corresponding code implementations remains a fundamental challenge in systems engineering, particularly for low-level software where manual mapping between…

Software Engineering · Computer Science 2026-01-21 Vedant Nipane, Pulkit Agrawal, Amit Singh

RLCoder: Reinforcement Learning for Repository-Level Code Completion

Repository-level code completion aims to generate code for unfinished code snippets within the context of a specified repository. Existing approaches mainly rely on retrieval-augmented generation strategies due to limitations in input…

Software Engineering · Computer Science 2024-07-31 Yanlin Wang, Yanli Wang, Daya Guo, Jiachi Chen, Ruikai Zhang, Yuchi Ma, Zibin Zheng

REINFOREST: Reinforcing Semantic Code Similarity for Cross-Lingual Code Search Models

This paper introduces a novel code-to-code search technique that enhances the performance of Large Language Models (LLMs) by including both static and dynamic features as well as utilizing both similar and dissimilar examples during…

Software Engineering · Computer Science 2024-04-17 Anthony Saieva, Saikat Chakraborty, Gail Kaiser

Llama2Vec: Unsupervised Adaptation of Large Language Models for Dense Retrieval

Dense retrieval calls for discriminative embeddings to represent the semantic relationship between query and document. It may benefit from the use of large language models (LLMs), given LLMs' strong capability for semantic understanding.…

Computation and Language · Computer Science 2025-11-25 Zheng Liu, Chaofan Li, Shitao Xiao, Yingxia Shao, Defu Lian

An Effective Approach to Embedding Source Code by Combining Large Language and Sentence Embedding Models

The advent of large language models (LLMs) has significantly advanced artificial intelligence (AI) in software engineering (SE), with source code embeddings playing a crucial role in tasks such as source code clone detection and source code…

Software Engineering · Computer Science 2025-06-04 Zixiang Xian, Chenhui Cui, Rubing Huang, Chunrong Fang, Zhenyu Chen

Traceability Support for Multi-Lingual Software Projects

Software traceability establishes associations between diverse software artifacts such as requirements, design, code, and test cases. Due to the non-trivial costs of manually creating and maintaining links, many researchers have proposed…

Software Engineering · Computer Science 2020-07-01 Yalin Liu, Jinfeng Lin, Jane Cleland-Huang