
Related papers: Relative Positioning Based Code Chunking Method Fo…

200 papers

Large Language Models (LLMs) have demonstrated impressive capabilities in code completion tasks, where they assist developers by predicting and generating new code in real-time. However, existing LLM-based code completion systems primarily…

Software Engineering · Computer Science 2024-12-12 Zhanming Guan, Junlin Liu, Jierui Liu, Chao Peng, Dexin Liu, Ningyuan Sun, Bo Jiang, Wenchao Li, Jie Liu, Hang Zhu

Context plays an important role in the quality of code completion, as Large Language Models (LLMs) require sufficient and relevant information to assist developers in code generation tasks. However, composing a relevant context for code…

Software Engineering · Computer Science 2025-10-09 Uswat Yusuf, Genevieve Caumartin, Diego Elias Costa

Recent advancements in code-fluent Large Language Models (LLMs) enabled the research on repository-level code editing. In such tasks, the model navigates and modifies the entire codebase of a project according to request. Hence, such tasks…

Software Engineering · Computer Science 2024-06-10 Alexander Kovrigin, Aleksandra Eliseeva, Yaroslav Zharov, Timofey Bryksin

The performance of repository-level code completion depends upon the effective leverage of both general and repository-specific knowledge. Despite the impressive capability of code LLMs in general code completion tasks, they often exhibit…

Software Engineering · Computer Science 2024-09-16 Wei Liu, Ailun Yu, Daoguang Zan, Bo Shen, Wei Zhang, Haiyan Zhao, Zhi Jin, Qianxiang Wang

Some recently developed code large language models (Code LLMs) have been pre-trained on repository-level code data (Repo-Code LLMs), enabling these models to recognize repository structures and utilize cross-file information for code…

Computation and Language · Computer Science 2024-06-28 Lei Zhang, Yunshui Li, Jiaming Li, Xiaobo Xia, Jiaxi Yang, Run Luo, Minzheng Wang, Longze Chen, Junhao Liu, Min Yang

Code completion, which aims to predict the following code token(s) according to the code context, can improve the productivity of software development. Recent work has proved that statistical language modeling with transformers can greatly…

Software Engineering · Computer Science 2022-03-16 Shuai Lu, Nan Duan, Hojae Han, Daya Guo, Seung-won Hwang, Alexey Svyatkovskiy

Many use cases require retrieving smaller portions of text, and dense vector-based retrieval systems often perform better with shorter text segments, as the semantics are less likely to be over-compressed in the embeddings. Consequently,…

Computation and Language · Computer Science 2025-07-08 Michael Günther, Isabelle Mohr, Daniel James Williams, Bo Wang, Han Xiao

Repository-level code completion remains a challenging task for existing code large language models (code LLMs) due to their limited understanding of repository-specific context and domain knowledge. While retrieval-augmented generation…

Software Engineering · Computer Science 2026-01-28 Tianyue Jiang, Yanli Wang, Yanlin Wang, Daya Guo, Ensheng Shi, Yuchi Ma, Jiachi Chen, Zibin Zheng

The rapid advancement of large language models (LLMs) has led to a significant increase in automated tools in software engineering, capable of performing various code-related tasks such as code generation, completion, and translation.…

Software Engineering · Computer Science 2026-02-26 Madhusudan Ghosh, Rishabh Gupta

The use of large language models (LLMs) is becoming increasingly widespread among software developers. However, privacy and computational requirements are problematic with commercial solutions and the use of LLMs. In this work, we focus on…

Software Engineering · Computer Science 2025-06-17 Marko Hostnik, Marko Robnik-Šikonja

Large Language Models (LLMs) have achieved remarkable success in code completion, as evidenced by their essential roles in developing code assistant services such as Copilot. Being trained on in-file contexts, current LLMs are quite…

Software Engineering · Computer Science 2024-02-20 Yichen Li, Yun Peng, Yintong Huo, Michael R. Lyu

Document chunking is a critical preprocessing step in dense retrieval systems, yet the design space of chunking strategies remains poorly understood. Recent research has proposed several concurrent approaches, including LLM-guided methods…

Information Retrieval · Computer Science 2026-02-20 Yongjie Zhou, Shuai Wang, Bevan Koopman, Guido Zuccon

Repository-level code completion automatically predicts the unfinished code based on the broader information from the repository. Recent strides in Code Large Language Models (code LLMs) have spurred the development of repository-level code…

Computation and Language · Computer Science 2025-09-22 Sheng Zhang, Yifan Ding, Shuquan Lian, Shun Song, Hui Li

Retrieval-augmented generation (RAG) has become a transformative approach for enhancing large language models (LLMs) by grounding their outputs in external knowledge sources. Yet, a critical question persists: how can vast volumes of…

Information Retrieval · Computer Science 2025-04-29 Carlo Merola, Jaspinder Singh

Code completion plays a prominent role in modern integrated development environments (IDEs). Machine learning has become ubiquitous in analogous natural language writing and search software, surfacing more relevant autocompletions and…

Software Engineering · Computer Science 2020-04-14 Gareth Ari Aye, Gail E. Kaiser

In retrieval-augmented systems, context ranking techniques are commonly employed to reorder the retrieved contexts based on their relevance to a user query. A standard approach is to measure this relevance through the similarity between…

Information Retrieval · Computer Science 2024-10-22 Weichao Zhou, Jiaxin Zhang, Hilaf Hasson, Anu Singh, Wenchao Li

The success of language models in code assistance has spurred the proposal of repository-level code completion as a means to enhance prediction accuracy, utilizing the context from the entire codebase. However, this amplified context can…

Software Engineering · Computer Science 2024-02-26 Ming Liang, Xiaoheng Xie, Gehao Zhang, Xunjin Zheng, Peng Di, Wei Jiang, Hongwei Chen, Chengpeng Wang, Gang Fan

A code completion system suggests future code elements to developers given a partially-complete code snippet. Code completion is one of the most useful features in Integrated Development Environments (IDEs). Currently, most code completion…

Software Engineering · Computer Science 2020-09-21 Wenhan Wang, Sijie Shen, Ge Li, Zhi Jin

While Retrieval-Augmented Generation (RAG) has emerged as a promising paradigm for boosting large language models (LLMs) in knowledge-intensive tasks, it often overlooks the crucial aspect of text chunking within its workflow. This paper…

Computation and Language · Computer Science 2025-05-22 Jihao Zhao, Zhiyuan Ji, Yuchen Feng, Pengnian Qi, Simin Niu, Bo Tang, Feiyu Xiong, Zhiyu Li

Retrieving the right level of context for a given query is a perennial challenge in information retrieval - too large a chunk dilutes semantic specificity, while chunks that are too small lack broader context. This paper introduces the…

Information Retrieval · Computer Science 2025-03-05 Ashish Singh, Priti Mohapatra