Related papers: Codebase-Memory: Tree-Sitter-Based Knowledge Graph…

Bridging Code Property Graphs and Language Models for Program Analysis

Large Language Models (LLMs) face critical challenges when analyzing security vulnerabilities in real-world codebases: token limits prevent loading entire repositories, code embeddings fail to capture inter-procedural data flows, and LLMs…

Cryptography and Security · Computer Science 2026-03-27 Ahmed Lekssays

Separating Intelligence from Execution: A Workflow Engine for the Model Context Protocol

Large Language Model (LLM) agents increasingly interact with external systems through tool-calling protocols such as the Model Context Protocol (MCP). In prevailing architectures, the agent must reason about every tool invocation in every…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-05-05 Abhinav Singh Parmar

CodeTree: Agent-guided Tree Search for Code Generation with Large Language Models

Pre-trained on massive amounts of code and text data, large language models (LLMs) have demonstrated remarkable achievements in performing code generation tasks. With additional execution-based feedback, these models can act as agents with…

Computation and Language · Computer Science 2024-11-14 Jierui Li, Hung Le, Yingbo Zhou, Caiming Xiong, Silvio Savarese, Doyen Sahoo

Semantic Tool Discovery for Large Language Models: A Vector-Based Approach to MCP Tool Selection

Large Language Models (LLMs) with tool-calling capabilities have demonstrated remarkable potential in executing complex tasks through external tool integration. The Model Context Protocol (MCP) has emerged as a standardized framework for…

Software Engineering · Computer Science 2026-03-24 Sarat Mudunuri, Jian Wan, Ally Qin, Srinivasan Manoharan

Tree-Planner: Efficient Close-loop Task Planning with Large Language Models

This paper studies close-loop task planning, which refers to the process of generating a sequence of skills (a plan) to accomplish a specific goal while adapting the plan based on real-time observations. Recently, prompting Large Language…

Computation and Language · Computer Science 2024-07-25 Mengkang Hu, Yao Mu, Xinmiao Yu, Mingyu Ding, Shiguang Wu, Wenqi Shao, Qiguang Chen, Bin Wang, Yu Qiao, Ping Luo

Understanding Codebase like a Professional! Human-AI Collaboration for Code Comprehension

Understanding an unfamiliar codebase is an essential task for developers in various scenarios, such as during the onboarding process. Especially when the codebase is large and time is limited, achieving a decent level of comprehension…

Human-Computer Interaction · Computer Science 2026-02-16 Jie Gao, Yue Xue, Xiaofei Xie, SoeMin Thant, Erika Lee, Bowen Xu

Code Graph Model (CGM): A Graph-Integrated Large Language Model for Repository-Level Software Engineering Tasks

Recent advances in Large Language Models (LLMs) have shown promise in function-level code generation, yet repository-level software engineering tasks remain challenging. Current solutions predominantly rely on proprietary LLM agents, which…

Software Engineering · Computer Science 2025-06-25 Hongyuan Tao, Ying Zhang, Zhenhao Tang, Hongen Peng, Xukun Zhu, Bingchang Liu, Yingguang Yang, Ziyin Zhang, Zhaogui Xu, Haipeng Zhang, Linchao Zhu, Rui Wang, Hang Yu, Jianguo Li, Peng Di

Text2Cypher: Bridging Natural Language and Graph Databases

Knowledge graphs use nodes, relationships, and properties to represent arbitrarily complex data. When stored in a graph database, the Cypher query language enables efficient modeling and querying of knowledge graphs. However, using Cypher…

Machine Learning · Computer Science 2024-12-16 Makbule Gulcin Ozsoy, Leila Messallem, Jon Besga, Gianandrea Minneci

Feedback-Normalized Developer Memory for Reinforcement-Learning Coding Agents: A Safety-Gated MCP Architecture

Large language model (LLM) coding agents increasingly operate over repositories, terminals, tests, and execution traces across long software-engineering episodes. Persistent memory is useful, but static vector stores or generic…

Software Engineering · Computer Science 2026-05-05 Mehmet Iscan

CodexGraph: Bridging Large Language Models and Code Repositories via Code Graph Databases

Large Language Models (LLMs) excel in stand-alone code tasks like HumanEval and MBPP, but struggle with handling entire code repositories. This challenge has prompted research on enhancing LLM-codebase interaction at a repository scale.…

Software Engineering · Computer Science 2024-08-13 Xiangyan Liu, Bo Lan, Zhiyuan Hu, Yang Liu, Zhicheng Zhang, Fei Wang, Michael Shieh, Wenmeng Zhou

Knowledge Graph Based Repository-Level Code Generation

Recent advancements in Large Language Models (LLMs) have transformed code generation from natural language queries. However, despite their extensive knowledge and ability to produce high-quality code, LLMs often struggle with contextual…

Artificial Intelligence · Computer Science 2025-07-17 Mihir Athale, Vishal Vaddina

MCP-Flow: Facilitating LLM Agents to Master Real-World, Diverse and Scaling MCP Tools

Large Language Models (LLMs) increasingly rely on external tools to perform complex, realistic tasks, yet their ability to utilize the rapidly expanding Model Context Protocol (MCP) ecosystem remains limited. Existing MCP research covers…

Artificial Intelligence · Computer Science 2026-04-17 Wenhao Wang, Peizhi Niu, Zhao Xu, Zhaoyu Chen, Jian Du, Yaxin Du, Xianghe Pang, Keduan Huang, Yanfeng Wang, Qiang Yan, Siheng Chen

Monte Carlo Planning with Large Language Model for Text-Based Game Agents

Text-based games provide valuable environments for language-based autonomous agents. However, planning-then-learning paradigms, such as those combining Monte Carlo Tree Search (MCTS) and reinforcement learning (RL), are notably…

Computation and Language · Computer Science 2025-04-24 Zijing Shi, Meng Fang, Ling Chen

COMEX: A Tool for Generating Customized Source Code Representations

Learning effective representations of source code is critical for any Machine Learning for Software Engineering (ML4SE) system. Inspired by natural language processing, large language models (LLMs) like Codex and CodeGen treat code as…

Software Engineering · Computer Science 2023-07-11 Debeshee Das, Noble Saji Mathews, Alex Mathai, Srikanth Tamilselvam, Kranthi Sedamaki, Sridhar Chimalakonda, Atul Kumar

Graph-based Agent Memory: Taxonomy, Techniques, and Applications

Memory emerges as the core module in the Large Language Model (LLM)-based agents for long-horizon complex tasks (e.g., multi-turn dialogue, game playing, scientific discovery), where memory can enable knowledge accumulation, iterative…

Artificial Intelligence · Computer Science 2026-02-06 Chang Yang, Chuang Zhou, Yilin Xiao, Su Dong, Luyao Zhuang, Yujing Zhang, Zhu Wang, Zijin Hong, Zheng Yuan, Zhishang Xiang, Shengyuan Chen, Huachi Zhou, Qinggang Zhang, Ninghao Liu, Jinsong Su, Xinrun Wang, Yi Chang, Xiao Huang

Shared Path: Unraveling Memorization in Multilingual LLMs through Language Similarities

We present the first comprehensive study of Memorization in Multilingual Large Language Models (MLLMs), analyzing 95 languages using models across diverse model scales, architectures, and memorization definitions. As MLLMs are increasingly…

Computation and Language · Computer Science 2026-01-08 Xiaoyu Luo, Yiyi Chen, Johannes Bjerva, Qiongxiu Li

Developer-LLM Conversations: An Empirical Study of Interactions and Generated Code Quality

Large Language Models (LLMs) are becoming integral to modern software development workflows, assisting developers with code generation, API explanation, and iterative problem-solving through natural language conversations. Despite…

Software Engineering · Computer Science 2025-09-15 Suzhen Zhong, Ying Zou, Bram Adams

An Enhanced Large Language Model For Cross Modal Query Understanding System Using DL-KeyBERT Based CAZSSCL-MPGPT

Large Language Models (LLMs) are advanced deep-learning models designed to understand and generate human language. They work together with models that process data like images, enabling cross-modal understanding. However, existing…

Computer Vision and Pattern Recognition · Computer Science 2025-02-25 Shreya Singh

Language Modeling through Long Term Memory Network

Recurrent Neural Networks (RNN), Long Short-Term Memory Networks (LSTM), and Memory Networks which contain memory are popularly used to learn patterns in sequential data. Sequential data has long sequences that hold relationships. RNN can…

Computation and Language · Computer Science 2019-04-22 Anupiya Nugaliyadde, Kok Wai Wong, Ferdous Sohel, Hong Xie

Task Memory Engine (TME): Enhancing State Awareness for Multi-Step LLM Agent Tasks

Large Language Models (LLMs) are increasingly used as autonomous agents for multi-step tasks. However, most existing frameworks fail to maintain a structured understanding of the task state, often relying on linear prompt concatenation or…

Artificial Intelligence · Computer Science 2025-08-26 Ye Ye