Related papers: Executable Agentic Memory for GUI Agent

KG-RAG: Enhancing GUI Agent Decision-Making via Knowledge Graph-Driven Retrieval-Augmented Generation

Despite recent progress, Graphical User Interface (GUI) agents powered by Large Language Models (LLMs) struggle with complex mobile tasks due to limited app-specific knowledge. While UI Transition Graphs (UTGs) offer structured navigation…

Multiagent Systems · Computer Science 2025-09-03 Ziyi Guan , Jason Chun Lok Li , Zhijian Hou , Pingping Zhang , Donglai Xu , Yuzhi Zhao , Mengyang Wu , Jinpeng Chen , Thanh-Toan Nguyen , Pengfei Xian , Wenao Ma , Shengchao Qin , Graziano Chesi , Ngai Wong

EchoTrail-GUI: Building Actionable Memory for GUI Agents via Critic-Guided Self-Exploration

Contemporary GUI agents, while increasingly capable due to advances in Large Vision-Language Models (VLMs), often operate with a critical limitation: they treat each task in isolation, lacking a mechanism to systematically learn from past…

Artificial Intelligence · Computer Science 2026-04-13 Runze Li , Yuwen Zhai , Bo Xu , LiWu Xu , Nian Shi , Wei Zhang , Ran Lin , Liang Wang

AppAgentX: Evolving GUI Agents as Proficient Smartphone Users

Recent advancements in Large Language Models (LLMs) have led to the development of intelligent LLM-based agents capable of interacting with graphical user interfaces (GUIs). These agents demonstrate strong reasoning and adaptability,…

Artificial Intelligence · Computer Science 2025-04-16 Wenjia Jiang , Yangyang Zhuang , Chenxi Song , Xu Yang , Joey Tianyi Zhou , Chi Zhang

GAM: Hierarchical Graph-based Agentic Memory for LLM Agents

To sustain coherent long-term interactions, Large Language Model (LLM) agents must navigate the tension between acquiring new information and retaining prior knowledge. Current unified stream-based memory systems facilitate context updates…

Artificial Intelligence · Computer Science 2026-04-15 Zhaofen Wu , Hanrong Zhang , Fulin Lin , Wujiang Xu , Xinran Xu , Yankai Chen , Henry Peng Zou , Shaowen Chen , Weizhi Zhang , Xue Liu , Philip S. Yu , Hongwei Wang

ActionEngine: From Reactive to Programmatic GUI Agents via State Machine Memory

Existing Graphical User Interface (GUI) agents operate through step-by-step calls to vision language models--taking a screenshot, reasoning about the next action, executing it, then repeating on the new page--resulting in high costs and…

Artificial Intelligence · Computer Science 2026-02-25 Hongbin Zhong , Fazle Faisal , Luis França , Tanakorn Leesatapornwongsa , Adriana Szekeres , Kexin Rong , Suman Nath

EchoGuard: An Agentic Framework with Knowledge-Graph Memory for Detecting Manipulative Communication in Longitudinal Dialogue

Manipulative communication, such as gaslighting, guilt-tripping, and emotional coercion, is often difficult for individuals to recognize. Existing agentic AI systems lack the structured, longitudinal memory to track these subtle,…

Artificial Intelligence · Computer Science 2026-03-06 Ratna Kandala , Niva Manchanda , Akshata Kishore Moharir , Ananth Kandala

AGENTiGraph: An Interactive Knowledge Graph Platform for LLM-based Chatbots Utilizing Private Data

Large Language Models (LLMs) have demonstrated capabilities across various applications but face challenges such as hallucination, limited reasoning abilities, and factual inconsistencies, especially when tackling complex, domain-specific…

Artificial Intelligence · Computer Science 2024-10-16 Xinjie Zhao , Moritz Blum , Rui Yang , Boming Yang , Luis Márquez Carpintero , Mónica Pina-Navarro , Tony Wang , Xin Li , Huitao Li , Yanran Fu , Rongrong Wang , Juntao Zhang , Irene Li

GASim: A Graph-Accelerated Hybrid Framework for Social Simulation

Large-scale social simulators are essential for studying complex social patterns. Prior work explores hybrid methods to scale up simulations, combining large language models (LLM)-based agents with numerical agent-based models (ABM).…

Artificial Intelligence · Computer Science 2026-05-11 Xuan Zhou , Yanhui Sun , Hantao Yao , Allen He , Yongdong Zhang , Wu Liu

Agent-SAMA: State-Aware Mobile Assistant

Mobile Graphical User Interface (GUI) agents aim to autonomously complete tasks within or across apps based on user instructions. While recent Multimodal Large Language Models (MLLMs) enable these agents to interpret UI screens and perform…

Artificial Intelligence · Computer Science 2025-11-20 Linqiang Guo , Wei Liu , Yi Wen Heng , Tse-Hsun Chen , Yang Wang

EXG: Self-Evolving Agents with Experience Graphs

Large language model (LLM)-based agents have demonstrated strong capabilities in complex reasoning and problem solving through multi-step interactions, yet most deployed agents remain behaviorally static, with knowledge acquired during…

Artificial Intelligence · Computer Science 2026-05-19 Yuxin Jin , Siyuan Zhang , Hanchen Wang , Lu Qin , Ying Zhang , Wenjie Zhang

Planning Agents on an Ego-Trip: Leveraging Hybrid Ego-Graph Ensembles for Improved Tool Retrieval in Enterprise Task Planning

Effective tool pre-selection via retrieval is essential for AI agents to select from a vast array of tools when identifying and planning actions in the context of complex user queries. Despite its central role in planning, this aspect…

Artificial Intelligence · Computer Science 2025-11-14 Sahil Bansal , Sai Shruthi Sistla , Aarti Arikatala , Sebastian Schreiber

KG-Agent: An Efficient Autonomous Agent Framework for Complex Reasoning over Knowledge Graph

In this paper, we aim to improve the reasoning ability of large language models (LLMs) over knowledge graphs (KGs) to answer complex questions. Inspired by existing methods that design the interaction strategy between LLMs and KG, we…

Computation and Language · Computer Science 2024-02-20 Jinhao Jiang , Kun Zhou , Wayne Xin Zhao , Yang Song , Chen Zhu , Hengshu Zhu , Ji-Rong Wen

Embodied Task Planning via Graph-Informed Action Generation with Large Language Models

While Large Language Models (LLMs) have demonstrated strong zero-shot reasoning capabilities, their deployment as embodied agents still faces fundamental challenges in long-horizon planning. Unlike open-ended text generation, embodied…

Computation and Language · Computer Science 2026-05-19 Xiang Li , Ning Yan , Masood Mortazavi

What Makes AI Research Replicable? Executable Knowledge Graphs as Scientific Knowledge Representations

Replicating AI research is a crucial yet challenging task for large language model (LLM) agents. Existing approaches often struggle to generate executable code, primarily due to insufficient background knowledge and the limitations of…

Computation and Language · Computer Science 2026-04-21 Yujie Luo , Zhuoyun Yu , Xuehai Wang , Yuqi Zhu , Ningyu Zhang , Lanning Wei , Lun Du , Da Zheng , Huajun Chen

Graph-based Agent Memory: Taxonomy, Techniques, and Applications

Memory emerges as the core module in the Large Language Model (LLM)-based agents for long-horizon complex tasks (e.g., multi-turn dialogue, game playing, scientific discovery), where memory can enable knowledge accumulation, iterative…

Artificial Intelligence · Computer Science 2026-02-06 Chang Yang , Chuang Zhou , Yilin Xiao , Su Dong , Luyao Zhuang , Yujing Zhang , Zhu Wang , Zijin Hong , Zheng Yuan , Zhishang Xiang , Shengyuan Chen , Huachi Zhou , Qinggang Zhang , Ninghao Liu , Jinsong Su , Xinrun Wang , Yi Chang , Xiao Huang

AndroTMem: From Interaction Trajectories to Anchored Memory in Long-Horizon GUI Agents

Long-horizon GUI agents are a key step toward real-world deployment, yet effective interaction memory under prevailing paradigms remains under-explored. Replaying full interaction sequences is redundant and amplifies noise, while summaries…

Computer Vision and Pattern Recognition · Computer Science 2026-03-20 Yibo Shi , Jungang Li , Linghao Zhang , Zihao Dongfang , Biao Wu , Sicheng Tao , Yibo Yan , Chenxi Qin , Weiting Liu , Zhixin Lin , Hanqian Li , Yu Huang , Song Dai , Yonghua Hei , Yue Ding , Xiang Li , Shikang Wang , Chengdong Xu , Jingqi Liu , Xueying Ma , Zhiwen Zheng , Xiaofei Zhang , Bincheng Wang , Nichen Yang , Jie Wu , Lihua Tian , Chen Li , Xuming Hu

PAL-UI: Planning with Active Look-back for Vision-Based GUI Agents

Graphical User Interface (GUI) agents powered by Multimodal Large Language Models (MLLMs) promise human-like interaction with software applications, yet long-horizon tasks remain challenging due to memory limitations. Existing approaches…

Computer Vision and Pattern Recognition · Computer Science 2025-10-07 Zikang Liu , Junyi Li , Wayne Xin Zhao , Dawei Gao , Yaliang Li , Ji-rong Wen

SE-GA: Memory-Augmented Self-Evolution for GUI Agents

Autonomous Graphical User Interface (GUI) agents often struggle with multi-step tasks due to constrained context windows and static policies that fail to adapt to dynamic environments. To address these limitations, this work proposes the…

Machine Learning · Computer Science 2026-05-19 Shilong Jin , Lanjun Wang , Zhuosheng Zhang

Mem-W: Latent Memory-Native GUI Agents

GUI agents are beginning to operate the web, mobile, and desktop as interactive worlds, where successful control depends on carrying forward visual, procedural, and task-level evidence beyond the fleeting present screen. Yet most agents…

Computation and Language · Computer Science 2026-05-12 Guibin Zhang , Yaohui Ling , Fanci Meng , Kun Wang , Shuicheng Yan

MAGNET: Towards Adaptive GUI Agents with Memory-Driven Knowledge Evolution

Mobile GUI agents powered by large foundation models enable autonomous task execution, but frequent updates altering UI appearance and reorganizing workflows cause agents trained on historical data to fail. Despite surface changes,…

Artificial Intelligence · Computer Science 2026-02-03 Libo Sun , Jiwen Zhang , Siyuan Wang , Zhongyu Wei