Computer Science

GRASP: Plan-Guided Graph Retrieval with Adaptive Fusion and Reranking on Semi-Structured Knowledge Bases

Semi-structured knowledge bases (SKBs) embed textual documents in a typed graph of entities and relations, and underpin applications such as product search, academic paper search, and precision-medicine inquiries. Existing hybrid retrieval…

Information Retrieval · Computer Science 2026-05-29 Yicheng Tao, Yiqun Wang, Xiangchen Song, Xin Luo, Kai Liu, Jie Liu

Automating Low-Risk Code Review at Meta: RADAR, Risk Calibration, and Review Efficiency

AI-assisted coding tools have altered software production. At Meta, significant lines of code per human-landed diff grew by 105.9% year over year and per-developer diff volume rose 51%, with agentic AI responsible for over 80% of that…

Software Engineering · Computer Science 2026-05-29 Chris Adams, Arjun Singh Banga, Parveen Bansal, Souvik Bhattacharya, Rujin Cao, Pedro Canahuati, Nate Cook, Brian Ellis, Prabhakar Goyal, Gurinder Grewal, Tianyu He, Matt Labunka, Alex Manners, David Molnar, Ging Cee Ng, Vishal Parekh, Jiefu Pei, Frederic Sagnes, James Saindon, Will Shackleton, Sid Sidhu, Gursharan Singh, Karthik Chengayan Sridhar, Matt Steiner, Pratibha Udmalpet, Sean Xia, Stacey Yan, Audris Mockus, Peter Rigby, Nachiappan Nagappan

LexPath: A domain-oriented multi-path framework for legal article retrieval

Legal article retrieval is critical for building traceable and reliable legal AI systems, where conclusions must be grounded in specific legal articles. However, existing open-domain retrieval methods rely heavily on surface-level lexical…

Information Retrieval · Computer Science 2026-05-29 Weixuan Liu, Qingfeng Zhuge, Xuyang Chen

No More K-means: Single-Stage Sparse Coding for Efficient Multi-Vector Retrieval

Multi-vector retrieval (MVR) models, exemplified by ColBERT, have established new benchmarks in retrieval accuracy by preserving fine-grained token-level interactions. However, this granularity imposes prohibitive storage and retrieval…

Information Retrieval · Computer Science 2026-05-29 Lixuan Guo, Yifei Wang, Tiansheng Wen, Aosong Feng, Stefanie Jegelka, Chenyu You

EvoRepair: Enhancing Vulnerability Repair Agents Through Experience-Based Self-Evolution

Large Language Models (LLMs) have shown promise for automated vulnerability repair (AVR), but they still face several limitations, including the lack of intra-vulnerability experience accumulation and the lack of cross-vulnerability…

Software Engineering · Computer Science 2026-05-29 Haichuan Hu, Guoqing Xie, Quanjun Zhang, Jiawei Liu, Shengcheng Yu, Chunrong Fang, Zhenyu Chen, Liang Xiao

Projectional Decoding: Towards Semantic-Aware LLM Generation

Large language models (LLMs) are increasingly used to generate software artifacts across many software engineering (SE) tasks, yet ensuring the semantic validity of these artifacts remains a fundamental challenge. Existing constrained…

Software Engineering · Computer Science 2026-05-29 Boqi Chen, José Antonio Hernández López, Aren A. Babikian

REPOT: Recoverable Program-of-Thought via Checkpoint Repair

One-shot Program-of-Thought (PoT) emits a Python program that prints a primitive-action plan; a single invalid action silently invalidates the trajectory. We introduce RePoT (Recoverable PoT): a deterministic verified replay that walks the…

Software Engineering · Computer Science 2026-05-29 Parsa Mazaheri

Uncertainty Quantification for Multimodal Retrieval Augmented Generation

Retrieval Augmented Generation (RAG) improves the question answering capabilities of Large Language Models (LLMs) by incorporating external knowledge and has recently been extended to multimodal settings through Vision-Language Models…

Information Retrieval · Computer Science 2026-05-29 Simon Binz, Heydar Soudani, Faegheh Hasibi

Agora: Toward Autonomous Bug Detection in Production-Level Consensus Protocols with LLM Agents

Consensus protocols form the backbone of distributed systems and blockchains, where implementation bugs can cause data corruption and financial losses. While LLM-based approaches show promise in code analysis, they struggle with deep…

Software Engineering · Computer Science 2026-05-29 Xiang Liu, Sa Song, Zhaowei Zhang, Huiying Lan, Jason Zeng, Ming Wu, Michael Heinrich, Yong Sun, Ceyao Zhang

TagDebt: A Bot to Support Technical Debt Management

Context: Technical debt (TD) is a widely studied metaphor that helps to explain how sub-optimal decisions can harm software maintainability over time. Although incurring TD is not intrinsically bad, tracking and managing TD are crucial…

Software Engineering · Computer Science 2026-05-29 João Paulo Biazotto, Daniel Feitosa, Paris Avgeriou, Elisa Yumi Nakagawa

Inferring Code Correctness from Specification

Large language models (LLMs) have become integral to modern software development, enabling automated code generation at scale. However, validating the correctness of LLM-generated code remains a critical and largely unsolved challenge.…

Software Engineering · Computer Science 2026-05-29 Tambon Florian, Papadakis Mike

Rec-Distill: An Industrial Distillation Pipeline for Large-Scale Recommendation Models

Large recommendation models have demonstrated substantial potential gains under scaling laws, yet these gains are difficult to realize in industrial recommendation systems because real-world deployment requires lightweight models with…

Information Retrieval · Computer Science 2026-05-29 Haoran Ding, Wenlin Zhao, Yuchen Jiang, Juren Li, Jie Zhu, Xinchun Li, Yishujie Zhao, Yi Zhang, Ao Qiao, Jianhui Dong, Cheng Chen, Ziyan Gong, Deping Xie, Peng Xu, Zikai Wang, Yuwei Wang, Huizhi Yang, Zhe Chen, Yuchao Zheng

GUITestScape: Towards Open-set Evaluation on Exploratory GUI Testing

Exploratory GUI testing is a particularly demanding setting for MLLM agents: without predefined test scripts, an agent must autonomously navigate an application and discover defects through its own interaction. However, current evaluation…

Software Engineering · Computer Science 2026-05-29 Xiaoyi Chen, Yifei Gao, Yang Xu, Xingxing Song, Yi Zhang, Jitao Sang

FLASH-MAXSIM: IO-Aware Fused Kernels for Late-Interaction Scoring

Late-interaction retrieval (ColBERT, ColPali) scores a query against a document with the MaxSim operator: for every query token, the maximum similarity over the document tokens, summed over query tokens. The standard implementation…

Information Retrieval · Computer Science 2026-05-29 Roi Pony, Adi Raz Goldfarb, Idan Friedman, Daniel Ezer, Udi Barzelay
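
The MaxSim operator described in the FLASH-MAXSIM abstract can be illustrated with a minimal reference sketch. This is a naive, unfused version for illustration only, not the paper's kernel; the function names `dot` and `maxsim`, the toy vectors, and the choice of dot product as the similarity are assumptions, since the abstract does not specify them.

```python
from typing import Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Dot product, used here as the (assumed) token-level similarity."""
    return sum(x * y for x, y in zip(a, b))


def maxsim(query_tokens: Sequence[Vector], doc_tokens: Sequence[Vector]) -> float:
    """For every query token, take the maximum similarity over the
    document tokens, then sum those maxima over the query tokens."""
    if not doc_tokens:
        return 0.0  # assumption: an empty document scores zero
    return sum(max(dot(q, d) for d in doc_tokens) for q in query_tokens)


# Toy example: 2 query tokens, 3 document tokens, 2-dimensional embeddings.
query = [[1.0, 0.0], [0.0, 1.0]]
doc = [[0.5, 0.5], [1.0, 0.0], [0.0, 0.25]]
score = maxsim(query, doc)
print(score)  # 1.0 (best match for token 1) + 0.5 (best match for token 2) = 1.5
```

The nested loop makes the cost explicit: every query token is compared against every document token before the per-token maximum is taken.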

CODEFUSE-DEBENCH: An Empirical Study on Readability, Recompilability, and Functionality

Binary decompilation aims to recover binaries into high-level source code, but existing evaluations mainly rely on syntactic similarity or single-axis readability metrics, which fail to capture practical reusability. We propose a…

Software Engineering · Computer Science 2026-05-29 Puzhuo Liu, Yuhan Huang, Jianlei Chi, Peng Di, Yu Jiang

Usability Analysis of Configurator User Interfaces with Multimodal Large Language Models

Configuration is a key technology for tailoring complex software systems, services, and products. A successful application of configurators not only depends on technical correctness, performance, and domain modeling but also on their…

Software Engineering · Computer Science 2026-05-29 Sebastian Lubos, Alexander Felfernig, Damian Garber, Adnan Kraljić, Tarik Kraljić, Viet-Man Le, Thi Ngoc Trang Tran, Gerhard Leitner, Julian Schwazer, Doris Suppan, Reinhard Willfort, Ivan Dukic, Jeremias Fuchs, Manuel Henrich

How Coding Agents Fail Their Users: A Large-Scale Analysis of Developer-Agent Misalignment in 20,574 Real-World Sessions

AI coding agents increasingly act directly within software environments, yet existing analyses of their failures rely on benchmark trajectories that miss how developers actually experience misalignment. We present an observational study of…

Software Engineering · Computer Science 2026-05-29 Ningzhi Tang, Chaoran Chen, Gelei Xu, Yiyu Shi, Yu Huang, Collin McMillan, Tao Dong, Toby Jia-Jun Li

Offloading Score: Measuring AI Reliance Through Counterfactual Workflows

AI tools are increasingly integrated into real-world workflows. However, existing measures of reliance on these tools focus on AI output adoption or on self-reported indicators, rather than how task effort is distributed between users and…

Software Engineering · Computer Science 2026-05-29 Vishakh Padmakumar, Lujain Ibrahim, Zora Zhiruo Wang, Jennifer Wang, Q. Vera Liao, Diyi Yang

Latent Terms: Dense Retrievers Contain Trivially Extractable BM25-ready Zipfian Vocabularies

We propose Latent Terms, a method revealing that models trained for dense retrieval, whether single- or multi-vector, learn representations that can trivially be decomposed into retrieval-ready sparse features. When trained on frozen…

Information Retrieval · Computer Science 2026-05-29 Benjamin Clavié, Sean Lee, Aamir Shakir, Makoto P. Kato

On the Road to Personalized Code Intelligence: Portraiting and Assisting Developers Based on Their In-IDE Behaviors

With the advent of large language models, research in automated software engineering has increasingly focused on leveraging these models to achieve a deeper semantic understanding of code or to engineer sophisticated agent-based processes.…

Software Engineering · Computer Science 2026-05-29 Yuhong Liu, Yunhe Su, Zhipeng Peng, Zhiwen Luo, Lin Shi, Zhi Jin, Li Zhang