Related papers: Towards Exception Safety Code Generation with Inte…

Seeker: Towards Exception Safety Code Generation with Intermediate Language Agents Framework

In real world software development, improper or missing exception handling can severely impact the robustness and reliability of code. Exception handling mechanisms require developers to detect, capture, and manage exceptions according to…

Computation and Language · Computer Science 2024-12-17 Xuanming Zhang, Yuxuan Chen, Yiming Zheng, Zhexin Zhang, Yuan Yuan, Minlie Huang

RA-Gen: A Controllable Code Generation Framework Using ReAct for Multi-Agent Task Execution

Code generation models based on large language models (LLMs) have gained wide adoption, but challenges remain in ensuring safety, accuracy, and controllability, especially for complex tasks. Existing methods often lack dynamic integration…

Software Engineering · Computer Science 2025-10-13 Aofan Liu, Haoxuan Li, Bin Wang, Ao Yang, Hui Li

RESCUE: Retrieval Augmented Secure Code Generation

Despite recent advances, Large Language Models (LLMs) still generate vulnerable code. Retrieval-Augmented Generation (RAG) has the potential to enhance LLMs for secure code generation by incorporating external security knowledge. However,…

Cryptography and Security · Computer Science 2026-03-17 Jiahao Shi, Tianyi Zhang

Parser-Free Querying of Security Logs

Security analysts routinely query system logs to detect threats and investigate incidents, but each log source uses its own semi-structured format: logs are cheap to produce, but expensive to use. The standard approach, building per-source…

Cryptography and Security · Computer Science 2026-05-22 Evan Luo, Julien Piet, David Wagner

CODESIM: Multi-Agent Code Generation and Problem Solving through Simulation-Driven Planning and Debugging

Large Language Models (LLMs) have made significant strides in code generation and problem solving. Current approaches employ external tool-based iterative debuggers that use compiler or other tool-based runtime feedback to refine coarse…

Computation and Language · Computer Science 2026-04-28 Md. Ashraful Islam, Mohammed Eunus Ali, Md Rizwan Parvez

CIBER: A Comprehensive Benchmark for Security Evaluation of Code Interpreter Agents

LLM-based code interpreter agents are increasingly deployed in critical workflows, yet their robustness against risks introduced by their code execution capabilities remains underexplored. Existing benchmarks are limited to static datasets…

Cryptography and Security · Computer Science 2026-02-24 Lei Ba, Qinbin Li, Songze Li

Toward Autonomous SOC Operations: End-to-End LLM Framework for Threat Detection, Query Generation, and Resolution in Security Operations

Security Operations Centers (SOCs) face mounting operational challenges. These challenges come from increasing threat volumes, heterogeneous SIEM platforms, and time-consuming manual triage workflows. We present an end-to-end threat…

Cryptography and Security · Computer Science 2026-05-01 Md Hasan Saju, Akramul Azim

From Text to Pixel: Advancing Long-Context Understanding in MLLMs

The rapid progress in Multimodal Large Language Models (MLLMs) has significantly advanced their ability to process and understand complex visual and textual information. However, the integration of multiple images and extensive textual…

Computer Vision and Pattern Recognition · Computer Science 2024-08-27 Yujie Lu, Xiujun Li, Tsu-Jui Fu, Miguel Eckstein, William Yang Wang

RealSec-bench: A Benchmark for Evaluating Secure Code Generation in Real-World Repositories

Large Language Models (LLMs) have demonstrated remarkable capabilities in code generation, but their proficiency in producing secure code remains a critical, under-explored area. Existing benchmarks often fall short by relying on synthetic…

Cryptography and Security · Computer Science 2026-02-02 Yanlin Wang, Ziyao Zhang, Chong Wang, Xinyi Xu, Mingwei Liu, Yong Wang, Jiachi Chen, Zibin Zheng

Extracting Events Like Code: A Multi-Agent Programming Framework for Zero-Shot Event Extraction

Zero-shot event extraction (ZSEE) remains a significant challenge for large language models (LLMs) due to the need for complex reasoning and domain-specific understanding. Direct prompting often yields incomplete or structurally invalid…

Computation and Language · Computer Science 2025-11-18 Quanjiang Guo, Sijie Wang, Jinchuan Zhang, Ben Zhang, Zhao Kang, Ling Tian, Ke Yan

Executing as You Generate: Hiding Execution Latency in LLM Code Generation

Current LLM-based coding agents follow a serial execution paradigm: the model first generates the complete code, then invokes an interpreter to execute it. This sequential workflow leaves the executor idle during generation and the…

Programming Languages · Computer Science 2026-04-02 Zhensu Sun, Zhihao Lin, Zhi Chen, Chengran Yang, Mingyi Zhou, Li Li, David Lo

AgentCoder: Multi-Agent-based Code Generation with Iterative Testing and Optimisation

The advancement of natural language processing (NLP) has been significantly boosted by the development of transformer-based large language models (LLMs). These models have revolutionized NLP tasks, particularly in code generation, aiding…

Computation and Language · Computer Science 2024-05-27 Dong Huang, Jie M. Zhang, Michael Luck, Qingwen Bu, Yuhao Qing, Heming Cui

Give LLMs a Security Course: Securing Retrieval-Augmented Code Generation via Knowledge Injection

Retrieval-Augmented Code Generation (RACG) leverages external knowledge to enhance Large Language Models (LLMs) in code synthesis, improving the functional correctness of the generated code. However, existing RACG systems largely overlook…

Cryptography and Security · Computer Science 2025-04-24 Bo Lin, Shangwen Wang, Yihao Qin, Liqian Chen, Xiaoguang Mao

Beyond Language Barriers: Multi-Agent Coordination for Multi-Language Code Generation

Producing high-quality code across multiple programming languages is increasingly important as today's software systems are built on heterogeneous stacks. Large language models (LLMs) have advanced the state of automated programming, yet…

Software Engineering · Computer Science 2025-09-25 Micheline Bénédicte Moumoula, Serge Lionel Nikiema, Albérick Euraste Djire, Abdoul Kader Kabore, Jacques Klein, Tegawendé F. Bissyande

InfoSeeker: A Scalable Hierarchical Parallel Agent Framework for Web Information Seeking

Recent agentic search systems have made substantial progress by emphasising deep, multi-step reasoning. However, this focus often overlooks the challenges of wide-scale information synthesis, where agents must aggregate large volumes of…

Artificial Intelligence · Computer Science 2026-04-06 Ka Yiu Lee, Yuxuan Huang, Zhiyuan He, Huichi Zhou, Weilin Luo, Kun Shao, Meng Fang, Jun Wang

Code Researcher: Deep Research Agent for Large Systems Code and Commit History

Large Language Model (LLM)-based coding agents have shown promising results on coding benchmarks, but their effectiveness on systems code remains underexplored. Due to the size and complexities of systems code, making changes to a systems…

Software Engineering · Computer Science 2026-05-21 Ramneet Singh, Sathvik Joel, Abhav Mehrotra, Nalin Wadhwa, Ramakrishna B Bairi, Aditya Kanade, Nagarajan Natarajan

An Empirical Evaluation of LLM-Based Approaches for Code Vulnerability Detection: RAG, SFT, and Dual-Agent Systems

The rapid advancement of Large Language Models (LLMs) presents new opportunities for automated software vulnerability detection, a crucial task in securing modern codebases. This paper presents a comparative study on the effectiveness of…

Software Engineering · Computer Science 2026-01-05 Md Hasan Saju, Maher Muhtadi, Akramul Azim

CodeHacker: Automated Test Case Generation for Detecting Vulnerabilities in Competitive Programming Solutions

The evaluation of Large Language Models (LLMs) for code generation relies heavily on the quality and robustness of test cases. However, existing benchmarks often lack coverage for subtle corner cases, allowing incorrect solutions to pass.…

Software Engineering · Computer Science 2026-02-25 Jingwei Shi, Xinxiang Yin, Jing Huang, Jinman Zhao, Shengyu Tao

Guided Code Generation with LLMs: A Multi-Agent Framework for Complex Code Tasks

Large Language Models (LLMs) have shown remarkable capabilities in code generation tasks, yet they face significant limitations in handling complex, long-context programming challenges and demonstrating complex compositional reasoning…

Artificial Intelligence · Computer Science 2025-01-14 Amr Almorsi, Mohanned Ahmed, Walid Gomaa

ADSeeker: A Knowledge-Grounded Reasoning Framework for Industry Anomaly Detection and Reasoning

Automatic vision inspection holds significant importance in industry inspection. While multimodal large language models (MLLMs) exhibit strong language understanding capabilities and hold promise for this task, their performance remains…

Information Retrieval · Computer Science 2026-04-06 Kai Zhang, Zekai Zhang, Xihe Sun, Anpeng Wang, Jingmeng Nie, Qinghui Chen, Han Hao, Jianyuan Guo, Jinglin Zhang