Boxuan Zhang — Scifaro

Approximating full conformal prediction: distribution free guarantees via the tournament correction

Conformal prediction is a framework for providing prediction intervals with distribution-free validity, guaranteeing predictive coverage for data drawn from any distribution. Its two main variants are full conformal prediction and split…

Methodology · Statistics 2026-05-29 Aabesh Bhattacharyya , Boxuan Zhang , Rina Foygel Barber

TRACES: Proactive Safety Auditing for Multi-Turn LLM Agents via Trajectory-State Modeling

LLM agents increasingly operate through multi-turn tool use and environment interaction, where safety risks often emerge from intermediate steps long before they surface in the final outcome. Reactive auditing is therefore insufficient:…

Computation and Language · Computer Science 2026-05-28 Jiaqian Li , Yanshu Li , Boxuan Zhang , Ruixiang Tang , Kuan-Hao Huang

MemEye: A Visual-Centric Evaluation Framework for Multimodal Agent Memory

Long-term agent memory is increasingly multimodal, yet existing evaluations rarely test whether agents preserve the visual evidence needed for later reasoning. In prior work, many visually grounded questions can be answered using only…

Computer Vision and Pattern Recognition · Computer Science 2026-05-15 Minghao Guo , Qingyue Jiao , Zeru Shi , Yihao Quan , Boxuan Zhang , Danrui Li , Liwei Che , Wujiang Xu , Shilong Liu , Zirui Liu , Mubbasir Kapadia , Vladimir Pavlovic , Jiang Liu , Mengdi Wang , Yiyu Shi , Dimitris N. Metaxas , Ruixiang Tang

AgentForesight: Online Auditing for Early Failure Prediction in Multi-Agent Systems

LLM-based multi-agent systems are increasingly deployed on long-horizon tasks, but a single decisive error is often accepted by downstream agents and cascades into trajectory-level failure. Existing work frames this as post-hoc…

Computation and Language · Computer Science 2026-05-15 Boxuan Zhang , Jianing Zhu , Zeru Shi , Dongfang Liu , Ruixiang Tang

Micro-Defects Expose Macro-Fakes: Detecting AI-Generated Images via Local Distributional Shifts

Recent generative models can produce images that appear highly realistic, raising challenges in distinguishing real and AI-generated images. Yet existing detectors based on pre-trained feature extractors tend to over-rely on global…

Computer Vision and Pattern Recognition · Computer Science 2026-05-12 Boxuan Zhang , Jianing Zhu , Qifan Wang , Jiang Liu , Ruixiang Tang

OptiVerse: A Comprehensive Benchmark towards Optimization Problem Solving

While Large Language Models (LLMs) demonstrate remarkable reasoning, complex optimization tasks remain challenging, requiring domain knowledge and robust implementation. However, existing benchmarks focus narrowly on Mathematical…

Computation and Language · Computer Science 2026-04-24 Xinyu Zhang , Boxuan Zhang , Yuchen Wan , Lingling Zhang , YiXing Yao , Bifan Wei , Yaqiang Wu , Jun Liu

Dual-Cluster Memory Agent: Resolving Multi-Paradigm Ambiguity in Optimization Problem Solving

Large Language Models (LLMs) often struggle with structural ambiguity in optimization problems, where a single problem admits multiple related but conflicting modeling paradigms, hindering effective solution generation. To address this, we…

Computation and Language · Computer Science 2026-04-23 Xinyu Zhang , Yuchen Wan , Boxuan Zhang , Zesheng Yang , Lingling Zhang , Bifan Wei , Jun Liu

ClawGUI: A Unified Framework for Training, Evaluating, and Deploying GUI Agents

GUI agents drive applications through their visual interfaces instead of programmatic APIs, interacting with arbitrary software via taps, swipes, and keystrokes, reaching a long tail of applications that CLI-based agents cannot. Yet…

Machine Learning · Computer Science 2026-04-14 Fei Tang , Zhiqiong Lu , Boxuan Zhang , Weiming Lu , Jun Xiao , Yueting Zhuang , Yongliang Shen

Dive into the Agent Matrix: A Realistic Evaluation of Self-Replication Risk in LLM Agents

The prevalent deployment of Large Language Model agents such as OpenClaw unlocks potential in real-world applications, while amplifying safety concerns. Among these concerns, the self-replication risk of LLM agents driven by objective…

Artificial Intelligence · Computer Science 2026-04-02 Boxuan Zhang , Yi Yu , Jiaxuan Guo , Jing Shao

Shifting Uncertainty to Critical Moments: Towards Reliable Uncertainty Quantification for VLA Model

Vision-Language-Action (VLA) models enable general-purpose robotic policies by mapping visual observations and language instructions to low-level actions, but they often lack reliable introspection. A common practice is to compute a…

Robotics · Computer Science 2026-03-20 Yanchuan Tang , Taowen Wang , Yuefei Chen , Boxuan Zhang , Qiang Guan , Ruixiang Tang

Differentiable Geometric Indexing for End-to-End Generative Retrieval

Generative Retrieval (GR) has emerged as a promising paradigm to unify indexing and search within a single probabilistic framework. However, existing approaches suffer from two intrinsic conflicts: (1) an Optimization Blockage, where the…

Information Retrieval · Computer Science 2026-03-12 Xujing Wang , Yufeng Chen , Boxuan Zhang , Jie Zhao , Chao Wei , Cai Xu , Ziyu Guan , Wei Zhao , Weiru Zhang , Xiaoyi Zeng

Frontier AI Risk Management Framework in Practice: A Risk Analysis Technical Report v1.5

To understand and identify the unprecedented risks posed by rapidly advancing artificial intelligence (AI) models, Frontier AI Risk Management Framework in Practice presents a comprehensive assessment of their frontier risks. As Large…

Artificial Intelligence · Computer Science 2026-02-17 Dongrui Liu , Yi Yu , Jie Zhang , Guanxu Chen , Qihao Lin , Hanxi Zhu , Lige Huang , Yijin Zhou , Peng Wang , Shuai Shao , Boxuan Zhang , Zicheng Liu , Jingwei Sun , Yu Li , Yuejin Xie , Jiaxuan Guo , Jia Xu , Chaochao Lu , Bowen Zhou , Xia Hu , Jing Shao

Data Augmentation for High-Fidelity Generation of CAR-T/NK Immunological Synapse Images

Chimeric antigen receptor (CAR)-T and NK cell immunotherapies have transformed cancer treatment, and recent studies suggest that the quality of the CAR-T/NK cell immunological synapse (IS) may serve as a functional biomarker for predicting…

Computer Vision and Pattern Recognition · Computer Science 2026-02-04 Xiang Zhang , Boxuan Zhang , Alireza Naghizadeh , Mohab Mohamed , Dongfang Liu , Ruixiang Tang , Dimitris Metaxas , Dongfang Liu

Mixture-of-World Models: Scaling Multi-Task Reinforcement Learning with Modular Latent Dynamics

A fundamental challenge in multi-task reinforcement learning (MTRL) is achieving sample efficiency in visual domains where tasks exhibit substantial heterogeneity in both observations and dynamics. Model-based reinforcement learning offers…

Machine Learning · Computer Science 2026-02-03 Boxuan Zhang , Weipu Zhang , Zhaohan Feng , Wei Xiao , Jian Sun , Jie Chen , Gang Wang

Ternary Spiking Neural Networks Enhanced by Complemented Neurons and Membrane Potential Aggregation

Spiking Neural Networks (SNNs) are promising energy-efficient models and powerful frameworks for modeling neuron dynamics. However, existing binary spiking neurons exhibit limited biological plausibility and low information capacity.…

Neural and Evolutionary Computing · Computer Science 2026-01-23 Boxuan Zhang , Jiaxin Wang , Zhen Xu , Kuan Tao

Temporal Regularization Training: Unleashing the Potential of Spiking Neural Networks

Spiking Neural Networks (SNNs) have received widespread attention due to their event-driven and low-power characteristics, making them particularly effective for processing neuromorphic data. Recent studies have shown that directly trained…

Neural and Evolutionary Computing · Computer Science 2026-01-13 Boxuan Zhang , Zhen Xu , Kuan Tao

What Shapes a Creative Machine Mind? Comprehensively Benchmarking Creativity in Foundation Models

The meteoric rise of foundation models (FMs) has expanded their capabilities far beyond conventional tasks. Creativity, long regarded as a hallmark of human intelligence and a driver of innovation, is now increasingly recognized as a…

Artificial Intelligence · Computer Science 2025-10-07 Zicong He , Boxuan Zhang , Weihao Liu , Ruixiang Tang , Lu Cheng

DyMoDreamer: World Modeling with Dynamic Modulation

A critical bottleneck in deep reinforcement learning (DRL) is sample inefficiency, as training high-performance agents often demands extensive environmental interactions. Model-based reinforcement learning (MBRL) mitigates this by building…

Machine Learning · Computer Science 2025-09-30 Boxuan Zhang , Runqing Wang , Wei Xiao , Weipu Zhang , Jian Sun , Gao Huang , Jie Chen , Gang Wang

S1-MatAgent: A planner driven multi-agent system for material discovery

The discovery of high-performance materials is crucial for technological advancement. Inverse design using multi-agent systems (MAS) shows great potential for new material discovery. However, current MAS for materials research rely on…

Materials Science · Physics 2025-09-19 Xinrui Wang , Chengbo Li , Boxuan Zhang , Jiahui Shi , Nian Ran , Linjing Li , Jianjun Liu , Dajun Zeng

Frontier AI Risk Management Framework in Practice: A Risk Analysis Technical Report

To understand and identify the unprecedented risks posed by rapidly advancing artificial intelligence (AI) models, this report presents a comprehensive assessment of their frontier risks. Drawing on the E-T-C analysis (deployment…

Artificial Intelligence · Computer Science 2025-07-29 Shanghai AI Lab: Xiaoyang Chen , Yunhao Chen , Zeren Chen , Zhiyun Chen , Hanyun Cui , Yawen Duan , Jiaxuan Guo , Qi Guo , Xuhao Hu , Hong Huang , Lige Huang , Chunxiao Li , Juncheng Li , Qihao Lin , Dongrui Liu , Xinmin Liu , Zicheng Liu , Chaochao Lu , Xiaoya Lu , Jingjing Qu , Qibing Ren , Jing Shao , Jingwei Shi , Jingwei Sun , Peng Wang , Weibing Wang , Jia Xu , Lewen Yan , Xiao Yu , Yi Yu , Boxuan Zhang , Jie Zhang , Weichen Zhang , Zhijie Zheng , Tianyi Zhou , Bowen Zhou