Jiaming Zhang — Scifaro

RHO: Robust Holistic OSM-Based Metric Cross-View Geo-Localization

Metric Cross-View Geo-Localization (MCVGL) aims to estimate the 3-DoF camera pose (position and heading) by matching ground and satellite images. In this work, instead of pinhole and satellite images, we study robust MCVGL using holistic…

Computer Vision and Pattern Recognition · Computer Science · 2026-05-29 · Junwei Zheng, Ruize Dai, Ruiping Liu, Zichao Zeng, Yufan Chen, Fangjinhua Wang, Kunyu Peng, Kailun Yang, Jiaming Zhang, Rainer Stiefelhagen

Beyond Chunk-Local Extraction: Cross-Chunk Graph Augmentation for GraphRAG

GraphRAG extends retrieval-augmented generation by organizing corpora as explicit knowledge graphs, enabling graph-based retrieval for complex question answering. However, existing frameworks extract entities and relations within individual…

Computation and Language · Computer Science · 2026-05-28 · Jiaming Zhang, Yibo Zhao, Jing Yu, Jianxiang Yu, Xiang Li

GlobalDentBench: A Multinational Benchmark for Evaluating LLM Clinical Reasoning in Dentistry with Expert Calibration

While large language models (LLMs) hold transformative potential for medicine, their reasoning robustness and safety in real-world clinical scenarios remain critically underexplored, particularly in dentistry. Here we introduce…

Artificial Intelligence · Computer Science · 2026-05-27 · Junjie Zhao, Jingyi Liang, Zhenyang Cai, Jiaming Zhang, Zhenwei Wen, Shuzhi Deng, Wenjing Yi, Chunfeng Luo, Hexian Zhang, Junying Chen, Tianrui Liu, Zhuhui Bai, Zixu Zhang, Pradeep Singh, Xiang Liu, Jianquan Li, Nhan L Tran, Falk Schwendicke, Zuolin Jin, Lijian Jin, Liangyi Chen, Wei-fa Yang, Benyou Wang, Junwen Wang, Shan Jiang

LFX: Towards Unified Light Field Dense Semantic Segmentation and Salient Object Detection

Light field cameras capture multi-view observations within a single exposure. However, existing studies are typically tailored to specific LF representations, leaving the field without a unified learning framework. To bridge this gap, we…

Computer Vision and Pattern Recognition · Computer Science · 2026-05-22 · Fei Teng, Lingxin Huang, Buyin Deng, Kai Luo, Boyuan Zheng, Zheng Fang, Hong Zheng, Kunyu Peng, Jiaming Zhang, Yaonan Wang, Kailun Yang

Faster or Stronger: Towards Flexible Visual Place Recognition via Weighted Aggregation and Token Pruning

Visual Place Recognition (VPR) aims to match a query image to reference images of the same place in a large-scale database. Recent state-of-the-art methods employ Vision Transformers (ViTs) as backbone foundation models to extract…

Computer Vision and Pattern Recognition · Computer Science · 2026-05-21 · Zichao Zeng, June Moh Goo, Junwei Zheng, Weijia Fan, Jiaming Zhang, Rainer Stiefelhagen, Jan Boehm

A Survey of Large Audio Language Models: Generalization, Trustworthiness, and Outlook

The foundational capabilities established by Large Language Models (LLMs) have paved the way for Multimodal Large Language Models (MLLMs), within which Large Audio Language Models (LALMs) are essential for realizing universal auditory…

Sound · Computer Science · 2026-05-21 · Kaiwen Luo, Zhenhong Zhou, Leo Wang, Liang Lin, Yang Xiao, Tianyu Shao, Yuanhe Zhang, Yuxuan Li, Miao Yu, Kailin Lyu, Jiaming Zhang, Dongrui Liu, Li Sun, Yueming Wu, Kai Li, Ting Dang, Xiaojun Jia, Rohan Kumar Das, Xinfeng Li, Siyuan Liang, Qiufeng Wang, Xingjun Ma, Jing Chen, Kun Wang, Junhao Dong, Deqing Zou, Yu Cheng, Xia Hu, Zhigang Zeng, Sen Su, Yang Liu, Yu-Gang Jiang, Philip S. Yu, Yew-Soon Ong

DarkLLM: Learning Language-Driven Adversarial Attacks with Large Language Models

While vision and multimodal foundation models underpin critical tasks from perception to complex reasoning, they remain highly vulnerable to adversarial attacks. However, traditional adversarial attacks are typically limited to single,…

Cryptography and Security · Computer Science · 2026-05-20 · Ye Sun, Xin Wang, Jiaming Zhang, Yifeng Gao, Yixu Wang, Yifan Ding, Qixian Zhang, Henghui Ding, Xingjun Ma, Yu-Gang Jiang

EgoExoMem: Cross-View Memory Reasoning over Synchronized Egocentric and Exocentric Videos

Egocentric memory is widely used in embodied intelligence, but it may be insufficient for comprehensive spatial-temporal reasoning. Inspired by human recall from both field and observer perspectives, we introduce EgoExoMem, the first…

Computer Vision and Pattern Recognition · Computer Science · 2026-05-19 · Ruiping Liu, Junwei Zheng, Yufan Chen, Di Wen, Shaofang Quan, Chengzhi Wu, Jiaming Zhang, Kailun Yang, Kunyu Peng, Rainer Stiefelhagen

TAME: Test-Time Adversarial Prompt Tuning via Mixture-of-Experts for Vision-Language Models

Large-scale pre-trained Vision-Language models (VLMs), such as CLIP, exhibit strong zero-shot generalization, yet remain highly vulnerable to imperceptible adversarial perturbations, raising serious safety concerns for open-world…

Computer Vision and Pattern Recognition · Computer Science · 2026-05-19 · Xin Wang, Yixu Wang, Jiaming Zhang, Ruofan Wang, Jiaqi Yu, Kai Chen, Jingjing Chen, Xingjun Ma, Yu-Gang Jiang

SteadyDancer: Harmonized and Coherent Human Image Animation with First-Frame Preservation

Preserving first-frame identity while ensuring precise motion control is a fundamental challenge in human image animation. The Image-to-Motion Binding process of the dominant Reference-to-Video (R2V) paradigm overlooks critical…

Computer Vision and Pattern Recognition · Computer Science · 2026-05-19 · Jiaming Zhang, Shengming Cao, Rui Li, Xiaotong Zhao, Yutao Cui, Xinglin Hou, Gangshan Wu, Haolan Chen, Yu Xu, Limin Wang, Kai Ma

SAM 2++: Tracking Anything at Any Granularity

Due to the varying granularity of target states across different tasks, most existing trackers are tailored to a single task. This specificity limits their generalization, preventing them from effectively utilizing multi-task training data…

Computer Vision and Pattern Recognition · Computer Science · 2026-05-19 · Jiaming Zhang, Cheng Liang, Yichun Yang, Chenkai Zeng, Yutao Cui, Xinwen Zhang, Xin Zhou, Kai Ma, Gangshan Wu, Limin Wang

XDomainBench: Diagnosing Reasoning Collapse in High-Dimensional Scientific Knowledge Composition

Large Language Models (LLMs) are increasingly deployed for knowledge synthesis, yet their capacity for compositional generalization in scientific knowledge remains under-characterized. Existing benchmarks primarily focus on single-turn…

Artificial Intelligence · Computer Science · 2026-05-15 · Gong Zhiren, Tiantong Wu, Jiaming Zhang, Fuyao Zhang, Che Wang, Yurong Hao, Yikun Hou, Foo Ping, Yilei Zhao, Fei Huang, Chau Yuen, Wei Yang Bryan Lim

Position: Assistive Agents Need Accessibility Alignment

Assistive agents for Blind and Visually Impaired (BVI) users require accessibility alignment as a first-class design objective. Despite rapid progress in agentic AI, most systems are designed and evaluated under assumptions of sighted…

Artificial Intelligence · Computer Science · 2026-05-14 · Jie Hu, Changyuan Yan, Yu Zheng, Ziqian Wang, Jiaming Zhang

Novel GPU Boruta algorithms for feature selection from high-dimensional data

Most feature selection algorithms, especially wrapper methods, run inefficiently on CPU-based platforms because of their high computational complexity. This inefficiency makes them unsuitable for processing large-scale datasets. To address…

Machine Learning · Computer Science · 2026-05-12 · Xurui Li, Zhiguo Gan, Jiaming Zhang, Zheng Liu, Diannan Lu

FraudBench: A Multimodal Benchmark for Detecting AI-Generated Fraudulent Refund Evidence

Artificial Intelligence (AI)-generated images have become increasingly realistic and readily adaptable to concrete real-world claims, creating new challenges for verifying visual evidence. A concrete emerging risk is AI-generated refund…

Computer Vision and Pattern Recognition · Computer Science · 2026-05-12 · Xinyu Yan, Boyang Chen, Jiaming Zhang, Tiantong Wu, Hong Xi Tae, Yichen He, Tiantong Wang, Yachun Mi, Yurong Hao, Yilei Zhao, Lei Xiao, Longtao Huang, Pengjun Xie, Wei Liu, Wei Yang Bryan Lim

Augmented Lagrangian Multiplier Network for State-wise Safety in Reinforcement Learning

Safety is a primary challenge in real-world reinforcement learning (RL). Formulating safety requirements as state-wise constraints has become a prominent paradigm. Handling state-wise constraints with the Lagrangian method requires a…

Machine Learning · Computer Science · 2026-05-04 · Jiaming Zhang, Yujie Yang, Yao Lyu, Shengbo Eben Li, Liping Zhang

For-Value: Efficient Forward-Only Data Valuation for finetuning LLMs and VLMs

Data valuation is essential for enhancing the transparency and accountability of large language models (LLMs) and vision-language models (VLMs). However, existing methods typically rely on gradient computations, making them computationally…

Computation and Language · Computer Science · 2026-04-28 · Wenlong Deng, Qi Zeng, Jiaming Zhang, Minghui Chen, Zixin Ding, Christos Thrampoulidis, Boying Gong, Xiaoxiao Li

Benign Overfitting in Adversarial Training for Vision Transformers

Despite the remarkable success of Vision Transformers (ViTs) across a wide range of vision tasks, recent studies have revealed that they remain vulnerable to adversarial examples, much like Convolutional Neural Networks (CNNs). A common…

Machine Learning · Computer Science · 2026-04-22 · Jiaming Zhang, Meng Ding, Shaopeng Fu, Jingfeng Zhang, Di Wang

ProjLens: Unveiling the Role of Projectors in Multimodal Model Safety

Multimodal Large Language Models (MLLMs) have achieved remarkable success in cross-modal understanding and generation, yet their deployment is threatened by critical safety vulnerabilities. While prior works have demonstrated the…

Cryptography and Security · Computer Science · 2026-04-22 · Kun Wang, Cheng Qian, Miao Yu, Lilan Peng, Liang Lin, Jiaming Zhang, Tianyu Zhang, Yu Cheng, Yang Wang

Safety at Scale: A Comprehensive Survey of Large Model and Agent Safety

The rapid advancement of large models, driven by their exceptional abilities in learning and generalization through large-scale pre-training, has reshaped the landscape of Artificial Intelligence (AI). These models are now foundational to a…

Cryptography and Security · Computer Science · 2026-04-15 · Xingjun Ma, Yifeng Gao, Yixu Wang, Ruofan Wang, Xin Wang, Ye Sun, Yifan Ding, Hengyuan Xu, Yunhao Chen, Yunhan Zhao, Hanxun Huang, Yige Li, Yutao Wu, Jiaming Zhang, Xiang Zheng, Yang Bai, Zuxuan Wu, Xipeng Qiu, Jingfeng Zhang, Yiming Li, Xudong Han, Haonan Li, Jun Sun, Cong Wang, Jindong Gu, Baoyuan Wu, Siheng Chen, Tianwei Zhang, Yang Liu, Mingming Gong, Tongliang Liu, Shirui Pan, Cihang Xie, Tianyu Pang, Yinpeng Dong, Ruoxi Jia, Yang Zhang, Shiqing Ma, Xiangyu Zhang, Neil Gong, Chaowei Xiao, Sarah Erfani, Tim Baldwin, Bo Li, Masashi Sugiyama, Dacheng Tao, James Bailey, Yu-Gang Jiang