Tianshu Zhang — Scifaro

Do We Really Need External Tools to Mitigate Hallucinations? SIRA: Shared-Prefix Internal Reconstruction of Attribution

Large vision-language models (LVLMs) often hallucinate when language priors dominate weak or ambiguous visual evidence. Existing contrastive decoding methods mitigate this problem by comparing predictions from the original image with those…

Computer Vision and Pattern Recognition · Computer Science 2026-05-15 Tian Qin , Junzhe Chen , Yuqing Shi , Tianshu Zhang , Qiang Ju , Lijie Wen

GLM-5V-Turbo: Toward a Native Foundation Model for Multimodal Agents

We present GLM-5V-Turbo, a step toward native foundation models for multimodal agents. As foundation models are increasingly deployed in real environments, agentic capability depends not only on language reasoning, but also on the ability…

Computer Vision and Pattern Recognition · Computer Science 2026-05-13 V Team , Wenyi Hong , Xiaotao Gu , Ziyang Pan , Zhen Yang , Yuting Wang , Yue Wang , Yuanchang Yue , Yu Wang , Yanling Wang , Yan Wang , Xijun Liu , Wenmeng Yu , Weihan Wang , Wei Li , Shuaiqi Duan , Sheng Yang , Ruiliang Lv , Mingdao Liu , Lihang Pan , Ke Ning , Junhui Ji , Jinjiang Wang , Jing Chen , Jiazheng Xu , Jiale Zhu , Jiale Cheng , Ji Qi , Guobing Gan , Guo Wang , Cong Yao , Zijun Dou , Zihao Zhou , Zihan Wang , Zhiqi Ge , Zhijie Li , Zhenyu Hou , Zhao Xue , Zehui Wang , Zehan Qi , Zehai He , Yutao Zhang , Yusen Liu , Yukuo Cen , Yuchen Li , Yuan Wang , Yu Yang , Yongbin Liu , Yijian Lu , Yifan Xu , Yanzi Wang , Yanxiao Zhao , Yanfeng Wang , Yadong Xue , Yabo Xu , Xinyu Zhang , Xinyu Liu , Xiao Liu , Wenyi Zhao , Wenkai Li , Tianyu Tong , Tianshu Zhang , Shudan Zhang , Shengdong Yan , Qinkai Zheng , Mingde Xu , Licheng Bao , lat Long long , Jiaxing Xu , Jiaxin Fan , Jiawen Qian , Jiali Chen , Jiahui Lin , Jiadai Sun , Haozhi Zheng , Haoran Wang , Haochen Li , Hanyu Lai , Han Xu , Fan Yang , Dan Zhang , Da Yin , Chuangxin Zhao , Chengcheng Wu , Boyan Shi , Bowen Lv , Bowei Jia , Bo Li , Bin Chen , Baoxu Wang , Peng Zhang , Debing Liu , Bin Xu , Juanzi Li , Minlie Huang , Yuxiao Dong , Jie Tang

D3-Gym: Constructing Real-World Verifiable Environments for Data-Driven Discovery

Despite recent progress in language models and agents for scientific data-driven discovery, further advancing their capabilities is held back by the absence of verifiable environments representing real-world scientific tasks. To fill this…

Artificial Intelligence · Computer Science 2026-05-04 Hanane Nour Moussa , Yifei Li , Zhuoyang Li , Yankai Yang , Cheng Tang , Tianshu Zhang , Nesreen K. Ahmed , Ali Payani , Ziru Chen , Huan Sun

SciNav: A General Agent Framework for Scientific Coding Tasks

Autonomous science agents built on large language models (LLMs) are increasingly used to generate hypotheses, design experiments, and produce reports. However, prior work mainly targets open-ended scientific problems with subjective outputs…

Computation and Language · Computer Science 2026-03-24 Tianshu Zhang , Huan Sun

EvoSchema: Towards Text-to-SQL Robustness Against Schema Evolution

Neural text-to-SQL models, which translate natural language questions (NLQs) into SQL queries given a database schema, have achieved remarkable performance. However, database schemas frequently evolve to meet new requirements. Such schema…

Databases · Computer Science 2026-03-12 Tianshu Zhang , Kun Qian , Siddhartha Sahai , Yuan Tian , Shaddy Garg , Huan Sun , Yunyao Li

GLM-4.5V and GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning

We present GLM-4.1V-Thinking, GLM-4.5V, and GLM-4.6V, a family of vision-language models (VLMs) designed to advance general-purpose multimodal understanding and reasoning. In this report, we share our key findings in the development of the…

Computer Vision and Pattern Recognition · Computer Science 2026-01-05 V Team , Wenyi Hong , Wenmeng Yu , Xiaotao Gu , Guo Wang , Guobing Gan , Haomiao Tang , Jiale Cheng , Ji Qi , Junhui Ji , Lihang Pan , Shuaiqi Duan , Weihan Wang , Yan Wang , Yean Cheng , Zehai He , Zhe Su , Zhen Yang , Ziyang Pan , Aohan Zeng , Baoxu Wang , Bin Chen , Boyan Shi , Changyu Pang , Chenhui Zhang , Da Yin , Fan Yang , Guoqing Chen , Haochen Li , Jiale Zhu , Jiali Chen , Jiaxing Xu , Jiazheng Xu , Jing Chen , Jinghao Lin , Jinhao Chen , Jinjiang Wang , Junjie Chen , Leqi Lei , Letian Gong , Leyi Pan , Mingdao Liu , Mingde Xu , Mingzhi Zhang , Qinkai Zheng , Ruiliang Lyu , Shangqin Tu , Sheng Yang , Shengbiao Meng , Shi Zhong , Shiyu Huang , Shuyuan Zhao , Siyan Xue , Tianshu Zhang , Tianwei Luo , Tianxiang Hao , Tianyu Tong , Wei Jia , Wenkai Li , Xiao Liu , Xiaohan Zhang , Xin Lyu , Xinyu Zhang , Xinyue Fan , Xuancheng Huang , Yadong Xue , Yanfeng Wang , Yanling Wang , Yanzi Wang , Yifan An , Yifan Du , Yiheng Huang , Yilin Niu , Yiming Shi , Yu Wang , Yuan Wang , Yuanchang Yue , Yuchen Li , Yusen Liu , Yutao Zhang , Yuting Wang , Yuxuan Zhang , Zhao Xue , Zhengxiao Du , Zhenyu Hou , Zihan Wang , Peng Zhang , Debing Liu , Bin Xu , Juanzi Li , Minlie Huang , Yuxiao Dong , Jie Tang

OmniDPO: A Preference Optimization Framework to Address Omni-Modal Hallucination

Recently, Omni-modal large language models (OLLMs) have sparked a new wave of research, achieving impressive results in tasks such as audio-video understanding and real-time environment perception. However, hallucination issues still…

Artificial Intelligence · Computer Science 2025-09-03 Junzhe Chen , Tianshu Zhang , Shiyu Huang , Yuwei Niu , Chao Sun , Rongzhou Zhang , Guanyu Zhou , Lijie Wen , Xuming Hu

Mind2Web 2: Evaluating Agentic Search with Agent-as-a-Judge

Agentic search such as Deep Research systems, where agents autonomously browse the web, synthesize information, and return comprehensive citation-backed answers, represents a major shift in how users interact with web-scale information. While…

Artificial Intelligence · Computer Science 2025-07-04 Boyu Gou , Zanming Huang , Yuting Ning , Yu Gu , Michael Lin , Weijian Qi , Andrei Kopanev , Botao Yu , Bernal Jiménez Gutiérrez , Yiheng Shu , Chan Hee Song , Jiaman Wu , Shijie Chen , Hanane Nour Moussa , Tianshu Zhang , Jian Xie , Yifei Li , Tianci Xue , Zeyi Liao , Kai Zhang , Boyuan Zheng , Zhaowei Cai , Viktor Rozgic , Morteza Ziyadi , Huan Sun , Yu Su

Ambient Backscatter Communication in LTE Uplink Sounding Reference Signal

Ambient Internet of Things (AIoT), recently standardized by the 3rd Generation Partnership Project (3GPP), demands a low-power wide-area communication solution that operates several orders of magnitude below the power requirements of…

Signal Processing · Electrical Eng. & Systems 2025-01-22 Jingyi Liao , Tianshu Zhang , Kalle Ruttik , Riku Jäntti , Dinh-Thuy Phan-Huy

ICT: Image-Object Cross-Level Trusted Intervention for Mitigating Object Hallucination in Large Vision-Language Models

Despite the recent breakthroughs achieved by Large Vision Language Models (LVLMs) in understanding and responding to complex visual-textual contexts, their inherent hallucination tendencies limit their practical application in real-world…

Computer Vision and Pattern Recognition · Computer Science 2024-11-26 Junzhe Chen , Tianshu Zhang , Shiyu Huang , Yuwei Niu , Linfeng Zhang , Lijie Wen , Xuming Hu

SeisGPT: A Physics-Informed Data-Driven Large Model for Real-Time Seismic Response Prediction

Accurately predicting the dynamic responses of building structures under seismic loads is essential for ensuring structural safety and minimizing potential damage. This critical aspect of structural analysis allows engineers to evaluate how…

Computational Engineering, Finance, and Science · Computer Science 2024-10-29 Shiqiao Meng , Ying Zhou , Qinghua Zheng , Bingxu Liao , Mushi Chang , Tianshu Zhang , Abderrahim Djerrad

S4DL: Shift-sensitive Spatial-Spectral Disentangling Learning for Hyperspectral Image Unsupervised Domain Adaptation

Unsupervised domain adaptation techniques, extensively studied in hyperspectral image (HSI) classification, aim to use labeled source domain data and unlabeled target domain data to learn domain invariant features for cross-scene…

Computer Vision and Pattern Recognition · Computer Science 2024-08-29 Jie Feng , Tianshu Zhang , Junpeng Zhang , Ronghua Shang , Weisheng Dong , Guangming Shi , Licheng Jiao

TableLlama: Towards Open Large Generalist Models for Tables

Semi-structured tables are ubiquitous. There has been a variety of tasks that aim to automatically interpret, augment, and query tables. Current methods often require pretraining on tables or special model architecture design, are…

Computation and Language · Computer Science 2024-04-08 Tianshu Zhang , Xiang Yue , Yifei Li , Huan Sun

Few-shot Adaptation of Multi-modal Foundation Models: A Survey

Multi-modal (vision-language) models, such as CLIP, are replacing traditional supervised pre-training models (e.g., ImageNet-based pre-training) as the new generation of visual foundation models. These models with robust and aligned…

Computer Vision and Pattern Recognition · Computer Science 2024-01-05 Fan Liu , Tianshu Zhang , Wenwen Dai , Wenwen Cai , Xiaocong Zhou , Delong Chen

Exploring Chain-of-Thought Style Prompting for Text-to-SQL

In-context learning with large language models (LLMs) has recently caught increasing attention due to its superior few-shot performance on various tasks. However, its performance on text-to-SQL parsing still has much room for improvement.…

Computation and Language · Computer Science 2023-10-30 Chang-You Tai , Ziru Chen , Tianshu Zhang , Xiang Deng , Huan Sun

Roll Up Your Sleeves: Working with a Collaborative and Engaging Task-Oriented Dialogue System

We introduce TacoBot, a user-centered task-oriented digital assistant designed to guide users through complex real-world tasks with multiple steps. Covering a wide range of cooking and how-to tasks, we aim to deliver a collaborative and…

Computation and Language · Computer Science 2023-08-01 Lingbo Mo , Shijie Chen , Ziru Chen , Xiang Deng , Ashley Lewis , Sunit Singh , Samuel Stevens , Chang-You Tai , Zhen Wang , Xiang Yue , Tianshu Zhang , Yu Su , Huan Sun

Federated Learning for Semantic Parsing: Task Formulation, Evaluation Setup, New Algorithms

This paper studies a new task of federated learning (FL) for semantic parsing, where multiple clients collaboratively train one global model without sharing their semantic parsing data. By leveraging data from multiple clients, the FL…

Computation and Language · Computer Science 2023-05-30 Tianshu Zhang , Changchang Liu , Wei-Han Lee , Yu Su , Huan Sun

Bootstrapping a User-Centered Task-Oriented Dialogue System

We present TacoBot, a task-oriented dialogue system built for the inaugural Alexa Prize TaskBot Challenge, which assists users in completing multi-step cooking and home improvement tasks. TacoBot is designed with a user-centered principle…

Computation and Language · Computer Science 2022-07-22 Shijie Chen , Ziru Chen , Xiang Deng , Ashley Lewis , Lingbo Mo , Samuel Stevens , Zhen Wang , Xiang Yue , Tianshu Zhang , Yu Su , Huan Sun

Dynamic Multi-Person Mesh Recovery From Uncalibrated Multi-View Cameras

Dynamic multi-person mesh recovery has been a hot topic in 3D vision recently. However, few works focus on the multi-person motion capture from uncalibrated cameras, which mainly faces two challenges: the one is that inter-person…

Computer Vision and Pattern Recognition · Computer Science 2022-06-23 Buzhen Huang , Yuan Shu , Tianshu Zhang , Yangang Wang