Yu Fu — Scifaro

StepAudio 2.5 Technical Report

Unified audio-language modeling has emerged as a prominent trend in modern speech systems, promising to bring the reasoning capabilities of large language models to auditory tasks. However, existing unified foundations often struggle to…

Audio and Speech Processing · Electrical Eng. & Systems · 2026-05-25 · Bin Lin, Bo Zhao, Boyong Wu, Chao Yan, Chen Wu, Cheng Yi, Chengyuan Yao, Daijiao Liu, Fei Tian, Feng Tian, Haiyang Sun, Haoyang Zhang, Jiangjie Zhen, Jinglan Gong, Jun Chen, Li Xie, Peilin Li, Peng Yang, Pengfei Tan, Qingjian Lin, Runze Li, Shenghua Hu, Siyi Zhou, Wenwen Qu, Xiangyu Li, Xiangyu Tony Zhang, Xuerui Yang, Yang Yang, Yechang Huang, Yu Fu, Yuchu Luo, Yuxin Li, Yuxin Zhang, Zhengyan Sheng, Brian Li, Chang Zeng, Changlin Zhang, Chen Geng, Chenghao Dong, Chengli Feng, Dan Zhou, Danni Wan, Di Chen, Die Zhang, Dongqing Pang, Guanglong Yang, Guoqiang Hu, Huangxi Zhu, Jianzheng Gao, Jinghua Liang, Jinmei Wan, Junjie Yuan, Kang An, Lei Lei, Limin Zhong, Lun Cai, Mengqiang Ren, Min Xu, Mingliang Li, Mingxiao Li, Na Wang, Qiang Tong, Qiaoling Huang, Qingfu Du, Rui Wang, Shengchen Zhou, Shi Qiu, Shihao Peng, Shiliang Yang, Siqi Tu, Tianjiao Deng, Ting Xu, Tong Wang, WeiMing Niu, Wuxun Xie, Xianwei Zhang, Xianyu Feng, Xiaojia Liu, Xing Chen, Xiongbin Wu, Yan Wu, Yang Li, Yi Liu, Yifan Zhang, Yile Liu, Yongshen Long, Yu Luo, Yuanhao Ding, Yuhao Wang, Yuhe Yin, Yunfang Xu, Yuxiang Yang, Zhiguo Huang, Zhiyue Wu, Zichao Li, Zichao Zhou, Daxin Jiang, Future Li, Gang Yu, Xiangyu Zhang, Yibo Zhu

Reducing the Safety Tax in LLM Safety Alignment with On-Policy Self-Distillation

Safety alignment often improves robustness to harmful queries at the cost of reasoning ability, a tradeoff known as the safety tax. A common cause is distributional mismatch: supervised fine-tuning trains the target model on safety…

Machine Learning · Computer Science · 2026-05-19 · Yu Fu, Longxuan Yu, Haz Sameen Shahgir, Zhipeng Wei, Hui Liu, N. Benjamin Erichson, Yue Dong

GRLO: Towards Generalizable Reinforcement Learning in Open-Ended Environments from Zero

Post-training has become a crucial step for unlocking the capabilities of large language models, with reinforcement learning (RL) emerging as a critical paradigm. Recent RL-based post-training has increasingly split into two paradigms:…

Machine Learning · Computer Science · 2026-05-18 · Shangjian Yin, Yu Fu, Yue Dong, Zhouxing Shi

Do Reasoning LLMs Refuse What They Infer in Long Contexts?

Long-context LLMs can infer objectives that are not stated explicitly. This capability is useful for reasoning over documents, code, retrieved evidence, and tool traces, but it also creates a safety risk: harmful intent can be distributed…

Computation and Language · Computer Science · 2026-05-15 · Yu Fu, Haz Sameen Shahgir, Huanli Gong, Zhipeng Wei, N. Benjamin Erichson, Yue Dong

AllocMV: Optimal Resource Allocation for Music Video Generation via Structured Persistent State

Generating long-horizon music videos (MVs) is frequently constrained by prohibitive computational costs and difficulty maintaining cross-shot consistency. We propose AllocMV, a hierarchical framework formulating music video synthesis as a…

Computer Vision and Pattern Recognition · Computer Science · 2026-05-12 · Huimin Wang, Leilei Ouyang, Chang Xia, Yongqi Kang, Yu Fu, Yuqi Ouyang

SemiSAM-O1: How far can we push the boundary of annotation-efficient medical image segmentation?

Semi-supervised learning (SSL) has become a promising solution to alleviate the annotation burden of deep learning-based medical image segmentation models. While recent advances in foundation model-driven SSL have pushed the boundary to…

Computer Vision and Pattern Recognition · Computer Science · 2026-04-28 · Yichi Zhang, Le Xue, Bichun Xu, Judong Luo, Zhigang Wu, Yu Fu, Zixin Hu, Yuan Cheng, Yuan Qi

From Skills to Talent: Organising Heterogeneous Agents as a Real-World Company

Individual agent capabilities have advanced rapidly through modular skills and tool integrations, yet multi-agent systems remain constrained by fixed team structures, tightly coupled coordination logic, and session-bound learning. We argue…

Artificial Intelligence · Computer Science · 2026-04-27 · Zhengxu Yu, Yu Fu, Zhiyuan He, Yuxuan Huang, Lee Ka Yiu, Meng Fang, Weilin Luo, Jun Wang

The exceptional set for Diophantine inequality with mixed powers of primes

Assume that $\lambda_1, \lambda_2, \lambda_3, \lambda_4, \lambda_5, \lambda_6, \lambda_7$ are non-zero real numbers and that $\lambda_1/\lambda_2$ is irrational. Let $\mathcal{V}$ be a well-spaced sequence, and $\delta > 0$. For any given…

Number Theory · Mathematics · 2026-04-27 · Yu Fu, Linzhu Fu, Liqun Hu

VLMs Need Words: Vision Language Models Ignore Visual Detail In Favor of Semantic Anchors

Vision-language models (VLMs) have achieved impressive performance across a wide range of multimodal tasks. However, they often fail on tasks that require fine-grained visual perception, even when the required information is still present…

Computer Vision and Pattern Recognition · Computer Science · 2026-04-16 · Haz Sameen Shahgir, Xiaofu Chen, Yu Fu, Erfan Shayegani, Nael Abu-Ghazaleh, Yova Kementchedjhieva, Yue Dong

In-vivo entropy production of A. subaru

Entropy production is often used as a proxy for energy consumption of a non-equilibrium system. Lower bounds can be estimated from coarse-grained observations, and this has been done for various biological systems. Here, we apply these…

Biological Physics · Physics · 2026-04-02 · Yu Fu, Emmy Dobson, Benjamin B. Machta, Michael C. Abbott

Is Mathematical Problem-Solving Expertise in Large Language Models Associated with Assessment Performance?

Large Language Models (LLMs) are increasingly used in math education not only as problem solvers but also as assessors of learners' reasoning. However, it remains unclear whether stronger math problem-solving ability is associated with…

Artificial Intelligence · Computer Science · 2026-03-27 · Liang Zhang, Yu Fu, Xinyi Jin

PET-F2I: A Comprehensive Benchmark and Parameter-Efficient Fine-Tuning of LLMs for PET/CT Report Impression Generation

PET/CT imaging is pivotal in oncology and nuclear medicine, yet summarizing complex findings into precise diagnostic impressions is labor-intensive. While LLMs have shown promise in medical text generation, their capability in the highly…

Computer Vision and Pattern Recognition · Computer Science · 2026-03-12 · Yuchen Liu, Wenbo Zhang, Liling Peng, Yichi Zhang, Yu Fu, Xin Guo, Chao Qu, Yuan Qi, Le Xue

From Flow to One Step: Real-Time Multi-Modal Trajectory Policies via Implicit Maximum Likelihood Estimation-based Distribution Distillation

Generative policies based on diffusion and flow matching achieve strong performance in robotic manipulation by modeling multi-modal human demonstrations. However, their reliance on iterative Ordinary Differential Equation (ODE) integration…

Robotics · Computer Science · 2026-03-11 · Ju Dong, Liding Zhang, Lei Zhang, Yu Fu, Kaixin Bai, Zoltan-Csaba Marton, Zhenshan Bing, Zhaopeng Chen, Alois Christian Knoll, Jianwei Zhang

M4Diffuser: Multi-View Diffusion Policy with Manipulability-Aware Control for Robust Mobile Manipulation

Mobile manipulation requires the coordinated control of a mobile base and a robotic arm while simultaneously perceiving both global scene context and fine-grained object details. Existing single-view approaches often fail in unstructured…

Robotics · Computer Science · 2026-03-10 · Ju Dong, Lei Zhang, Liding Zhang, Yao Ling, Yu Fu, Kaixin Bai, Zoltán-Csaba Márton, Zhenshan Bing, Zhaopeng Chen, Alois Christian Knoll, Jianwei Zhang

MOSAIC: A Unified Platform for Cross-Paradigm Comparison and Evaluation of Homogeneous and Heterogeneous Multi-Agent RL, LLM, VLM, and Human Decision-Makers

Reinforcement learning (RL), large language models (LLMs), and vision-language models (VLMs) have been widely studied in isolation. However, existing infrastructure lacks the ability to deploy agents from different decision-making paradigms…

Machine Learning · Computer Science · 2026-03-03 · Abdulhamid M. Mousa, Yu Fu, Rakhmonberdi Khajiev, Jalaledin M. Azzabi, Abdulkarim M. Mousa, Peng Yang, Yunusa Haruna, Ming Liu

Overton Pluralistic Reinforcement Learning for Large Language Models

Existing alignment paradigms remain limited in capturing the pluralistic nature of human values. Overton Pluralism addresses this gap by generating responses with diverse perspectives from a single query. This paper introduces OP-GRPO…

Computation and Language · Computer Science · 2026-02-25 · Yu Fu, Seongho Son, Ilija Bogunovic

CT Synthesis with Conditional Diffusion Models for Abdominal Lymph Node Segmentation

Despite the significant success achieved by deep learning methods in medical image segmentation, researchers still struggle in the computer-aided diagnosis of abdominal lymph nodes due to the complex abdominal environment, small and…

Image and Video Processing · Electrical Eng. & Systems · 2026-02-13 · Yongrui Yu, Hanyu Chen, Zitian Zhang, Qiong Xiao, Wenhui Lei, Linrui Dai, Yu Fu, Hui Tan, Guan Wang, Peng Gao, Xiaofan Zhang

SurfAge-Net: A Hierarchical Surface-Based Network for Interpretable Fine-Grained Brain Age Prediction

Brain age prediction serves as a powerful framework for assessing brain status and detecting deviations associated with neurodevelopmental and neurodegenerative disorders. However, most existing approaches emphasize whole-brain age…

Neurons and Cognition · Quantitative Biology · 2026-02-10 · Rongzhao He, Dalin Zhu, Ying Wang, Songhong Yue, Leilei Zhao, Yu Fu, Dan Wu, Bin Hu, Weihao Zheng

Harnessing the Unseen: The Hidden Influence of Intrinsic Knowledge in Long-Context Language Models

Recent advances in long-context language models (LCLMs), designed to handle extremely long contexts, primarily focus on utilizing external contextual information, often leaving the influence of language models' parametric knowledge…

Computation and Language · Computer Science · 2026-02-09 · Yu Fu, Haz Sameen Shahgir, Hui Liu, Xianfeng Tang, Qi He, Yue Dong

LingLanMiDian: Systematic Evaluation of LLMs on TCM Knowledge and Clinical Reasoning

Large language models (LLMs) are advancing rapidly in medical NLP, yet Traditional Chinese Medicine (TCM), with its distinctive ontology, terminology, and reasoning patterns, requires domain-faithful evaluation. Existing TCM benchmarks are…

Artificial Intelligence · Computer Science · 2026-02-03 · Rui Hua, Yu Wei, Zixin Shu, Kai Chang, Dengying Yan, Jianan Xia, Zeyu Liu, Hui Zhu, Shujie Song, Mingzhong Xiao, Xiaodong Li, Dongmei Jia, Zhuye Gao, Yanyan Meng, Naixuan Zhao, Yu Fu, Haibin Yu, Benman Yu, Yuanyuan Chen, Fei Dong, Zhizhou Meng, Pengcheng Yang, Songxue Zhao, Lijuan Pei, Yunhui Hu, Kan Ding, Jiayuan Duan, Wenmao Yin, Yang Gu, Runshun Zhang, Qiang Zhu, Jian Yu, Jiansheng Li, Baoyan Liu, Wenjia Wang, Xuezhong Zhou