Min Lin — Scifaro

Rethinking the Trust Region in LLM Reinforcement Learning

Reinforcement learning (RL) has become a cornerstone for fine-tuning Large Language Models (LLMs), with Proximal Policy Optimization (PPO) serving as the de facto standard algorithm. Despite its ubiquity, we argue that the core ratio…

Machine Learning · Computer Science · 2026-05-27 · Penghui Qi, Xiangxin Zhou, Zichen Liu, Tianyu Pang, Chao Du, Min Lin, Wee Sun Lee

LightTransfer: Your Long-Context LLM is Secretly a Hybrid Model with Effortless Adaptation

Scaling language models to handle longer contexts introduces substantial memory challenges due to the growing cost of key-value (KV) caches. Motivated by the efficiency gains of hybrid models and the broad availability of pretrained large…

Computation and Language · Computer Science · 2026-05-19 · Xuan Zhang, Fengzhuo Zhang, Cunxiao Du, Chao Du, Tianyu Pang, Wei Gao, Min Lin

PCSTracker: Long-Term Scene Flow Estimation for Point Cloud Sequences

Point cloud scene flow estimation is fundamental to long-term and fine-grained 3D motion analysis. However, existing methods are typically limited to pairwise settings and struggle to maintain temporal consistency over long sequences as…

Computer Vision and Pattern Recognition · Computer Science · 2026-03-23 · Min Lin, Gangwei Xu, Xianqi Wang, Yuyi Peng, Xin Yang

EchoVLA: Synergistic Declarative Memory for VLA-Driven Mobile Manipulation

Recent progress in Vision-Language-Action (VLA) models has enabled embodied agents to interpret multimodal instructions and perform complex tasks. However, existing VLAs are mostly confined to short-horizon, table-top manipulation, lacking…

Robotics · Computer Science · 2026-03-09 · Min Lin, Xiwen Liang, Bingqian Lin, Liu Jingzhi, Zijian Jiao, Kehan Li, Yu Sun, Weijia Liufu, Yuhan Ma, Yuecheng Liu, Shen Zhao, Yuzheng Zhuang, Xiaodan Liang

PromptStereo: Zero-Shot Stereo Matching via Structure and Motion Prompts

Modern stereo matching methods have leveraged monocular depth foundation models to achieve superior zero-shot generalization performance. However, most existing methods primarily focus on extracting robust features for cost volume…

Computer Vision and Pattern Recognition · Computer Science · 2026-03-04 · Xianqi Wang, Hao Yang, Hangtian Wang, Junda Cheng, Gangwei Xu, Min Lin, Xin Yang

GEM: A Gym for Agentic LLMs

The training paradigm for large language models (LLMs) is moving from static datasets to experience-based learning, where agents acquire skills by interacting with complex environments. To facilitate this transition, we introduce GEM…

Machine Learning · Computer Science · 2026-03-03 · Zichen Liu, Anya Sims, Keyu Duan, Changyu Chen, Simon Yu, Xiangxin Zhou, Haotian Xu, Shaopan Xiong, Bo Liu, Chenmien Tan, Chuen Yang Beh, Weixun Wang, Hao Zhu, Weiyan Shi, Diyi Yang, Michael Shieh, Yee Whye Teh, Wee Sun Lee, Min Lin

SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning

Recent advances in reinforcement learning have shown that language models can develop sophisticated reasoning through training on tasks with verifiable rewards, but these approaches depend on human-curated problem-answer pairs and…

Artificial Intelligence · Computer Science · 2026-03-03 · Bo Liu, Leon Guertler, Simon Yu, Zichen Liu, Penghui Qi, Daniel Balcells, Mickel Liu, Cheston Tan, Weiyan Shi, Min Lin, Wee Sun Lee, Natasha Jaques

Revisiting Parameter Server in LLM Post-Training

Modern data parallel (DP) training favors collective communication over parameter servers (PS) for its simplicity and efficiency under balanced workloads. However, the balanced workload assumption no longer holds in large language model…

Distributed, Parallel, and Cluster Computing · Computer Science · 2026-01-28 · Xinyi Wan, Penghui Qi, Guangxing Huang, Chaoyi Ruan, Min Lin, Jialin Li

A0: An Affordance-Aware Hierarchical Model for General Robotic Manipulation

Robotic manipulation faces critical challenges in understanding spatial affordances--the "where" and "how" of object interactions--essential for complex manipulation tasks like wiping a board or stacking objects. Existing methods, including…

Robotics · Computer Science · 2026-01-21 · Rongtao Xu, Jian Zhang, Minghao Guo, Youpeng Wen, Haoting Yang, Min Lin, Jianzheng Huang, Zhe Li, Kaidong Zhang, Liqiong Wang, Yuxuan Kuang, Meng Cao, Feng Zheng, Xiaodan Liang

AI4X Roadmap: Artificial Intelligence for the advancement of scientific pursuit and its future directions

Artificial intelligence and machine learning are reshaping how we approach scientific discovery, not by replacing established methods but by extending what researchers can probe, predict, and design. In this roadmap we provide a…

Physics and Society · Physics · 2025-11-27 · Stephen G. Dale, Nikita Kazeev, Alastair J. A. Price, Victor Posligua, Stephan Roche, O. Anatole von Lilienfeld, Konstantin S. Novoselov, Xavier Bresson, Gianmarco Mengaldo, Xudong Chen, Terence J. O'Kane, Emily R. Lines, Matthew J. Allen, Amandine E. Debus, Clayton Miller, Jiayu Zhou, Hiroko H. Dodge, David Rousseau, Andrey Ustyuzhanin, Ziyun Yan, Mario Lanza, Fabio Sciarrino, Ryo Yoshida, Zhidong Leong, Teck Leong Tan, Qianxiao Li, Adil Kabylda, Igor Poltavsky, Alexandre Tkatchenko, Sherif Abdulkader Tawfik, Prathami Divakar Kamath, Theo Jaffrelot Inizan, Kristin A. Persson, Bryant Y. Li, Vir Karan, Chenru Duan, Haojun Jia, Qiyuan Zhao, Hiroyuki Hayashi, Atsuto Seko, Isao Tanaka, Omar M. Yaghi, Tim Gould, Bun Chan, Stefan Vuckovic, Tianbo Li, Min Lin, Zehcen Tang, Yang Li, Yong Xu, Amrita Joshi, Xiaonan Wang, Leonard W. T. Ng, Sergei V. Kalinin, Mahshid Ahmadi, Jiyizhe Zhang, Shuyuan Zhang, Alexei Lapkin, Ming Xiao, Zhe Wu, Kedar Hippalgaonkar, Limsoon Wong, Lorenzo Bastonero, Nicola Marzari, Dorye Luis Esteras Cordoba, Andrei Tomut, Alba Quinones Andrade, Jose-Hugo Garcia

PhyBlock: A Progressive Benchmark for Physical Understanding and Planning via 3D Block Assembly

While vision-language models (VLMs) have demonstrated promising capabilities in reasoning and planning for embodied agents, their ability to comprehend physical phenomena, particularly within structured 3D environments, remains severely…

Robotics · Computer Science · 2025-11-24 · Liang Ma, Jiajun Wen, Min Lin, Rongtao Xu, Xiwen Liang, Bingqian Lin, Jun Ma, Yongxin Wang, Ziming Wei, Haokun Lin, Mingfei Han, Meng Cao, Bokui Chen, Ivan Laptev, Xiaodan Liang

Optimizing Anytime Reasoning via Budget Relative Policy Optimization

Scaling test-time compute is crucial for enhancing the reasoning capabilities of large language models (LLMs). Existing approaches typically employ reinforcement learning (RL) to maximize a verifiable reward obtained at the end of reasoning…

Machine Learning · Computer Science · 2025-11-10 · Penghui Qi, Zichen Liu, Tianyu Pang, Chao Du, Wee Sun Lee, Min Lin

Defeating the Training-Inference Mismatch via FP16

Reinforcement learning (RL) fine-tuning of large language models (LLMs) often suffers from instability due to the numerical mismatch between the training and inference policies. While prior work has attempted to mitigate this issue through…

Machine Learning · Computer Science · 2025-10-31 · Penghui Qi, Zichen Liu, Xiangxin Zhou, Tianyu Pang, Chao Du, Wee Sun Lee, Min Lin

Measuring vacancy-type defect density in monolayer semiconductors

Two-dimensional (2D) materials have attracted widespread interest due to their unique and tunable properties. Their optoelectronic, mechanical, and thermal properties are greatly influenced by crystal defects, which are, in turn, used to…

Applied Physics · Physics · 2025-10-22 · Aleksandar Radic, Nick von Jeinsen, Vivian Perez, Ke Wang, Min Lin, Boyao Liu, Yiru Zhu, Ismail Sami, Kenji Watanabe, Takashi Taniguchi, David Ward, Andrew Jardine, Akshay Rao, Manish Chhowalla, Sam Lambrick

Nonparametric Data Attribution for Diffusion Models

Data attribution for generative models seeks to quantify the influence of individual training examples on model outputs. Existing methods for diffusion models typically require access to model gradients or retraining, limiting their…

Machine Learning · Computer Science · 2025-10-17 · Yutian Zhao, Chao Du, Xiaosen Zheng, Tianyu Pang, Min Lin

Variational Reasoning for Language Models

We introduce a variational reasoning framework for language models that treats thinking traces as latent variables and optimizes them through variational inference. Starting from the evidence lower bound (ELBO), we extend it to a…

Computation and Language · Computer Science · 2025-10-16 · Xiangxin Zhou, Zichen Liu, Haonan Wang, Chao Du, Min Lin, Chongxuan Li, Liang Wang, Tianyu Pang

Understanding R1-Zero-Like Training: A Critical Perspective

DeepSeek-R1-Zero has shown that reinforcement learning (RL) at scale can directly enhance the reasoning capabilities of LLMs without supervised fine-tuning. In this work, we critically examine R1-Zero-like training by analyzing its two core…

Machine Learning · Computer Science · 2025-10-07 · Zichen Liu, Changyu Chen, Wenjun Li, Penghui Qi, Tianyu Pang, Chao Du, Wee Sun Lee, Min Lin

DEPTHOR++: Robust Depth Enhancement from a Real-World Lightweight dToF and RGB Guidance

Depth enhancement, which converts raw dToF signals into dense depth maps using RGB guidance, is crucial for improving depth perception in high-precision tasks such as 3D reconstruction and SLAM. However, existing methods often assume ideal…

Computer Vision and Pattern Recognition · Computer Science · 2025-10-01 · Jijun Xiang, Longliang Liu, Xuan Zhu, Xianqi Wang, Min Lin, Xin Yang

Language Models Can Learn from Verbal Feedback Without Scalar Rewards

LLMs are often trained with RL from human or AI feedback, yet such methods typically compress nuanced feedback into scalar rewards, discarding much of their richness and inducing scale imbalance. We propose treating verbal feedback as a…

Computation and Language · Computer Science · 2025-09-29 · Renjie Luo, Zichen Liu, Xiangyan Liu, Chao Du, Min Lin, Wenhu Chen, Wei Lu, Tianyu Pang

Structured Preference Optimization for Vision-Language Long-Horizon Task Planning

Existing methods for vision-language task planning excel in short-horizon tasks but often fall short in complex, long-horizon planning within dynamic environments. These challenges primarily arise from the difficulty of effectively training…

Computer Vision and Pattern Recognition · Computer Science · 2025-09-18 · Xiwen Liang, Min Lin, Weiqi Ruan, Rongtao Xu, Yuecheng Liu, Jiaqi Chen, Bingqian Lin, Yuzheng Zhuang, Xiaodan Liang