Computer Science

DynaFLIP: Rethinking Robotics Perception via Tri-Modal-Dynamics Guided Representation

Robot manipulation critically depends on perception that preserves the action-relevant aspects of a scene. Yet most robot learning pipelines are built upon visual encoders pre-trained for static recognition or vision-language alignment,…

Robotics · Computer Science 2026-05-29 Jusuk Lee, Seungjae Lee, Jonghun Shin, Hoseong Jung, Sungha Kim, Daesol Cho, H. Jin Kim, Jia-Bin Huang, Furong Huang

Efficient Test-Time Finetuning of LLMs via Convex Reconstruction and Gradient Caching

Test-time finetuning (TTFT) is a rapidly evolving paradigm that adapts a language model to each prompt by retrieving related sequences, updating the model on them, and then evaluating the prompt. However, TTFT is only practical if it is…

Machine Learning · Computer Science 2026-05-29 Alaa Khamis, Alaa Maalouf

Fairness-Aware Federated Learning with Trajectory Shapley Value

Federated learning is an emerging distributed paradigm that addresses the challenges posed by heterogeneous, privacy-sensitive data. It enables multiple clients to train a model collaboratively by aggregating their local updates at a…

Machine Learning · Computer Science 2026-05-29 Daniel Kuznetsov, Ziqi Wang

Majorization precursors to supermodularity and subadditivity on the majorization lattice

We establish two structural majorization relations, which we call precursors, underlying the properties of supermodularity and subadditivity on the lattice induced by majorization. These are precursors in that they immediately imply that…

Information Theory · Computer Science 2026-05-29 Alexander Stévins, Michael G. Jabbour, Serge Deside, Nicolas J. Cerf

When, why, and how do diffusion posterior samplers fail? A finite-sample lens

Diffusion models have excellent capacity to model complex distributions of natural data, which has made them a popular and effective choice for posterior sampling in imaging inverse problems. Existing methods can incorporate any measurement…

Machine Learning · Computer Science 2026-05-29 Benjamin A. Burns, Sara Fridovich-Keil

SoundnessBench: Can Your AI Scientist Really Tell Good Research Ideas from Bad Ones?

Autonomous AI research agents aim to accelerate scientific discovery by automating the research pipeline, from hypothesis generation to peer review. However, existing benchmarks rarely test a fundamental bottleneck: whether Large Language…

Machine Learning · Computer Science 2026-05-29 Sy-Tuyen Ho, Minghui Liu, Huy Nghiem, Furong Huang

Reasoning with Sampling: Cutting at Decision Points

Frontier reasoning models are produced by posttraining base language models with reinforcement learning. Recent work has challenged this by showing that sampling from a sharpened version of the base model's distribution, a so-called power…

Machine Learning · Computer Science 2026-05-29 Felix Zhou, Anay Mehrotra, Quanquan C. Liu

RoboWits: Unexpected Challenges for Robotic Creative Problem Solving

The ability to reason, adapt, and creatively solve problems under unexpected challenges is essential for robots operating in real-world environments. However, current robotic benchmarks primarily emphasize skill-level execution and provide…

Robotics · Computer Science 2026-05-29 Chunru Lin, Hongxin Zhang, Fenghao Yu, Zhehuan Chen, Thomas L. Griffiths, Yejin Choi, David Held, Chuang Gan

In-Context Reward Adaptation for Robust Preference Modeling

Reinforcement Learning from Human Feedback (RLHF) typically relies on static reward models to align Large Language Models with human preferences. However, human values are inherently diverse and heterogeneous, and a single reward model…

Machine Learning · Computer Science 2026-05-29 Zhenyu Sun, Zheng Xu, Ermin Wei

Gram: Assessing sabotage propensities via automated alignment auditing

We introduce Gram, an automated alignment auditing framework to assess the propensity of AI agents to engage in sabotage. We evaluate Gemini models across 17 simulated agentic deployment scenarios that incentivize sabotage. We find Gemini…

Machine Learning · Computer Science 2026-05-29 David Lindner, Victoria Krakovna, Sebastian Farquhar

A Heterogeneous Architecture for Robot RL Beyond GPU-Dominant Paradigms

Simulation-based RL for contemporary robot control is increasingly organized around GPU-resident simulation: physics, rollout collection, and learning are placed on a single GPU-centric execution path. This paradigm has greatly improved…

Robotics · Computer Science 2026-05-29 Yufei Jia, Zhanxiang Cao, Mingrui Yu, Heng Zhang, Shenyu Chen, Dixuan Jiang, Meng Li, Xiaofan Li, Yiyang Liu, Junzhe Wu, Zheng Li, XiLin Fang, Tingyu Cui, Shengcheng Fu, Haoyang Li, Anqi Wang, Zifan Wang, Dongjie Zhu, Chenyu Cao, Zhenbiao Huang, Ziang Zheng, Jie Lu, Xin Ma, Zhengyang Wei, Xiang Zhao, Tianyue Zhan, Ye He, Yuxiang Chen, Yizhou Jiang, Yue Li, Haizhou Ge, Yuhang Dong, Fan Jia, Ziheng Zhang, Meng Zhang, Xiwa Deng, Zhixing Chen, Hanyang Shao, Chenxin Dong, Yixuan Li, Yizhi Chen, Bokui Chen, Kaifeng Zhang, Hanqing Cui, Yusen Qin, Ruqi Huang, Lei Han, Tiancai Wang, Xiang Li, Yue Gao, Guyue Zhou

Self-Trained Verification for Training- and Test-Time Self-Improvement

Self-improvement at scale has been a longstanding goal for reasoning models, and there are two natural places to do it: at test time, through verification-refinement (V-R) loops; and at training time, through self-training methods. Both are…

Machine Learning · Computer Science 2026-05-29 Chen Henry Wu, Aditi Raghunathan

Statistical Embeddings for Similarity, Retrieval, and Interpretable Alignment of Numeric Tabular Datasets

Numeric tabular datasets are the dominant data format in scientific practice, yet large language models lack native mechanisms for representing numeric datasets in a meaningful way across heterogeneous feature spaces. Existing approaches…

Machine Learning · Computer Science 2026-05-29 M. Ross Kunz, John Merickel, Keith Wilson

Gaze2Act: Gaze-Conditioned Vision-Language-Action Policies for Interactive Robot Manipulation

Vision-Language-Action (VLA) models have recently shown strong potential for robot learning by following language instructions. However, in practice, language alone is often insufficient to precisely convey human intent. It is difficult to…

Robotics · Computer Science 2026-05-29 Kuangji Zuo, Gen Li, Bofan Lyu, Yanshuo Lu, Boyu Ma, Shijia Han, Xinyu Zhou, Xichen Yuan, Chuhao Zhou, Jiaqi Bai, Geng Li, Jianfei Yang

Qwen-VLA: Unifying Vision-Language-Action Modeling across Tasks, Environments, and Robot Embodiments

Embodied intelligence is often studied through specialized models for individual tasks such as manipulation or navigation, resulting in fragmented capabilities and limited generalization across tasks, environments, and robot embodiments. In…

Robotics · Computer Science 2026-05-29 Qiuyue Wang, Mingsheng Li, Jian Guan, Jinhui Ye, Sicheng Xie, Yitao Liu, Junhao Chen, Zhixuan Liang, Jie Zhang, Xintong Hu, Xuhong Huang, Pei Lin, Junyang Lin, Dayiheng Liu, Shuai Bai, Jingren Zhou, Jiazhao Zhang, Haoqi Yuan, Gengze Zhou, Hang Yin, Ye Wang, Yiyang Huang, Zixing Lei, Wujian Peng, Delin Chen, Yingming Zheng, Jingyang Fan, Xianwei Zhuang, Xin Zhou, Haoyang Li, Anzhe Chen, Tong Zhang, Xuejing Liu, Yuchong Sun, Ruizhe Chen, Zhaohai Li, Chenxu Lü, Zhibo Yang, Tao Yu, Xionghui Chen

Neural Operator-Based Surrogate Model for CFD: Helical Coil Steam Generator in Small Modular Reactor

Real-time thermal-hydraulic simulation is essential for digital twin (DT) technology that supports the safe and efficient operation of small modular reactors (SMRs). Computational fluid dynamics (CFD) provides high-fidelity flow analysis,…

Machine Learning · Computer Science 2026-05-29 Minseo Lee, Seongmin Oh, Chaehyeon Song, Bumjin Cho, Shilaj Baral, Sangam Khanal, Minseop Song, Joongoo Jeon

Digitally enriching a screening population for pancreatic cancer using routine blood-based measures and clinical histories

Earlier detection of pancreatic cancer is key to enabling wider access to curative treatment and reducing cancer deaths; however, screening is presently not viable. Latent indicators of pathology are evident in an individual's disease and…

Machine Learning · Computer Science 2026-05-29 Chris Varghese, Leo Y. Li-Han, Richa Bisht, Ellen Larson, Frank Lee, Ryan M. Carr, Tanios S. Bekaii-Saab, Shounak Majumder, John D. Halamka, Mark Truty, Ajit H. Goenka, Hojjat Salehinejad, Cornelius A. Thiels

OOD-GraphLLM: Graph Large Language Model for Out-of-Distribution Generalized Drug Synergy Prediction

Drug synergy prediction (DSP) aims to identify efficacious drug combinations under various cellular contexts with different targets. However, the continual emergence of novel compounds results in variations in molecular scaffolds and sizes,…

Machine Learning · Computer Science 2026-05-29 Xin Wang, Linxin Xiao, Yang Yao, Wenwu Zhu

How's it going? Reinforcement learning in language models recruits a functional welfare axis

How does reinforcement learning shape a language model's internal representations? We present evidence that RL recruits a pre-existing representation of functional welfare: an estimate of how well or badly the system is doing, relative to…

Machine Learning · Computer Science 2026-05-29 Andy Q Han, David J. Chalmers, Pavel Izmailov

Anti Mode-Collapse in Mean-Field Transformer via Auxiliary Variables

We use a mean-field-based transformer model to theoretically investigate how auxiliary variables, such as positional encoding, prevent mode collapse of self-attention mechanisms. The use of mean-field transformers to analyze the properties…

Machine Learning · Computer Science 2026-05-29 Masaaki Imaizumi, Masanori Koyama, Noboru Isobe, Kohei Hayashi