Computer Science

Physics Is All You Need? A Case Study in Physicist-Supervised AI Development of Scientific Software

Are AI agents tools, co-authors, or researchers? We present a quantified case study ($N=1$): a physicist supervising an AI coding agent (Claude Code, Sonnet and Opus models) over 12 work days and 57 sessions to build CLAX-PT, a…

Artificial Intelligence · Computer Science 2026-05-29 Nhat-Minh Nguyen

DynaFLIP: Rethinking Robotics Perception via Tri-Modal-Dynamics Guided Representation

Robot manipulation critically depends on perception that preserves the action-relevant aspects of a scene. Yet most robot learning pipelines are built upon visual encoders pre-trained for static recognition or vision-language alignment,…

Robotics · Computer Science 2026-05-29 Jusuk Lee, Seungjae Lee, Jonghun Shin, Hoseong Jung, Sungha Kim, Daesol Cho, H. Jin Kim, Jia-Bin Huang, Furong Huang

SchGen: PCB Schematic Generation with Semantic-Grounded Code Representations

Printed circuit board (PCB) schematic design defines nearly all electronic hardware, but it remains manual and expertise-intensive. While generative AI has advanced digital and analog IC design, PCB schematic generation from…

Artificial Intelligence · Computer Science 2026-05-29 Qinpei Luo, Ruichun Ma, Xinyu Zhang, Lili Qiu

Tiny but Trusted: Efficient Vision-Language Reasoning for Time-Series Anomaly Detection

Recent advances in Vision-Language Models (VLMs) have achieved impressive performance across many tasks, yet prior studies report unsatisfactory performance when applying large language or multimodal models to finding abnormal patterns in…

Artificial Intelligence · Computer Science 2026-05-29 Xiaona Zhou, Muntasir Wahed, Tianjiao Yu, Constantin Brif, Ismini Lourentzou

Locally Coherent, Globally Incoherent: Bounding Compositional Incoherence in Multi-Component LLM Agents

Multi-component LLM agents assemble probabilistic claims from components that each see only part of a joint problem; the composition can violate basic probability axioms even when every component is locally coherent. We formalise this…

Artificial Intelligence · Computer Science 2026-05-29 Anany Kotawala

Demystifying Data Organization for Enhanced LLM Training

Large Language Models (LLMs) have revolutionized various fields, yet their training efficiency is heavily reliant on effective data curation. While data selection has been widely studied, the strategic data organization for enhanced…

Artificial Intelligence · Computer Science 2026-05-29 Yalun Dai, Yangyu Huang, Tongshen Yang, Yonghan Wang, Xin Zhang, Wenshan Wu, Qihao Zhao, Hao Li, Yuanyuan Gao, Kim-Hui Yap, Scarlett Li

RoboWits: Unexpected Challenges for Robotic Creative Problem Solving

The ability to reason, adapt, and creatively solve problems under unexpected challenges is essential for robots operating in real-world environments. However, current robotic benchmarks primarily emphasize skill-level execution and provide…

Robotics · Computer Science 2026-05-29 Chunru Lin, Hongxin Zhang, Fenghao Yu, Zhehuan Chen, Thomas L. Griffiths, Yejin Choi, David Held, Chuang Gan

A Heterogeneous Architecture for Robot RL Beyond GPU-Dominant Paradigms

Simulation-based RL for contemporary robot control is increasingly organized around GPU-resident simulation: physics, rollout collection, and learning are placed on a single GPU-centric execution path. This paradigm has greatly improved…

Robotics · Computer Science 2026-05-29 Yufei Jia, Zhanxiang Cao, Mingrui Yu, Heng Zhang, Shenyu Chen, Dixuan Jiang, Meng Li, Xiaofan Li, Yiyang Liu, Junzhe Wu, Zheng Li, XiLin Fang, Tingyu Cui, Shengcheng Fu, Haoyang Li, Anqi Wang, Zifan Wang, Dongjie Zhu, Chenyu Cao, Zhenbiao Huang, Ziang Zheng, Jie Lu, Xin Ma, Zhengyang Wei, Xiang Zhao, Tianyue Zhan, Ye He, Yuxiang Chen, Yizhou Jiang, Yue Li, Haizhou Ge, Yuhang Dong, Fan Jia, Ziheng Zhang, Meng Zhang, Xiwa Deng, Zhixing Chen, Hanyang Shao, Chenxin Dong, Yixuan Li, Yizhi Chen, Bokui Chen, Kaifeng Zhang, Hanqing Cui, Yusen Qin, Ruqi Huang, Lei Han, Tiancai Wang, Xiang Li, Yue Gao, Guyue Zhou

MIRA: Mid-training Rubric Anchoring for Source-Aware Data Selection

Mid-training has become an important stage in modern LLM development, using large-scale curated mixtures to strengthen capabilities before final post-training. Its data selection problem is distinct: the data are optimized under a…

Artificial Intelligence · Computer Science 2026-05-29 Haowen Wang, Yaxin Du, Jian Yang, Jiajun Wu, Shukai Liu, Yuxuan Zhang, Pingjie Wang, Siheng Chen, Tuney Zheng, Ming Zhou, Xianglong Liu

ProjectionBench: Evaluating Scientific Hypothesis Generation in LLMs Under Progressive Information Disclosure

Scientific discovery is an inherently creative and uncertain process, requiring reasoning beyond the recall of known knowledge. While many benchmarks have been proposed to evaluate large language model (LLM) performance on deep research…

Artificial Intelligence · Computer Science 2026-05-29 A. J. Lew, Y. Cao, M. J. Buehler

mcp-proto-okn: Natural-language access to open scientific knowledge graphs through the Model Context Protocol

MCP Server Proto-OKN (mcp-proto-okn) is a Python-based Model Context Protocol server that enables AI assistants to discover, inspect, query and integrate scientific knowledge graphs through natural language. The server provides graph…

Artificial Intelligence · Computer Science 2026-05-29 Peter W. Rose, Benjamin M. Good, Amanda M. Saravia-Butler, Charlotte A. Nelson, James P. Balhoff, Yaphet Kebede, Patricia L. Whetzel, Christopher Bizon, Andrew I. Su, Sergio E. Baranzini

Gaze2Act: Gaze-Conditioned Vision-Language-Action Policies for Interactive Robot Manipulation

Vision-Language-Action (VLA) models have recently shown strong potential for robot learning by following language instructions. However, in practice, language alone is often insufficient to precisely convey human intent. It is difficult to…

Robotics · Computer Science 2026-05-29 Kuangji Zuo, Gen Li, Bofan Lyu, Yanshuo Lu, Boyu Ma, Shijia Han, Xinyu Zhou, Xichen Yuan, Chuhao Zhou, Jiaqi Bai, Geng Li, Jianfei Yang

Qwen-VLA: Unifying Vision-Language-Action Modeling across Tasks, Environments, and Robot Embodiments

Embodied intelligence is often studied through specialized models for individual tasks such as manipulation or navigation, resulting in fragmented capabilities and limited generalization across tasks, environments, and robot embodiments. In…

Robotics · Computer Science 2026-05-29 Qiuyue Wang, Mingsheng Li, Jian Guan, Jinhui Ye, Sicheng Xie, Yitao Liu, Junhao Chen, Zhixuan Liang, Jie Zhang, Xintong Hu, Xuhong Huang, Pei Lin, Junyang Lin, Dayiheng Liu, Shuai Bai, Jingren Zhou, Jiazhao Zhang, Haoqi Yuan, Gengze Zhou, Hang Yin, Ye Wang, Yiyang Huang, Zixing Lei, Wujian Peng, Delin Chen, Yingming Zheng, Jingyang Fan, Xianwei Zhuang, Xin Zhou, Haoyang Li, Anzhe Chen, Tong Zhang, Xuejing Liu, Yuchong Sun, Ruizhe Chen, Zhaohai Li, Chenxu Lü, Zhibo Yang, Tao Yu, Xionghui Chen

BORA: Bridging Offline Reinforcement Learning and Online Residual Adaptation for Real-World Dexterous VLA Models

Vision-Language-Action (VLA) models have emerged as a promising paradigm for grounding visual-language understanding into real-world robotic manipulation. However, dexterous manipulation remains challenging for VLA policies due to…

Robotics · Computer Science 2026-05-29 Zhongxi Chen, Yifan Han, Yanming Shao, Huanming Liu, Congsheng Xu, Xiaoyu Chen, Yao Mu, Wenzhao Lian

When Should Models Change Their Minds? Contextual Belief Management in Large Language Models

Long-horizon interactions require language models to manage accumulating information: when to update their state, when to preserve their state, and what to ignore. We study this challenge as Contextual Belief Management (CBM):…

Artificial Intelligence · Computer Science 2026-05-29 Haoming Xu, Weihong Xu, Zongrui Li, Mengru Wang, Yunzhi Yao, Chiyu Wu, Jin Shang, Yu Gong, Shumin Deng

Persona Conditioning of Brand Recommendations in Retrieval-Augmented Commercial Chat: A Prominence-Stratified Cross-Provider Audit

The same prompt -- "best CRM software" -- reaches AI assistants from buyers in widely different contexts: a solo founder, an enterprise VP, a UK SMB owner. We audit how strongly that contextual variation reshapes which brands the model…

Artificial Intelligence · Computer Science 2026-05-29 Will Jack, Noah Lehman, Keller Maloney, Sarah Xu

Double-Edged Sword or Sharp Tool? Designing and Evaluating Triadic LLM-Teacher Collaboration for K-12 Writing at Scale

The double-edged sword of integrating Large Language Models (LLMs) requires an effective triadic collaboration mechanism among LLMs, teachers and students, especially for K-12 education. By developing a triadic collaboration system to…

Artificial Intelligence · Computer Science 2026-05-29 Canran Wang, Yuwen Yang, Zhen Wang, Ming Ma, Ding Yu, Chentai Wang, Keman Huang, Xiaoyong Du

Modularizing Educational LLM-Agency for Fostering Responsible Learning Assistance

The widespread adoption of AI chatbots in education will drastically change learning, making responsible deployment a critical concern. While large language models (LLMs) might have access to sources discussing insights from educational…

Artificial Intelligence · Computer Science 2026-05-29 Julius Gabelmann, Felix Jahn, Kevin Baum, Sophie van Rossum, Emely Wuenscher, Timo P. Gros, Verena Wolf

Unveiling the Visual Counting Bottleneck in Vision-Language Models

While Large Vision-Language Models (VLMs) excel at interpolation, they suffer catastrophic failures in systematic generalization, most notably in visual counting. In this work, we investigate this extrapolation bottleneck by deconstructing…

Multimedia · Computer Science 2026-05-29 Xingzhou Pang, Yifan Hou, Junling Wang, Mrinmaya Sachan

BioRefusalAudit: Auditing Biosecurity Refusal Depth Using General and Domain-Fine-Tuned Sparse Autoencoders

Biosecurity evaluations of language models typically ask whether models produce hazardous output. This paper asks a complementary question: when a model refuses, is that refusal structurally sound, or does it disappear under modest changes…

Artificial Intelligence · Computer Science 2026-05-29 Caleb DeLeeuw