Computer Science

Physics Is All You Need? A Case Study in Physicist-Supervised AI Development of Scientific Software

Are AI agents tools, co-authors, or researchers? We present a quantified case study ($N=1$): a physicist supervising an AI coding agent (Claude Code, Sonnet and Opus models) over 12 work days and 57 sessions to build CLAX-PT, a…

Artificial Intelligence · Computer Science 2026-05-29 Nhat-Minh Nguyen

SchGen: PCB Schematic Generation with Semantic-Grounded Code Representations

Printed circuit board (PCB) schematic design defines nearly all electronic hardware, but it remains manual and expertise-intensive. While generative AI has advanced digital and analog IC design, PCB schematic generation from…

Artificial Intelligence · Computer Science 2026-05-29 Qinpei Luo, Ruichun Ma, Xinyu Zhang, Lili Qiu

Tiny but Trusted: Efficient Vision-Language Reasoning for Time-Series Anomaly Detection

Recent advances in Vision-Language Models (VLMs) have achieved impressive performance across many tasks, yet prior studies report unsatisfactory performance when applying large language or multimodal models to finding abnormal patterns in…

Artificial Intelligence · Computer Science 2026-05-29 Xiaona Zhou, Muntasir Wahed, Tianjiao Yu, Constantin Brif, Ismini Lourentzou

Locally Coherent, Globally Incoherent: Bounding Compositional Incoherence in Multi-Component LLM Agents

Multi-component LLM agents assemble probabilistic claims from components that each see only part of a joint problem; the composition can violate basic probability axioms even when every component is locally coherent. We formalise this…

Artificial Intelligence · Computer Science 2026-05-29 Anany Kotawala

Demystifying Data Organization for Enhanced LLM Training

Large Language Models (LLMs) have revolutionized various fields, yet their training efficiency is heavily reliant on effective data curation. While data selection has been widely studied, the strategic data organization for enhanced…

Artificial Intelligence · Computer Science 2026-05-29 Yalun Dai, Yangyu Huang, Tongshen Yang, Yonghan Wang, Xin Zhang, Wenshan Wu, Qihao Zhao, Hao Li, Yuanyuan Gao, Kim-Hui Yap, Scarlett Li

Zero-Scan Data Quality: Leveraging Table Format Metadata for Continuous Observability at Scale

Modern table formats such as Apache Iceberg compute and store metadata (commit timestamps, record counts, and column-level statistics such as null counts and value bounds) at write time as part of file writing. These statistics serve query…

Databases · Computer Science 2026-05-29 Mohit Verma, Shantanu Rawat, Christian Bush, Sumedh Sakdeo, Lokesh Amarnath Ravindranathan, Dwarak Bakshi

RAFI -- A Ray/Work Forwarding Infrastructure for Data Parallel Multi-Node/Multi-GPU Computing

We present RaFI, a CUDA and MPI based software framework that simplifies the task of building GPU-enabled data-parallel software where rays or similar work items need to migrate between different GPUs. RaFI provides a simple interface for…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-05-29 Ingo Wald, Serkan Demirci, Alper Sahistan, Stefan Zellmann, Andrea Paris, Patrick Moran, Milan Jaros, Tatiana von Landesberger, Ugur Gudukbay, Valerio Pascucci

MIRA: Mid-training Rubric Anchoring for Source-Aware Data Selection

Mid-training has become an important stage in modern LLM development, using large-scale curated mixtures to strengthen capabilities before final post-training. Its data selection problem is distinct: the data are optimized under a…

Artificial Intelligence · Computer Science 2026-05-29 Haowen Wang, Yaxin Du, Jian Yang, Jiajun Wu, Shukai Liu, Yuxuan Zhang, Pingjie Wang, Siheng Chen, Tuney Zheng, Ming Zhou, Xianglong Liu

ProjectionBench: Evaluating Scientific Hypothesis Generation in LLMs Under Progressive Information Disclosure

Scientific discovery is an inherently creative and uncertain process, requiring reasoning beyond the recall of known knowledge. While many benchmarks have been proposed to evaluate large language model (LLM) performance on deep research…

Artificial Intelligence · Computer Science 2026-05-29 A. J. Lew, Y. Cao, M. J. Buehler

mcp-proto-okn: Natural-language access to open scientific knowledge graphs through the Model Context Protocol

MCP Server Proto-OKN (mcp-proto-okn) is a Python-based Model Context Protocol server that enables AI assistants to discover, inspect, query and integrate scientific knowledge graphs through natural language. The server provides graph…

Artificial Intelligence · Computer Science 2026-05-29 Peter W. Rose, Benjamin M. Good, Amanda M. Saravia-Butler, Charlotte A. Nelson, James P. Balhoff, Yaphet Kebede, Patricia L. Whetzel, Christopher Bizon, Andrew I. Su, Sergio E. Baranzini

When Should Models Change Their Minds? Contextual Belief Management in Large Language Models

Long-horizon interactions require language models to manage accumulating information: when to update their state, when to preserve their state, and what to ignore. We study this challenge as Contextual Belief Management (CBM):…

Artificial Intelligence · Computer Science 2026-05-29 Haoming Xu, Weihong Xu, Zongrui Li, Mengru Wang, Yunzhi Yao, Chiyu Wu, Jin Shang, Yu Gong, Shumin Deng

Persona Conditioning of Brand Recommendations in Retrieval-Augmented Commercial Chat: A Prominence-Stratified Cross-Provider Audit

The same prompt -- "best CRM software" -- reaches AI assistants from buyers in widely different contexts: a solo founder, an enterprise VP, a UK SMB owner. We audit how strongly that contextual variation reshapes which brands the model…

Artificial Intelligence · Computer Science 2026-05-29 Will Jack, Noah Lehman, Keller Maloney, Sarah Xu

Double-Edged Sword or Sharp Tool? Designing and Evaluating Triadic LLM-Teacher Collaboration for K-12 Writing at Scale

The double-edged sword of integrating Large Language Models (LLMs) requires an effective triadic collaboration mechanism among LLMs, teachers and students, especially for K-12 education. By developing a triadic collaboration system to…

Artificial Intelligence · Computer Science 2026-05-29 Canran Wang, Yuwen Yang, Zhen Wang, Ming Ma, Ding Yu, Chentai Wang, Keman Huang, Xiaoyong Du

Modularizing Educational LLM-Agency for Fostering Responsible Learning Assistance

The widespread adoption of AI chatbots in education will drastically change learning, making responsible deployment a critical concern. While large language models (LLMs) might have access to sources discussing insights from educational…

Artificial Intelligence · Computer Science 2026-05-29 Julius Gabelmann, Felix Jahn, Kevin Baum, Sophie van Rossum, Emely Wuenscher, Timo P. Gros, Verena Wolf

BioRefusalAudit: Auditing Biosecurity Refusal Depth Using General and Domain-Fine-Tuned Sparse Autoencoders

Biosecurity evaluations of language models typically ask whether models produce hazardous output. This paper asks a complementary question: when a model refuses, is that refusal structurally sound, or does it disappear under modest changes…

Artificial Intelligence · Computer Science 2026-05-29 Caleb DeLeeuw

Meta-Cognitive Memory Policy Optimization for Long-Horizon LLM Agents

Memory-augmented LLM agents tackle complex long-horizon tasks by recursively summarizing interaction trajectories into compact memory. However, existing approaches typically train these memory policies using outcome-based reinforcement…

Artificial Intelligence · Computer Science 2026-05-29 Ziyan Liu, Zhezheng Hao, Yeqiu Chen, Hong Wang, Jingren Hou, Ruiyi Ding, Yongkang Yang, Wence Ji, Wei Xia, Feng Liu

The Missing Dimensions in Geo-Distributed Database Evaluation

Geo-distributed OLTP databases are widely deployed across cloud regions, yet current evaluation practices do not cover the challenges of this aspect. Existing benchmarks assume stable network conditions; they lack explicit settings for data…

Databases · Computer Science 2026-05-29 Oto Mraz, Kyriakos Psarakis, George Christodoulou, Paris Carbone, Asterios Katsifodimos

Temporal Stability and Few-Shot Prompting in Math Task Assessment

As AI tools become increasingly integrated into educational contexts, questions arise about both their stability over time and their responsiveness to prompt engineering techniques. This longitudinal study focused on different AI tools'…

Artificial Intelligence · Computer Science 2026-05-29 Danielle S. Fox, Brenda L. Robles, Elizabeth DiPietro Brovey, Christian D. Schunn

Anchorless Diversification for Parallel LLM Ideation

LLMs are increasingly used to generate candidate-idea pools for creative tasks where broad exploration is valuable. Parallel inference can be attractive in this setting when it broadens the pool while retaining quality and cost efficiency.…

Artificial Intelligence · Computer Science 2026-05-29 Fares Nabil Ibrahim, Nafis Saami Azad, Raiyan Abdul Baten

AgentSchool: An LLM-Powered Multi-Agent Simulation for Education

Despite the rapid deployment of LLMs into classrooms, validating educational AI remains uniquely intractable: interventions act on developing learners whose cognitive and social trajectories are irreversibly shaped, while real-world trials…

Artificial Intelligence · Computer Science 2026-05-29 Yulei Ye, Wenhao Li, Zhong Wen, Yunshu Huang, Yichen Hu, Zifan Wei, Yige Wang, Xinyu Xie, Haoxuan Yang, Yanjun Huang, Ruijia Li, Hong Qian, Yu Song, Bo Jiang, Bingdong Li, Lijun Li, Bo Zhang, Pinlong Cai, Xingcheng Xu, Shuangye Chen, Xia Hu, Liang He, Aimin Zhou, Jingjing Qu, Jing Shao, Xiangfeng Wang