Computer Science

Physics Is All You Need? A Case Study in Physicist-Supervised AI Development of Scientific Software

Are AI agents tools, co-authors, or researchers? We present a quantified case study ($N=1$): a physicist supervising an AI coding agent (Claude Code, Sonnet and Opus models) over 12 work days and 57 sessions to build CLAX-PT, a…

Artificial Intelligence · Computer Science 2026-05-29 Nhat-Minh Nguyen

SchGen: PCB Schematic Generation with Semantic-Grounded Code Representations

Printed circuit board (PCB) schematic design defines nearly all electronic hardware, but it remains manual and expertise-intensive. While generative AI has advanced digital and analog IC design, PCB schematic generation from…

Artificial Intelligence · Computer Science 2026-05-29 Qinpei Luo, Ruichun Ma, Xinyu Zhang, Lili Qiu

Tiny but Trusted: Efficient Vision-Language Reasoning for Time-Series Anomaly Detection

Recent advances in Vision-Language Models (VLMs) have achieved impressive performance across many tasks, yet prior studies report unsatisfactory performance when applying large language or multimodal models to finding abnormal patterns in…

Artificial Intelligence · Computer Science 2026-05-29 Xiaona Zhou, Muntasir Wahed, Tianjiao Yu, Constantin Brif, Ismini Lourentzou

Locally Coherent, Globally Incoherent: Bounding Compositional Incoherence in Multi-Component LLM Agents

Multi-component LLM agents assemble probabilistic claims from components that each see only part of a joint problem; the composition can violate basic probability axioms even when every component is locally coherent. We formalise this…

Artificial Intelligence · Computer Science 2026-05-29 Anany Kotawala

Demystifying Data Organization for Enhanced LLM Training

Large Language Models (LLMs) have revolutionized various fields, yet their training efficiency is heavily reliant on effective data curation. While data selection has been widely studied, the strategic data organization for enhanced…

Artificial Intelligence · Computer Science 2026-05-29 Yalun Dai, Yangyu Huang, Tongshen Yang, Yonghan Wang, Xin Zhang, Wenshan Wu, Qihao Zhao, Hao Li, Yuanyuan Gao, Kim-Hui Yap, Scarlett Li

MIRA: Mid-training Rubric Anchoring for Source-Aware Data Selection

Mid-training has become an important stage in modern LLM development, using large-scale curated mixtures to strengthen capabilities before final post-training. Its data selection problem is distinct: the data are optimized under a…

Artificial Intelligence · Computer Science 2026-05-29 Haowen Wang, Yaxin Du, Jian Yang, Jiajun Wu, Shukai Liu, Yuxuan Zhang, Pingjie Wang, Siheng Chen, Tuney Zheng, Ming Zhou, Xianglong Liu

ProjectionBench: Evaluating Scientific Hypothesis Generation in LLMs Under Progressive Information Disclosure

Scientific discovery is an inherently creative and uncertain process, requiring reasoning beyond the recall of known knowledge. While many benchmarks have been proposed to evaluate large language model (LLM) performance on deep research…

Artificial Intelligence · Computer Science 2026-05-29 A. J. Lew, Y. Cao, M. J. Buehler

mcp-proto-okn: Natural-language access to open scientific knowledge graphs through the Model Context Protocol

MCP Server Proto-OKN (mcp-proto-okn) is a Python-based Model Context Protocol server that enables AI assistants to discover, inspect, query and integrate scientific knowledge graphs through natural language. The server provides graph…

Artificial Intelligence · Computer Science 2026-05-29 Peter W. Rose, Benjamin M. Good, Amanda M. Saravia-Butler, Charlotte A. Nelson, James P. Balhoff, Yaphet Kebede, Patricia L. Whetzel, Christopher Bizon, Andrew I. Su, Sergio E. Baranzini

LLUMI: Improving LLM Writing Assistance for Mental Health Support with Online Community Feedback

Large language models (LLMs) show promise in generating supportive responses for mental health queries, but improving their usefulness, empathy, and safety often requires substantial compute, expert input, and labeled data. At the same…

Human-Computer Interaction · Computer Science 2026-05-29 Jiwon Kim, Maya Ajit, Sherry Gong, Soorya Ram Shimgekar, Dong Whi Yoo, Eshwar Chandrasekharan, Koustuv Saha

When Should Models Change Their Minds? Contextual Belief Management in Large Language Models

Long-horizon interactions require language models to manage accumulating information: when to update their state, when to preserve their state, and what to ignore. We study this challenge as Contextual Belief Management (CBM):…

Artificial Intelligence · Computer Science 2026-05-29 Haoming Xu, Weihong Xu, Zongrui Li, Mengru Wang, Yunzhi Yao, Chiyu Wu, Jin Shang, Yu Gong, Shumin Deng

Persona Conditioning of Brand Recommendations in Retrieval-Augmented Commercial Chat: A Prominence-Stratified Cross-Provider Audit

The same prompt -- "best CRM software" -- reaches AI assistants from buyers in widely different contexts: a solo founder, an enterprise VP, a UK SMB owner. We audit how strongly that contextual variation reshapes which brands the model…

Artificial Intelligence · Computer Science 2026-05-29 Will Jack, Noah Lehman, Keller Maloney, Sarah Xu

Double-Edged Sword or Sharp Tool? Designing and Evaluating Triadic LLM-Teacher Collaboration for K-12 Writing at Scale

The double-edged sword of integrating Large Language Models (LLMs) requires an effective triadic collaboration mechanism among LLMs, teachers and students, especially for K-12 education. By developing a triadic collaboration system to…

Artificial Intelligence · Computer Science 2026-05-29 Canran Wang, Yuwen Yang, Zhen Wang, Ming Ma, Ding Yu, Chentai Wang, Keman Huang, Xiaoyong Du

Modularizing Educational LLM-Agency for Fostering Responsible Learning Assistance

The widespread adoption of AI chatbots in education will drastically change learning, making responsible deployment a critical concern. While large language models (LLMs) might have access to sources discussing insights from educational…

Artificial Intelligence · Computer Science 2026-05-29 Julius Gabelmann, Felix Jahn, Kevin Baum, Sophie van Rossum, Emely Wuenscher, Timo P. Gros, Verena Wolf

BioRefusalAudit: Auditing Biosecurity Refusal Depth Using General and Domain-Fine-Tuned Sparse Autoencoders

Biosecurity evaluations of language models typically ask whether models produce hazardous output. This paper asks a complementary question: when a model refuses, is that refusal structurally sound, or does it disappear under modest changes…

Artificial Intelligence · Computer Science 2026-05-29 Caleb DeLeeuw

Meta-Cognitive Memory Policy Optimization for Long-Horizon LLM Agents

Memory-augmented LLM agents tackle complex long-horizon tasks by recursively summarizing interaction trajectories into compact memory. However, existing approaches typically train these memory policies using outcome-based reinforcement…

Artificial Intelligence · Computer Science 2026-05-29 Ziyan Liu, Zhezheng Hao, Yeqiu Chen, Hong Wang, Jingren Hou, Ruiyi Ding, Yongkang Yang, Wence Ji, Wei Xia, Feng Liu

Temporal Stability and Few-Shot Prompting in Math Task Assessment

As AI tools become increasingly integrated into educational contexts, questions arise about both their stability over time and their responsiveness to prompt engineering techniques. This longitudinal study focused on different AI tools'…

Artificial Intelligence · Computer Science 2026-05-29 Danielle S. Fox, Brenda L. Robles, Elizabeth DiPietro Brovey, Christian D. Schunn

Anchorless Diversification for Parallel LLM Ideation

LLMs are increasingly used to generate candidate-idea pools for creative tasks where broad exploration is valuable. Parallel inference can be attractive in this setting when it broadens the pool while retaining quality and cost efficiency.…

Artificial Intelligence · Computer Science 2026-05-29 Fares Nabil Ibrahim, Nafis Saami Azad, Raiyan Abdul Baten

AgentSchool: An LLM-Powered Multi-Agent Simulation for Education

Despite the rapid deployment of LLMs into classrooms, validating educational AI remains uniquely intractable: interventions act on developing learners whose cognitive and social trajectories are irreversibly shaped, while real-world trials…

Artificial Intelligence · Computer Science 2026-05-29 Yulei Ye, Wenhao Li, Zhong Wen, Yunshu Huang, Yichen Hu, Zifan Wei, Yige Wang, Xinyu Xie, Haoxuan Yang, Yanjun Huang, Ruijia Li, Hong Qian, Yu Song, Bo Jiang, Bingdong Li, Lijun Li, Bo Zhang, Pinlong Cai, Xingcheng Xu, Shuangye Chen, Xia Hu, Liang He, Aimin Zhou, Jingjing Qu, Jing Shao, Xiangfeng Wang

Enhancing Multi-Agent Communication through Attention Steering with Context Relevance

LLM-based multi-agent systems have demonstrated remarkable performance on complex tasks through collaborative reasoning. However, these systems tend to rapidly accumulate extremely long conversation histories during interaction. As…

Artificial Intelligence · Computer Science 2026-05-29 Hongxiang Zhang, Yuan Tian, Tianyi Zhang

REACT: A Conditioning Framework for User-Adaptive sEMG Hand Pose Estimation

Surface electromyography (sEMG) enables continuous hand pose estimation on wearable devices, but models trained on multi-user corpora degrade on unseen individuals due to inter-user variability in anatomy and electrode placement. We propose…

Human-Computer Interaction · Computer Science 2026-05-29 Eric Xie, Hei Shing Cheung