Related papers: Rethinking Scientific Modeling: Toward Physically …

200 papers

Cyber-Physical Systems (CPS) produce behavior through execution on substrates coupling computation with physical processes. However, usual engineering approaches do not treat execution semantics as first-class engineering entities. Formal…

Software Engineering · Computer Science 2026-04-16 Alexandre Muzy

Large Language Models (LLMs) are increasingly entering specialized, safety-critical engineering workflows governed by strict quantitative standards and immutable physical laws, making rigorous evaluation of their reasoning capabilities…

Computation and Language · Computer Science 2026-01-08 Ayesha Gull, Muhammad Usman Safder, Rania Elbadry, Fan Zhang, Veselin Stoyanov, Preslav Nakov, Zhuohan Xie

As Large Language Models (LLMs) become integral to software development workflows, their ability to generate structured outputs has become critically important. We introduce StructEval, a comprehensive benchmark for evaluating LLMs'…

Machine learning has emerged as a significant approach to efficiently tackle electronic structure problems. Despite its potential, there is less guarantee for the model to generalize to unseen data that hinders its application in real-world…

Machine Learning · Computer Science 2024-02-16 Gengyuan Hu, Gengchen Wei, Zekun Lou, Philip H. S. Torr, Wanli Ouyang, Han-sen Zhong, Chen Lin

Complex systems can be modelled at various levels of detail. Ideally, causal models of the same system should be consistent with one another in the sense that they agree in their predictions of the effects of interventions. We formalise…

The opportunities offered by LLM coders (and their current limitations) demand a reevaluation of how software is structured. Software today is often "illegible" - lacking a direct correspondence between code and observed behavior - and…

Software Engineering · Computer Science 2025-08-29 Eagon Meng, Daniel Jackson

Predictive benchmarking, the evaluation of machine learning models based on predictive performance and competitive ranking, is a central epistemic practice in machine learning research and an increasingly prominent method for scientific…

Machine Learning · Computer Science 2025-10-28 Timo Freiesleben, Sebastian Zezulka

This paper presents the development of a documented program capable of solving idealized beam models, such as those commonly used in textbooks and academic exercises, from drawings made by a person. The system is based on computer vision…

Computer Vision and Pattern Recognition · Computer Science 2026-03-24 Altamirano-Muñiz Emilio Fernando

The ability of Large Language Models (LLMs) to precisely follow complex and fine-grained lexical instructions is a cornerstone of their utility and controllability. However, evaluating this capability remains a significant challenge.…

Computation and Language · Computer Science 2026-03-24 Huimin Ren, Yan Liang, Baiqiao Su, Chaobo Sun, Hengtong Lu, Kaike Zhang, Chen Wei

The rapid advancement of large language models (LLMs) demands robust, unbiased, and scalable evaluation methods. However, human annotations are costly to scale, model-based evaluations are susceptible to stylistic biases, and…

Model-driven engineering is the automatic production of software artefacts from abstract models of structure and functionality. By targeting a specific class of system, it is possible to automate aspects of the development process, using…

Software Engineering · Computer Science 2013-01-03 Chen-Wei Wang, Jim Davies

Recent progress in Large Language Models (LLMs) has substantially advanced the automation of software engineering (SE) tasks, enabling complex activities such as code generation and code summarization. However, the black-box nature of LLMs…

Software Engineering · Computer Science 2025-12-24 Antonio Vitale, Khai-Nguyen Nguyen, Denys Poshyvanyk, Rocco Oliveto, Simone Scalabrino, Antonio Mastropaolo

Building Information Modeling (BIM) produces three-dimensional models of buildings combining the geometrical information with a wide range of properties. BIM is slowly but inevitably revolutionizing the architecture, engineering, and…

Logic in Computer Science · Computer Science 2022-05-19 Joaquín Arias, Seppo Törmä, Manuel Carro, Gopal Gupta

Large Language Models (LLMs) have demonstrated remarkable capabilities across various domains, including software development, education, and technical assistance. Among these, software development is one of the key areas where LLMs are…

Computation and Language · Computer Science 2026-01-07 Inpyo Song, Eunji Jeon, Jangwon Lee

As LLMs advance their reasoning capabilities about the physical world, the absence of rigorous benchmarks for evaluating their ability to generate scientifically valid physical models has become a critical gap. Computational mechanics,…

Machine Learning · Computer Science 2025-12-25 Saeed Mohammadzadeh, Erfan Hamdi, Joel Shor, Emma Lejeune

Building precise simulations of the real world and invoking numerical solvers to answer quantitative problems is an essential requirement in engineering and science. We present FEABench, a benchmark to evaluate the ability of large language…

Artificial Intelligence · Computer Science 2025-04-09 Nayantara Mudur, Hao Cui, Subhashini Venugopalan, Paul Raccuglia, Michael P. Brenner, Peter Norgaard

While Large Language Models (LLMs) demonstrate impressive performance in mathematics, existing math benchmarks come with significant limitations. Many focus on problems with fixed ground-truth answers, and are often saturated due to problem…

Artificial Intelligence · Computer Science 2025-10-02 Mislav Balunović, Jasper Dekoninck, Nikola Jovanović, Ivo Petrov, Martin Vechev

Large language models (LLMs) have exhibited remarkable capabilities across diverse open-domain tasks, yet their application in specialized domains such as civil engineering remains largely unexplored. This paper starts bridging this gap by…

Computation and Language · Computer Science 2025-07-08 Jiachen Liu, Ziheng Geng, Ran Cao, Lu Cheng, Paolo Bocchini, Minghui Cheng

Medical large language models (LLMs) research often makes bold claims, from encoding clinical knowledge to reasoning like a physician. These claims are usually backed by evaluation on competitive benchmarks; a tradition inherited from…

Computation and Language · Computer Science 2025-03-17 Ahmed Alaa, Thomas Hartvigsen, Niloufar Golchini, Shiladitya Dutta, Frances Dean, Inioluwa Deborah Raji, Travis Zack

Large language model (LLM) simulations of human behavior have the potential to revolutionize the social and behavioral sciences, if and only if they faithfully reflect real human behaviors. Current evaluations of simulation fidelity are…

Computation and Language · Computer Science 2026-04-14 Tiancheng Hu, Joachim Baumann, Lorenzo Lupo, Nigel Collier, Dirk Hovy, Paul Röttger