Related papers: Creating a Trajectory for Code Writing: Algorithmi…

ART: Automatic multi-step reasoning and tool-use for large language models

Large language models (LLMs) can perform complex reasoning in few- and zero-shot settings by generating intermediate chain of thought (CoT) reasoning steps. Further, each reasoning step can rely on external tools to support computation…

Computation and Language · Computer Science 2023-03-17 Bhargavi Paranjape, Scott Lundberg, Sameer Singh, Hannaneh Hajishirzi, Luke Zettlemoyer, Marco Tulio Ribeiro

Integrating Natural Language Prompting Tasks in Introductory Programming Courses

Introductory programming courses often emphasize mastering syntax and basic constructs before progressing to more complex and interesting programs. This bottom-up approach can be frustrating for novices, shifting the focus away from problem…

Computers and Society · Computer Science 2024-10-07 Chris Kerslake, Paul Denny, David H Smith, James Prather, Juho Leinonen, Andrew Luxton-Reilly, Stephen MacNeil

AR-LSAT: Investigating Analytical Reasoning of Text

Analytical reasoning is an essential and challenging task that requires a system to analyze a scenario involving a set of particular circumstances and perform reasoning over it to make conclusions. In this paper, we study the challenge of…

Computation and Language · Computer Science 2021-04-16 Wanjun Zhong, Siyuan Wang, Duyu Tang, Zenan Xu, Daya Guo, Jiahai Wang, Jian Yin, Ming Zhou, Nan Duan

Between Tool and Trouble: Student Attitudes Toward AI in Programming Education

This study examines how AI code assistants shape novice programmers' experiences during a two-part exam in an introductory programming course. In the first part, students completed a programming task with access to AI support; in the second,…

Emerging Technologies · Computer Science 2025-08-11 Sergio Rojas-Galeano, Julian Tejada, Fernando Marmolejo-Ramos

Exploring the Role of Tracing in AI-Supported Planning for Algorithmic Reasoning

AI-powered planning tools show promise in supporting programming learners by enabling early, formative feedback on their thinking processes prior to coding. To date, however, most AI-supported planning tools rely on students'…

Human-Computer Interaction · Computer Science 2026-02-04 Yoshee Jain, Heejin Do, Zihan Wu, April Yi Wang

Reasoning Steps as Curriculum: Using Depth of Thought as a Difficulty Signal for Tuning LLMs

Curriculum learning for training LLMs requires a difficulty signal that aligns with reasoning while remaining scalable and interpretable. We propose a simple premise: tasks that demand deeper depth of thought for humans should also be…

Machine Learning · Computer Science 2025-08-27 Jeesu Jung, Sangkeun Jung

Towards Effective Code-Integrated Reasoning

In this paper, we investigate code-integrated reasoning, where models generate code when necessary and integrate feedback by executing it through a code interpreter. To acquire this capability, models must learn when and how to use external…

Computation and Language · Computer Science 2025-06-02 Fei Bai, Yingqian Min, Beichen Zhang, Zhipeng Chen, Wayne Xin Zhao, Lei Fang, Zheng Liu, Zhongyuan Wang, Ji-Rong Wen

Demystifying Errors in LLM Reasoning Traces: An Empirical Study of Code Execution Simulation

Understanding a program's runtime reasoning behavior, meaning how intermediate states and control flows lead to final execution results, is essential for reliable code generation, debugging, and automated reasoning. Although large language…

Software Engineering · Computer Science 2025-12-02 Mohammad Abdollahi, Khandaker Rifah Tasnia, Soumit Kanti Saha, Jinqiu Yang, Song Wang, Hadi Hemmati

Exploring Programming Task Creation of Primary School Teachers in Training

Introducing computational thinking in primary school curricula implies that teachers have to prepare appropriate lesson material. Typically this includes creating programming tasks, which may overwhelm primary school teachers with lacking…

Computers and Society · Computer Science 2023-06-27 Luisa Greifenstein, Ute Heuer, Gordon Fraser

A survey on grading format of automated grading tools for programming assignments

The prevalence of online platforms and studies has generated the demand for automated grading tools, and as a result, there are plenty in the market. Such tools are developed to grade coding assignments quickly, accurately, and…

Computers and Society · Computer Science 2022-12-06 Aditi Agrawal, Benjamin Reed

Code Prompting: a Neural Symbolic Method for Complex Reasoning in Large Language Models

Large language models (LLMs) have scaled up to unlock a wide range of complex reasoning tasks with the aid of various prompting methods. However, current prompting methods generate natural language intermediate steps to help reasoning,…

Computation and Language · Computer Science 2023-10-10 Yi Hu, Haotong Yang, Zhouchen Lin, Muhan Zhang

Learning Guided Automated Reasoning: A Brief Survey

Automated theorem provers and formal proof assistants are general reasoning systems that are in theory capable of proving arbitrarily hard theorems, thus solving arbitrary problems reducible to mathematics and logical reasoning. In…

Artificial Intelligence · Computer Science 2025-06-23 Lasse Blaauwbroek, David Cerna, Thibault Gauthier, Jan Jakubův, Cezary Kaliszyk, Martin Suda, Josef Urban

Teaching Language Models to Reason with Tools

Large reasoning models (LRMs) like OpenAI-o1 have shown impressive capabilities in natural language reasoning. However, these models frequently demonstrate inefficiencies or inaccuracies when tackling complex mathematical operations. While…

Computation and Language · Computer Science 2025-10-24 Chengpeng Li, Zhengyang Tang, Ziniu Li, Mingfeng Xue, Keqin Bao, Tian Ding, Ruoyu Sun, Benyou Wang, Xiang Wang, Junyang Lin, Dayiheng Liu

Evaluating LLMs' Mathematical and Coding Competency through Ontology-guided Interventions

Recent advancements in Large Language Models (LLMs) have showcased striking results on existing logical reasoning benchmarks, with some models even surpassing human performance. However, the true depth of their competencies and robustness…

Computation and Language · Computer Science 2024-11-05 Pengfei Hong, Navonil Majumder, Deepanway Ghosal, Somak Aditya, Rada Mihalcea, Soujanya Poria

Open-Book Neural Algorithmic Reasoning

Neural algorithmic reasoning is an emerging area of machine learning that focuses on building neural networks capable of solving complex algorithmic tasks. Recent advancements predominantly follow the standard supervised learning paradigm…

Machine Learning · Computer Science 2025-01-03 Hefei Li, Chao Peng, Chenyang Xu, Zhengfeng Yang

Analysing Mathematical Reasoning Abilities of Neural Models

Mathematical reasoning---a core ability within human intelligence---presents some unique challenges as a domain: we do not come to understand and solve mathematical problems primarily on the back of experience and evidence, but on the basis…

Machine Learning · Computer Science 2019-04-03 David Saxton, Edward Grefenstette, Felix Hill, Pushmeet Kohli

Knowledge Markers: An AI-Agnostic Concept for the Design of Programming Courses

Generative AI enables students to produce plausible code quickly. Producing working code is therefore no longer a reliable indicator of understanding. This is particularly problematic in non-computer-science programmes, where time…

Computers and Society · Computer Science 2026-04-09 Christina Maria Mayr

ART: Action-based Reasoning Task Benchmarking for Medical AI Agents

Reliable clinical decision support requires medical AI agents capable of safe, multi-step reasoning over structured electronic health records (EHRs). While large language models (LLMs) show promise in healthcare, existing benchmarks…

Artificial Intelligence · Computer Science 2026-01-15 Ananya Mantravadi, Shivali Dalmia, Abhishek Mukherji

CodeARC: Benchmarking Reasoning Capabilities of LLM Agents for Inductive Program Synthesis

Inductive program synthesis, or programming by example, requires synthesizing functions from input-output examples that generalize to unseen inputs. While large language model agents have shown promise in programming tasks guided by natural…

Programming Languages · Computer Science 2025-08-11 Anjiang Wei, Tarun Suresh, Jiannan Cao, Naveen Kannan, Yuheng Wu, Kai Yan, Thiago S. F. X. Teixeira, Ke Wang, Alex Aiken

Code Interviews: Design and Evaluation of a More Authentic Assessment for Introductory Programming Assignments

Generative artificial intelligence poses new challenges around assessment, increasingly driving introductory programming educators to employ invigilated exams. But exams do not afford more authentic programming experiences that involve…

Computers and Society · Computer Science 2024-11-19 Suhas Kannam, Yuri Yang, Aarya Dharm, Kevin Lin