Automating Program Structure Classification

Computers and Society 2021-01-26 v1 Machine Learning

Abstract

When students write programs, their program structure provides insight into their learning process. However, analyzing program structure by hand is time-consuming, and teachers need better tools for computer-assisted exploration of student solutions. As a first step towards an education-oriented program analysis toolkit, we show how supervised machine learning methods can automatically classify student programs into a predetermined set of high-level structures. We evaluate two models on classifying student solutions to the Rainfall problem: a nearest-neighbors classifier using syntax tree edit distance and a recurrent neural network. We demonstrate that these models can achieve 91% classification accuracy when trained on 108 programs. We further explore the generality, trade-offs, and failure cases of each model.
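To make the first of the two models concrete, here is a minimal sketch of a nearest-neighbors classifier over program syntax trees. It is not the authors' implementation. It assumes the student programs are Python source (parsed with the standard `ast` module), and it swaps in a simpler measure for true tree edit distance: the Levenshtein distance between preorder sequences of node types. The labels `single_loop` and `clean_first`, the helper names, and the two training programs are hypothetical and chosen only for illustration.

```python
import ast
from collections import Counter


def preorder(node):
    """Yield the node type names of a syntax tree in preorder."""
    yield type(node).__name__
    for child in ast.iter_child_nodes(node):
        yield from preorder(child)


def edit_distance(a, b):
    """Levenshtein distance between two sequences (a stand-in for tree edit distance)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        curr = [i]
        for j, y in enumerate(b, 1):
            curr.append(min(prev[j] + 1,               # delete x
                            curr[j - 1] + 1,           # insert y
                            prev[j - 1] + (x != y)))   # substitute x with y
        prev = curr
    return prev[-1]


def classify(program, labeled, k=1):
    """Return the majority label among the k training programs closest to `program`."""
    target = list(preorder(ast.parse(program)))
    ranked = sorted(
        labeled,
        key=lambda pair: edit_distance(target, list(preorder(ast.parse(pair[0])))),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical training set: (source, structure label) pairs.
training = [
    ("total = 0\ncount = 0\nfor x in data:\n    if x >= 0:\n"
     "        total += x\n        count += 1\n", "single_loop"),
    ("clean = [x for x in data if x >= 0]\n"
     "total = sum(clean)\ncount = len(clean)\n", "clean_first"),
]

# A new solution with different variable names but the same structure.
query = ("s = 0\nn = 0\nfor v in xs:\n    if v >= 0:\n"
         "        s += v\n        n += 1\n")
label = classify(query, training)
```

Because the distance compares only node types, renaming variables does not change it, so `query` sits at distance 0 from the first training program. A real tree edit distance would also account for the parent-child structure that this flattened sequence discards.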

Cite

@article{arxiv.2101.10087,
  title  = {Automating Program Structure Classification},
  author = {Will Crichton and Georgia Gabriela Sampaio and Pat Hanrahan},
  journal= {arXiv preprint arXiv:2101.10087},
  year   = {2021}
}

Comments

To appear at SIGCSE 2021