Vision-Guided Iterative Refinement for Frontend Code Generation

Artificial Intelligence 2026-04-08 v1

Abstract

Code generation with large language models often relies on multi-stage human-in-the-loop refinement, which is effective but costly, particularly in domains such as frontend web development where solution quality depends on rendered visual output. We present a fully automated critic-in-the-loop framework in which a vision-language model (VLM) serves as a visual critic, providing structured feedback on rendered webpages to guide iterative refinement of the generated code. Across real-world user requests from the WebDev Arena dataset, this approach yields consistent improvements in solution quality, achieving up to a 17.8% increase in performance over three refinement cycles. We then investigate parameter-efficient fine-tuning with LoRA to understand whether the improvements provided by the critic can be internalized by the code-generating LLM. Fine-tuning achieves 25% of the gains from the best critic-in-the-loop solution without a significant increase in token counts. Our findings indicate that automated, VLM-based critique of frontend code generation leads to significantly higher-quality solutions than can be achieved through a single LLM inference pass, and highlight the importance of iterative refinement for the complex visual outputs associated with web development.
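
The critic-in-the-loop cycle described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `refine_with_critic`, the `Critique` schema, the four injected callables, and the early-stopping rule are all assumptions introduced here; only the default of three refinement cycles comes from the abstract.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Critique:
    """Structured feedback from the visual critic (hypothetical schema)."""
    issues: List[str]


def refine_with_critic(
    request: str,
    generate: Callable[[str], str],               # LLM: user request -> frontend code
    render: Callable[[str], bytes],               # renderer: code -> screenshot bytes
    critique: Callable[[str, bytes], Critique],   # VLM: request + screenshot -> feedback
    revise: Callable[[str, str, Critique], str],  # LLM: request + code + feedback -> new code
    max_cycles: int = 3,
) -> List[str]:
    """Return every code version produced, starting with the initial draft."""
    code = generate(request)
    versions = [code]
    for _ in range(max_cycles):
        screenshot = render(code)
        feedback = critique(request, screenshot)
        if not feedback.issues:
            # Assumption: stop early once the critic reports no issues.
            break
        code = revise(request, code, feedback)
        versions.append(code)
    return versions
```

Passing the model and renderer in as plain callables keeps the sketch independent of any particular LLM, VLM, or browser automation library, so each component can be swapped or stubbed out for testing.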

Cite

@article{arxiv.2604.05839,
  title  = {Vision-Guided Iterative Refinement for Frontend Code Generation},
  author = {Hannah Sansford and Derek H. C. Law and Wei Liu and Abhishek Tripathi and Niresh Agarwal and Gerrit J. J. van den Burg},
  journal= {arXiv preprint arXiv:2604.05839},
  year   = {2026}
}

Comments

Accepted at ICLR 2026 Workshop on AI with Recursive Self-Improvement