Refactoring Codebases through Library Design

Software Engineering 2025-10-07 v3 Artificial Intelligence

Abstract

Maintainable and general software allows developers to build robust applications efficiently, yet achieving these qualities often requires refactoring specialized solutions into reusable components. This challenge becomes particularly relevant as code agents are increasingly used to solve isolated, one-off programming problems. We investigate code agents' capacity to refactor code in ways that support growth and reusability. We first ask what makes a good refactoring, finding via simulation results and a human study that Minimum Description Length best correlates with preferable refactorings. We then present both a benchmark and a method for refactoring: MiniCode, a benchmark where multiple files must be refactored into a shared library, and Librarian, a sample-and-rerank method for generating reusable libraries. We compare Librarian to state-of-the-art library generation methods, and study it on real-world codebases.
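
To make the sample-and-rerank idea concrete, the sketch below filters candidate refactorings by a correctness check and keeps the one with the smallest description length. It is only an illustration under stated assumptions: the abstract does not say how Librarian measures description length, so the zlib-compressed size used here is a stand-in proxy, and the names `description_length`, `rerank_by_mdl`, `passes_tests`, and `measure` are hypothetical.

```python
import zlib
from typing import Callable, Iterable, Optional


def description_length(source: str) -> int:
    """Crude description-length proxy: compressed size in bytes.

    Assumption: this is NOT the measure used in the paper, only a
    convenient stand-in that needs nothing beyond the standard library.
    """
    return len(zlib.compress(source.encode("utf-8"), 9))


def rerank_by_mdl(
    candidates: Iterable[str],
    passes_tests: Callable[[str], bool],
    measure: Callable[[str], int] = description_length,
) -> Optional[str]:
    """Return the shortest candidate (under `measure`) that passes the check.

    `candidates` stands for sampled refactorings, e.g. from a code agent.
    `passes_tests` is a hypothetical correctness check supplied by the caller.
    """
    # Filter step: discard samples that break behaviour.
    valid = [c for c in candidates if passes_tests(c)]
    if not valid:
        return None
    # Rerank step: prefer the candidate with the smallest description length.
    return min(valid, key=measure)
```

Passing `measure=len` ranks candidates by raw character count instead, which shows that the selection step is independent of the particular length measure.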

Cite

@article{arxiv.2506.11058,
  title  = {Refactoring Codebases through Library Design},
  author = {Ziga Kovacic and Justin T. Chiu and Celine Lee and Wenting Zhao and Kevin Ellis},
  journal= {arXiv preprint arXiv:2506.11058},
  year   = {2025}
}

Comments

29 pages
