Software code generation using Large Language Models (LLMs) is one of the most successful applications of modern artificial intelligence. Foundation models are very effective for popular frameworks that benefit from documentation, examples, and strong community support. In contrast, specialized scientific libraries often lack these resources and may expose unstable APIs under active development, which makes code generation difficult for models trained on limited or outdated data. We address these issues for the Gammapy library by developing an agent capable of writing, executing, and validating code in a controlled environment. We present a minimal web demo and an accompanying benchmarking suite. This contribution summarizes the design, reports our current status, and outlines next steps.
@article{arxiv.2509.26110,
title = {Agent-based code generation for the Gammapy framework},
author = {Dmitriy Kostunin and Vladimir Sotnikov and Sergo Golovachev and Abhay Mehta and Tim Lukas Holch and Elisa Jones},
journal = {arXiv preprint arXiv:2509.26110},
year = {2025}
}