GBV-SQL: Guided Generation and SQL2Text Back-Translation Validation for Multi-Agent Text2SQL

Artificial Intelligence 2025-09-17 v1

Abstract

While Large Language Models have significantly advanced Text2SQL generation, a critical semantic gap persists: syntactically valid queries often misinterpret user intent. To mitigate this, we propose GBV-SQL, a novel multi-agent framework that introduces Guided Generation with SQL2Text Back-translation Validation. In this mechanism, a specialized agent translates the generated SQL back into natural language so that its logical alignment with the original question can be verified. Critically, our investigation reveals that current evaluation is undermined by a systemic issue: the poor quality of the benchmarks themselves. We introduce a formal typology for "Gold Errors", which are pervasive flaws in the ground-truth data, and demonstrate how they obscure true model performance. On the challenging BIRD benchmark, GBV-SQL achieves 63.23% execution accuracy, an absolute improvement of 5.8 percentage points. After removing flawed examples, GBV-SQL achieves 96.5% (dev) and 97.6% (test) execution accuracy on the Spider benchmark. Our work offers both a robust framework for semantic validation and a critical perspective on benchmark integrity, highlighting the need for more rigorous dataset curation.
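
The back-translation idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the agents, prompts, or retry policy, so the function name `validate_with_back_translation`, the three callables, the feedback string, and the `max_rounds` limit are all hypothetical. In a real system the callables would wrap LLM agents; here they are toy stand-ins so the loop can be run as written.

```python
from typing import Callable, Optional


def validate_with_back_translation(
    question: str,
    generate_sql: Callable[[str, Optional[str]], str],  # hypothetical Text2SQL agent
    sql_to_text: Callable[[str], str],                   # hypothetical SQL2Text agent
    is_aligned: Callable[[str, str], bool],              # hypothetical alignment judge
    max_rounds: int = 3,                                 # assumed retry budget
) -> Optional[str]:
    """Return the first SQL whose back-translation matches the question, else None."""
    feedback: Optional[str] = None
    for _ in range(max_rounds):
        sql = generate_sql(question, feedback)
        back_translation = sql_to_text(sql)
        if is_aligned(question, back_translation):
            return sql
        # Guide the next attempt with what the rejected query actually expressed.
        feedback = f"Previous SQL meant: {back_translation!r}, which does not match the question."
    return None


# Toy stand-ins for the agents (illustration only).
def toy_generate(question: str, feedback: Optional[str]) -> str:
    # First attempt misreads the intent; the guided retry fixes it.
    if feedback is None:
        return "SELECT name FROM singer"
    return "SELECT COUNT(*) FROM singer"


def toy_sql_to_text(sql: str) -> str:
    if "COUNT" in sql:
        return "How many singers are there?"
    return "List the names of all singers."


def toy_is_aligned(question: str, back_translation: str) -> bool:
    return question.strip().lower() == back_translation.strip().lower()


result = validate_with_back_translation(
    "How many singers are there?", toy_generate, toy_sql_to_text, toy_is_aligned
)
print(result)
```

With these toy agents the first query is syntactically valid but answers a different question, so it is rejected and the second attempt is returned. Exact string matching is used for the alignment check only to keep the sketch runnable; a semantic comparison would be needed in practice.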

Cite

@article{arxiv.2509.12612,
  title  = {GBV-SQL: Guided Generation and SQL2Text Back-Translation Validation for Multi-Agent Text2SQL},
  author = {Daojun Chen and Xi Wang and Shenyuan Ren and Qingzhi Ma and Pengpeng Zhao and An Liu},
  journal= {arXiv preprint arXiv:2509.12612},
  year   = {2025}
}