RB-SQL: A Retrieval-based LLM Framework for Text-to-SQL

Computation and Language 2024-07-15 v2

Abstract

Large language models (LLMs) with in-context learning have significantly improved performance on the text-to-SQL task. Previous work generally focuses on designing dedicated SQL generation prompts to improve the LLMs' reasoning ability. However, these approaches mostly struggle with large databases that contain numerous tables and columns, and they usually overlook the importance of pre-processing the database and extracting valuable information for more efficient prompt engineering. Based on this analysis, we propose RB-SQL, a novel retrieval-based LLM framework for in-context prompt engineering, which consists of three modules that retrieve concise tables and columns as the schema, and targeted examples for in-context learning. Experimental results demonstrate that our model achieves better performance than several competitive baselines on the public datasets BIRD and Spider.
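
To make the idea of retrieving a concise schema and targeted examples concrete, the following is a minimal sketch only. The abstract does not describe how RB-SQL scores relevance, so a plain token-overlap score stands in for the retriever, and every name here (`overlap`, `retrieve_schema`, `retrieve_examples`, `build_prompt`, the toy schema and examples) is a hypothetical illustration, not the authors' implementation.

```python
import re

def tokens(text):
    """Lowercase alphanumeric tokens; snake_case names split on underscores."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def overlap(question, text):
    """Stand-in relevance score (assumption): count of shared tokens."""
    return len(tokens(question) & tokens(text))

def retrieve_schema(question, schema, top_tables=2, top_columns=3):
    """Keep the best-scoring tables, then the best-scoring columns of each."""
    def table_score(item):
        name, cols = item
        return overlap(question, name + " " + " ".join(cols))
    ranked = sorted(schema.items(), key=table_score, reverse=True)[:top_tables]
    return {
        name: sorted(cols, key=lambda c: overlap(question, c), reverse=True)[:top_columns]
        for name, cols in ranked
    }

def retrieve_examples(question, examples, k=1):
    """Pick the k (question, SQL) pairs whose question is most similar."""
    return sorted(examples, key=lambda ex: overlap(question, ex[0]), reverse=True)[:k]

def build_prompt(question, schema, examples):
    """Assemble a compact in-context prompt from the retrieved pieces."""
    lines = ["Schema:"]
    lines += [f"{table}({', '.join(cols)})" for table, cols in schema.items()]
    for ex_question, ex_sql in examples:
        lines += ["", f"Question: {ex_question}", f"SQL: {ex_sql}"]
    lines += ["", f"Question: {question}", "SQL:"]
    return "\n".join(lines)

# Toy data, invented for illustration only.
SCHEMA = {
    "singer": ["singer_id", "name", "country", "age"],
    "concert": ["concert_id", "venue", "year"],
    "stadium": ["stadium_id", "capacity", "location"],
}
EXAMPLES = [
    ("How many concerts were held in 2014?",
     "SELECT COUNT(*) FROM concert WHERE year = 2014"),
    ("List the name of every singer.", "SELECT name FROM singer"),
]
QUESTION = "What is the name and country of each singer?"

pruned = retrieve_schema(QUESTION, SCHEMA, top_tables=1, top_columns=3)
shots = retrieve_examples(QUESTION, EXAMPLES, k=1)
prompt = build_prompt(QUESTION, pruned, shots)
```

The resulting `prompt` contains only the retrieved table, its highest-scoring columns, and the closest example, which is the kind of reduction the abstract motivates for large databases. A real system would presumably replace the overlap score with a learned or embedding-based retriever.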

Cite

@article{arxiv.2407.08273,
  title  = {RB-SQL: A Retrieval-based LLM Framework for Text-to-SQL},
  author = {Zhenhe Wu and Zhongqiu Li and Jie Zhang and Mengxiang Li and Yu Zhao and Ruiyu Fang and Zhongjiang He and Xuelong Li and Zhoujun Li and Shuangyong Song},
  journal= {arXiv preprint arXiv:2407.08273},
  year   = {2024}
}

Comments

Further improvement and modification are needed.