Related papers: CodeS: Towards Building Open-source Language Model…

Solid-SQL: Enhanced Schema-linking based In-context Learning for Robust Text-to-SQL

Recently, large language models (LLMs) have significantly improved the performance of text-to-SQL systems. Nevertheless, many state-of-the-art (SOTA) approaches have overlooked the critical aspect of system robustness. Our experiments…

Computation and Language · Computer Science 2024-12-18 Geling Liu, Yunzhi Tan, Ruichao Zhong, Yuanzhen Xie, Lingchen Zhao, Qian Wang, Bo Hu, Zang Li

DataGpt-SQL-7B: An Open-Source Language Model for Text-to-SQL

In addressing the pivotal role of translating natural language queries into SQL commands, we propose a suite of compact, fine-tuned models and self-refine mechanisms to democratize data access and analysis for non-expert users, mitigating…

Artificial Intelligence · Computer Science 2024-09-25 Lixia Wu, Peng Li, Junhong Lou, Lei Fu

Enhancing LLM Fine-tuning for Text-to-SQLs by SQL Quality Measurement

Text-to-SQLs enables non-expert users to effortlessly retrieve desired information from relational databases using natural language queries. While recent advancements, particularly with Large Language Models (LLMs) like GPT and T5, have…

Databases · Computer Science 2024-10-04 Shouvon Sarker, Xishuang Dong, Xiangfang Li, Lijun Qian

Can LLM Already Serve as A Database Interface? A BIg Bench for Large-Scale Database Grounded Text-to-SQLs

Text-to-SQL parsing, which aims at converting natural language instructions into executable SQLs, has gained increasing attention in recent years. In particular, Codex and ChatGPT have shown impressive results in this task. However, most of…

Computation and Language · Computer Science 2023-11-16 Jinyang Li, Binyuan Hui, Ge Qu, Jiaxi Yang, Binhua Li, Bowen Li, Bailin Wang, Bowen Qin, Rongyu Cao, Ruiying Geng, Nan Huo, Xuanhe Zhou, Chenhao Ma, Guoliang Li, Kevin C. C. Chang, Fei Huang, Reynold Cheng, Yongbin Li

Evaluating the Text-to-SQL Capabilities of Large Language Models

We perform an empirical evaluation of Text-to-SQL capabilities of the Codex language model. We find that, without any finetuning, Codex is a strong baseline on the Spider benchmark; we also analyze the failure modes of Codex in this…

Computation and Language · Computer Science 2022-04-04 Nitarshan Rajkumar, Raymond Li, Dzmitry Bahdanau

Text-to-SQL Empowered by Large Language Models: A Benchmark Evaluation

Large language models (LLMs) have emerged as a new paradigm for Text-to-SQL task. However, the absence of a systematical benchmark inhibits the development of designing effective, efficient and economic LLM-based Text-to-SQL solutions. To…

Databases · Computer Science 2023-11-21 Dawei Gao, Haibin Wang, Yaliang Li, Xiuyu Sun, Yichen Qian, Bolin Ding, Jingren Zhou

Open-SQL Framework: Enhancing Text-to-SQL on Open-source Large Language Models

Despite the success of large language models (LLMs) in Text-to-SQL tasks, open-source LLMs encounter challenges in contextual understanding and response coherence. To tackle these issues, we present Open-SQL, a systematic methodology tailored…

Computation and Language · Computer Science 2024-05-14 Xiaojun Chen, Tianle Wang, Tianhao Qiu, Jianbin Qin, Min Yang

BASE-SQL: A powerful open source Text-To-SQL baseline approach

The conversion of natural language into SQL language for querying databases (Text-to-SQL) has broad application prospects and has attracted widespread attention. At present, the mainstream Text-to-SQL methods are mainly divided into…

Computation and Language · Computer Science 2025-02-18 Lei Sheng, Shuai-Shuai Xu, Wei Xie

Spider 2.0: Evaluating Language Models on Real-World Enterprise Text-to-SQL Workflows

Real-world enterprise text-to-SQL workflows often involve complex cloud or local data across various database systems, multiple SQL queries in various dialects, and diverse operations from data transformation to analytics. We introduce…

Computation and Language · Computer Science 2025-03-18 Fangyu Lei, Jixuan Chen, Yuxiao Ye, Ruisheng Cao, Dongchan Shin, Hongjin Su, Zhaoqing Suo, Hongcheng Gao, Wenjing Hu, Pengcheng Yin, Victor Zhong, Caiming Xiong, Ruoxi Sun, Qian Liu, Sida Wang, Tao Yu

Text-to-SQL Error Correction with Language Models of Code

Despite recent progress in text-to-SQL parsing, current semantic parsers are still not accurate enough for practical use. In this paper, we investigate how to build automatic text-to-SQL error correction models. Noticing that token-level…

Computation and Language · Computer Science 2023-05-30 Ziru Chen, Shijie Chen, Michael White, Raymond Mooney, Ali Payani, Jayanth Srinivasa, Yu Su, Huan Sun

Reboost Large Language Model-based Text-to-SQL, Text-to-Python, and Text-to-Function -- with Real Applications in Traffic Domain

The previous state-of-the-art (SOTA) method achieved a remarkable execution accuracy on the Spider dataset, which is one of the largest and most diverse datasets in the Text-to-SQL domain. However, during our reproduction of the business…

Artificial Intelligence · Computer Science 2023-11-01 Guanghu Sui, Zhishuai Li, Ziyue Li, Sun Yang, Jingqing Ruan, Hangyu Mao, Rui Zhao

A Survey of Large Language Model-Based Generative AI for Text-to-SQL: Benchmarks, Applications, Use Cases, and Challenges

Text-to-SQL systems facilitate smooth interaction with databases by translating natural language queries into Structured Query Language (SQL), bridging the gap between non-technical users and complex database management systems. This survey…

Artificial Intelligence · Computer Science 2025-01-24 Aditi Singh, Akash Shetty, Abul Ehtesham, Saket Kumar, Tala Talaei Khoei

Analyzing the Effectiveness of Large Language Models on Text-to-SQL Synthesis

This study investigates various approaches to using Large Language Models (LLMs) for Text-to-SQL program synthesis, focusing on the outcomes and insights derived. Employing the popular Text-to-SQL dataset, Spider, the goal was to input a…

Artificial Intelligence · Computer Science 2024-01-24 Richard Roberson, Gowtham Kaki, Ashutosh Trivedi

Dr.Spider: A Diagnostic Evaluation Benchmark towards Text-to-SQL Robustness

Neural text-to-SQL models have achieved remarkable performance in translating natural language questions into SQL queries. However, recent studies reveal that text-to-SQL models are vulnerable to task-specific perturbations. Previous…

Computation and Language · Computer Science 2023-01-31 Shuaichen Chang, Jun Wang, Mingwen Dong, Lin Pan, Henghui Zhu, Alexander Hanbo Li, Wuwei Lan, Sheng Zhang, Jiarong Jiang, Joseph Lilien, Steve Ash, William Yang Wang, Zhiguo Wang, Vittorio Castelli, Patrick Ng, Bing Xiang

UNITE: A Unified Benchmark for Text-to-SQL Evaluation

A practical text-to-SQL system should generalize well on a wide variety of natural language questions, unseen database schemas, and novel SQL query structures. To comprehensively evaluate text-to-SQL systems, we introduce a UNIfied…

Computation and Language · Computer Science 2023-07-17 Wuwei Lan, Zhiguo Wang, Anuj Chauhan, Henghui Zhu, Alexander Li, Jiang Guo, Sheng Zhang, Chung-Wei Hang, Joseph Lilien, Yiqun Hu, Lin Pan, Mingwen Dong, Jun Wang, Jiarong Jiang, Stephen Ash, Vittorio Castelli, Patrick Ng, Bing Xiang

DIN-SQL: Decomposed In-Context Learning of Text-to-SQL with Self-Correction

There is currently a significant gap between the performance of fine-tuned models and prompting approaches using Large Language Models (LLMs) on the challenging task of text-to-SQL, as evaluated on datasets such as Spider. To improve the…

Computation and Language · Computer Science 2023-11-06 Mohammadreza Pourreza, Davood Rafiei

MSc-SQL: Multi-Sample Critiquing Small Language Models For Text-To-SQL Translation

Text-to-SQL generation enables non-experts to interact with databases via natural language. Recent advances rely on large closed-source models like GPT-4 that present challenges in accessibility, privacy, and latency. To address these…

Computation and Language · Computer Science 2025-02-18 Satya Krishna Gorti, Ilan Gofman, Zhaoyan Liu, Jiapeng Wu, Noël Vouitsis, Guangwei Yu, Jesse C. Cresswell, Rasa Hosseinzadeh

RSL-SQL: Robust Schema Linking in Text-to-SQL Generation

Text-to-SQL generation aims to translate natural language questions into SQL statements. In Text-to-SQL based on large language models, schema linking is a widely adopted strategy to streamline the input for LLMs by selecting only relevant…

Computation and Language · Computer Science 2024-11-27 Zhenbiao Cao, Yuanlei Zheng, Zhihao Fan, Xiaojin Zhang, Wei Chen, Xiang Bai

Multilingual Text-to-SQL: Benchmarking the Limits of Language Models with Collaborative Language Agents

Text-to-SQL enables natural access to databases, yet most benchmarks are English-only, limiting multilingual progress. We introduce MultiSpider 2.0, extending Spider 2.0 to eight languages (English, German, French, Spanish, Portuguese,…

Computation and Language · Computer Science 2025-09-30 Khanh Trinh Pham, Thu Huong Nguyen, Jun Jo, Quoc Viet Hung Nguyen, Thanh Tam Nguyen

SLM-SQL: An Exploration of Small Language Models for Text-to-SQL

Large language models (LLMs) have demonstrated strong performance in translating natural language questions into SQL queries (Text-to-SQL). In contrast, small language models (SLMs) ranging from 0.5B to 1.5B parameters currently…

Computation and Language · Computer Science 2025-07-31 Lei Sheng, Shuai-Shuai Xu