Related papers: T5QL: Taming language models for SQL generation

Using LLM to select the right SQL Query from candidates

Text-to-SQL models can generate a list of candidate SQL queries, and the best query is often in the candidate list, but not at the top of the list. An effective re-rank method can select the right SQL query from the candidate list and…

Computation and Language · Computer Science 2024-01-05 Zhenwen Li, Tao Xie

End-to-End Text-to-SQL with Dataset Selection: Leveraging LLMs for Adaptive Query Generation

Text-to-SQL bridges the gap between natural language and structured database language, thus allowing non-technical users to easily query databases. Traditional approaches model text-to-SQL as a direct translation task, where a given Natural…

Machine Learning · Computer Science 2025-08-12 Anurag Tripathi, Vaibhav Patle, Abhinav Jain, Ayush Pundir, Sairam Menon, Ajeet Kumar Singh, Dorien Herremans

RSL-SQL: Robust Schema Linking in Text-to-SQL Generation

Text-to-SQL generation aims to translate natural language questions into SQL statements. In Text-to-SQL based on large language models, schema linking is a widely adopted strategy to streamline the input for LLMs by selecting only relevant…

Computation and Language · Computer Science 2024-11-27 Zhenbiao Cao, Yuanlei Zheng, Zhihao Fan, Xiaojin Zhang, Wei Chen, Xiang Bai

Enhancing LLM Fine-tuning for Text-to-SQLs by SQL Quality Measurement

Text-to-SQLs enables non-expert users to effortlessly retrieve desired information from relational databases using natural language queries. While recent advancements, particularly with Large Language Models (LLMs) like GPT and T5, have…

Databases · Computer Science 2024-10-04 Shouvon Sarker, Xishuang Dong, Xiangfang Li, Lijun Qian

Towards Small Language Models for Security Query Generation in SOC Workflows

Analysts in Security Operations Centers routinely query massive telemetry streams using Kusto Query Language (KQL). Writing correct KQL requires specialized expertise, and this dependency creates a bottleneck as security teams scale. This…

Cryptography and Security · Computer Science 2026-02-27 Saleha Muzammil, Rahul Reddy, Vishal Kamalakrishnan, Hadi Ahmadi, Wajih Ul Hassan

PET-SQL: A Prompt-Enhanced Two-Round Refinement of Text-to-SQL with Cross-consistency

Recent advancements in Text-to-SQL (Text2SQL) emphasize stimulating the large language models (LLM) on in-context learning, achieving significant results. Nevertheless, they face challenges when dealing with verbose database information and…

Computation and Language · Computer Science 2024-06-04 Zhishuai Li, Xiang Wang, Jingjing Zhao, Sun Yang, Guoqing Du, Xiaoru Hu, Bin Zhang, Yuxiao Ye, Ziyue Li, Rui Zhao, Hangyu Mao

Fine-Tuning Language Models for Context-Specific SQL Query Generation

The ability to generate SQL queries from natural language has significant implications for making data accessible to non-specialists. This paper presents a novel approach to fine-tuning open-source large language models (LLMs) for the task…

Databases · Computer Science 2023-12-06 Amine Rebei

SeqGenSQL -- A Robust Sequence Generation Model for Structured Query Language

We explore using T5 (Raffel et al. (2019)) to directly translate natural language questions into SQL statements. General-purpose natural language interfaces to information stored within databases require flexibly translating natural…

Artificial Intelligence · Computer Science 2020-11-10 Ning Li, Bethany Keller, Mark Butler, Daniel Cer

Towards Optimizing SQL Generation via LLM Routing

Text-to-SQL enables users to interact with databases through natural language, simplifying access to structured data. Although highly capable large language models (LLMs) achieve strong accuracy for complex queries, they incur unnecessary…

Databases · Computer Science 2024-11-08 Mohammadhossein Malekpour, Nour Shaheen, Foutse Khomh, Amine Mhedhbi

Next-Generation Database Interfaces: A Survey of LLM-based Text-to-SQL

Generating accurate SQL from users' natural language questions (text-to-SQL) remains a long-standing challenge due to the complexities involved in user question understanding, database schema comprehension, and SQL generation. Traditional…

Computation and Language · Computer Science 2025-11-25 Zijin Hong, Zheng Yuan, Qinggang Zhang, Hao Chen, Junnan Dong, Feiran Huang, Xiao Huang

SEMA-SQL: Beyond Traditional Relational Querying with Large Language Models

Relational databases excel at structured data analysis, but real-world queries increasingly require capabilities beyond standard SQL, such as semantically matching entities across inconsistent names, extracting information not explicitly…

Databases · Computer Science 2026-05-15 Yin Lin, Tianjing Zeng, Zhongjun Ding, Rong Zhu, Bolin Ding, H. V. Jagadish, Jingren Zhou

Semantic Parsing with Syntax- and Table-Aware SQL Generation

We present a generative model to map natural language questions into SQL queries. Existing neural network based approaches typically generate a SQL query word-by-word, however, a large portion of the generated results are incorrect or not…

Computation and Language · Computer Science 2018-04-24 Yibo Sun, Duyu Tang, Nan Duan, Jianshu Ji, Guihong Cao, Xiaocheng Feng, Bing Qin, Ting Liu, Ming Zhou

Large Language Model Enhanced Text-to-SQL Generation: A Survey

Text-to-SQL translates natural language queries into Structured Query Language (SQL) commands, enabling users to interact with databases using natural language. Essentially, the text-to-SQL task is a text generation task, and its…

Databases · Computer Science 2024-10-10 Xiaohu Zhu, Qian Li, Lizhen Cui, Yongkang Liu

MSc-SQL: Multi-Sample Critiquing Small Language Models For Text-To-SQL Translation

Text-to-SQL generation enables non-experts to interact with databases via natural language. Recent advances rely on large closed-source models like GPT-4 that present challenges in accessibility, privacy, and latency. To address these…

Computation and Language · Computer Science 2025-02-18 Satya Krishna Gorti, Ilan Gofman, Zhaoyan Liu, Jiapeng Wu, Noël Vouitsis, Guangwei Yu, Jesse C. Cresswell, Rasa Hosseinzadeh

Generating Realistic Tabular Data with Large Language Models

While most generative models show achievements in image data generation, few are developed for tabular data generation. Recently, due to the success of large language models (LLMs) in diverse tasks, they have also been used for tabular data…

Machine Learning · Computer Science 2024-10-30 Dang Nguyen, Sunil Gupta, Kien Do, Thin Nguyen, Svetha Venkatesh

SLM-SQL: An Exploration of Small Language Models for Text-to-SQL

Large language models (LLMs) have demonstrated strong performance in translating natural language questions into SQL queries (Text-to-SQL). In contrast, small language models (SLMs) ranging from 0.5B to 1.5B parameters currently…

Computation and Language · Computer Science 2025-07-31 Lei Sheng, Shuai-Shuai Xu

Agentic LLMs for Question Answering over Tabular Data

Question Answering over Tabular Data (Table QA) presents unique challenges due to the diverse structure, size, and data types of real-world tables. The SemEval 2025 Task 8 (DataBench) introduced a benchmark composed of large-scale,…

Computation and Language · Computer Science 2025-09-12 Rishit Tyagi, Mohit Gupta, Rahul Bouri

Structure Guided Large Language Model for SQL Generation

Recent advancements in large language models (LLMs) have shown promise in bridging the gap between natural language queries and database management systems, enabling users to interact with databases without the background of SQL. However,…

Databases · Computer Science 2025-07-11 Qinggang Zhang, Hao Chen, Junnan Dong, Shengyuan Chen, Feiran Huang, Xiao Huang

A Multi-agent Text2SQL Framework using Small Language Models and Execution Feedback

Text2SQL, the task of generating SQL queries from natural language text, is a critical challenge in data engineering. Recently, Large Language Models (LLMs) have demonstrated superior performance for this task due to their advanced…

Databases · Computer Science 2025-12-23 Thanh Dat Hoang, Thanh Trung Huynh, Matthias Weidlich, Thanh Tam Nguyen, Tong Chen, Hongzhi Yin, Quoc Viet Hung Nguyen

Semantic Parsing for Complex Data Retrieval: Targeting Query Plans vs. SQL for No-Code Access to Relational Databases

Large Language Models (LLMs) have spurred progress in text-to-SQL, the task of generating SQL queries from natural language questions based on a given database schema. Despite the declarative nature of SQL, it continues to be a complex…

Computation and Language · Computer Science 2023-12-25 Ben Eyal, Amir Bachar, Ophir Haroche, Michael Elhadad