Related papers: DocuT5: Seq2seq SQL Generation with Table Document…

Domain Adaptation of a State of the Art Text-to-SQL Model: Lessons Learned and Challenges Found

There are many recent advanced developments for the Text-to-SQL task, where the Picard model is one of the top-performing models as measured by the Spider dataset competition. However, bringing Text-to-SQL systems to realistic use-cases…

Computation and Language · Computer Science 2023-12-12 Irene Manotas, Octavian Popescu, Ngoc Phuoc An Vo, Vadim Sheinin

Exploring Underexplored Limitations of Cross-Domain Text-to-SQL Generalization

Recently, there has been significant progress in studying neural networks for translating text descriptions into SQL queries under the zero-shot cross-domain setting. Despite achieving good performance on some public benchmarks, we observe…

Computation and Language · Computer Science 2021-09-14 Yujian Gan, Xinyun Chen, Matthew Purver

T5-SR: A Unified Seq-to-Seq Decoding Strategy for Semantic Parsing

Translating natural language queries into SQL in a seq2seq manner has attracted much attention recently. However, compared with abstract-syntax-tree-based SQL generation, seq2seq semantic parsers face many more challenges, including…

Computation and Language · Computer Science 2023-06-16 Yuntao Li, Zhenpeng Su, Yutian Li, Hanchu Zhang, Sirui Wang, Wei Wu, Yan Zhang

SeqGenSQL -- A Robust Sequence Generation Model for Structured Query Language

We explore using T5 (Raffel et al. (2019)) to directly translate natural language questions into SQL statements. General purpose natural language that interfaces to information stored within databases requires flexibly translating natural…

Artificial Intelligence · Computer Science 2020-11-10 Ning Li, Bethany Keller, Mark Butler, Daniel Cer

Clause-Wise and Recursive Decoding for Complex and Cross-Domain Text-to-SQL Generation

Most deep learning approaches for text-to-SQL generation are limited to the WikiSQL dataset, which only supports very simple queries over a single table. We focus on the Spider dataset, a complex and cross-domain text-to-SQL task, which…

Computation and Language · Computer Science 2019-08-20 Dongjun Lee

Table Caption Generation in Scholarly Documents Leveraging Pre-trained Language Models

This paper addresses the problem of generating table captions for scholarly documents, which often require additional information outside the table. To this end, we propose a method of retrieving relevant sentences from the paper body, and…

Computation and Language · Computer Science 2021-08-19 Junjie H. Xu, Kohei Shinden, Makoto P. Kato

Knowledge Distillation for Low-Resource Open-source Text-to-SQL Model

Text-to-SQL converts natural language questions into executable SQL queries, enabling non-technical users to access relational databases for analytics and intelligent data services. In real-world scenarios, performance is often constrained…

Computation and Language · Computer Science 2026-05-25 Tianhao Qiu, Xiaojun Chen

Editing-Based SQL Query Generation for Cross-Domain Context-Dependent Questions

We focus on the cross-domain context-dependent text-to-SQL generation task. Based on the observation that adjacent natural language questions are often linguistically dependent and their corresponding SQL queries tend to overlap, we utilize…

Computation and Language · Computer Science 2019-09-11 Rui Zhang, Tao Yu, He Yang Er, Sungrok Shim, Eric Xue, Xi Victoria Lin, Tianze Shi, Caiming Xiong, Richard Socher, Dragomir Radev

Towards Knowledge-Intensive Text-to-SQL Semantic Parsing with Formulaic Knowledge

In this paper, we study the problem of knowledge-intensive text-to-SQL, in which domain knowledge is necessary to parse expert questions into SQL queries over domain-specific tables. We formalize this scenario by building a new Chinese…

Computation and Language · Computer Science 2023-01-04 Longxu Dou, Yan Gao, Xuqi Liu, Mingyang Pan, Dingzirui Wang, Wanxiang Che, Dechen Zhan, Min-Yen Kan, Jian-Guang Lou

Enhancing Text-to-SQL Capabilities of Large Language Models via Domain Database Knowledge Injection

Text-to-SQL is a subtask in semantic parsing that has seen rapid progress with the evolution of Large Language Models (LLMs). However, LLMs face challenges due to hallucination issues and a lack of domain-specific database knowledge (such as…

Computation and Language · Computer Science 2025-02-26 Xingyu Ma, Xin Tian, Lingxiang Wu, Xuepeng Wang, Xueming Tang, Jinqiao Wang

SyntaxSQLNet: Syntax Tree Networks for Complex and Cross-Domain Text-to-SQL Task

Most existing studies in text-to-SQL tasks do not require generating complex SQL queries with multiple clauses or sub-queries, and generalizing to new, unseen databases. In this paper we propose SyntaxSQLNet, a syntax tree network to…

Computation and Language · Computer Science 2018-10-29 Tao Yu, Michihiro Yasunaga, Kai Yang, Rui Zhang, Dongxu Wang, Zifan Li, Dragomir Radev

Dual Reader-Parser on Hybrid Textual and Tabular Evidence for Open Domain Question Answering

The current state-of-the-art generative models for open-domain question answering (ODQA) have focused on generating direct answers from unstructured textual information. However, a large amount of the world's knowledge is stored in structured…

Computation and Language · Computer Science 2021-12-09 Alexander Hanbo Li, Patrick Ng, Peng Xu, Henghui Zhu, Zhiguo Wang, Bing Xiang

Seq2SQL: Generating Structured Queries from Natural Language using Reinforcement Learning

A significant amount of the world's knowledge is stored in relational databases. However, the ability for users to retrieve facts from a database is limited due to a lack of understanding of query languages such as SQL. We propose Seq2SQL,…

Computation and Language · Computer Science 2017-11-13 Victor Zhong, Caiming Xiong, Richard Socher

Datrics Text2SQL: A Framework for Natural Language to SQL Query Generation

Text-to-SQL systems enable users to query databases using natural language, democratizing access to data analytics. However, they face challenges in understanding ambiguous phrasing, domain-specific vocabulary, and complex schema…

Databases · Computer Science 2025-06-17 Tetiana Gladkykh, Kyrylo Kirykov

Select to Know: An Internal-External Knowledge Self-Selection Framework for Domain-Specific Question Answering

Large Language Models (LLMs) perform well in general QA but often struggle in domain-specific scenarios. Retrieval-Augmented Generation (RAG) introduces external knowledge but suffers from hallucinations and latency due to noisy retrievals.…

Computation and Language · Computer Science 2025-09-19 Bolei He, Xinran He, Run Shao, Shanfu Shu, Xianwei Xue, Mingquan Cheng, Haifeng Li, Zhenhua Ling

Table-to-text Generation by Structure-aware Seq2seq Learning

Table-to-text generation aims to generate a description for a factual table which can be viewed as a set of field-value records. To encode both the content and the structure of a table, we propose a novel structure-aware seq2seq…

Computation and Language · Computer Science 2017-11-28 Tianyu Liu, Kexiang Wang, Lei Sha, Baobao Chang, Zhifang Sui

Graphix-T5: Mixing Pre-Trained Transformers with Graph-Aware Layers for Text-to-SQL Parsing

The task of text-to-SQL parsing, which aims at converting natural language questions into executable SQL queries, has garnered increasing attention in recent years, as it can assist end users in efficiently extracting vital information from…

Computation and Language · Computer Science 2023-01-19 Jinyang Li, Binyuan Hui, Reynold Cheng, Bowen Qin, Chenhao Ma, Nan Huo, Fei Huang, Wenyu Du, Luo Si, Yongbin Li

Data Augmentation with Hierarchical SQL-to-Question Generation for Cross-domain Text-to-SQL Parsing

Data augmentation has attracted a lot of research attention in the deep learning era for its ability to alleviate data sparseness. The lack of labeled data for unseen evaluation databases is exactly the major challenge for cross-domain…

Computation and Language · Computer Science 2022-11-16 Kun Wu, Lijie Wang, Zhenghua Li, Ao Zhang, Xinyan Xiao, Hua Wu, Min Zhang, Haifeng Wang

RESDSQL: Decoupling Schema Linking and Skeleton Parsing for Text-to-SQL

One of the recent best attempts at Text-to-SQL is the pre-trained language model. Due to the structural property of the SQL queries, the seq2seq model takes the responsibility of parsing both the schema items (i.e., tables and columns) and…

Computation and Language · Computer Science 2023-04-11 Haoyang Li, Jing Zhang, Cuiping Li, Hong Chen

SQL-of-Thought: Multi-agentic Text-to-SQL with Guided Error Correction

Converting natural language queries into SQL queries is a crucial challenge in both industry and academia, aiming to increase access to databases and large-scale applications. This work examines how in-context learning and chain-of-thought…

Databases · Computer Science 2025-09-30 Saumya Chaturvedi, Aman Chadha, Laurent Bindschaedler