Related papers: SQLStructEval: Structural Evaluation of LLM Text-t…

StructEval: Benchmarking LLMs' Capabilities to Generate Structural Outputs

As Large Language Models (LLMs) become integral to software development workflows, their ability to generate structured outputs has become critically important. We introduce StructEval, a comprehensive benchmark for evaluating LLMs'…

Software Engineering · Computer Science 2026-04-06 Jialin Yang, Dongfu Jiang, Lipeng He, Sherman Siu, Yuxuan Zhang, Disen Liao, Zhuofeng Li, Huaye Zeng, Yiming Jia, Haozhe Wang, Benjamin Schneider, Chi Ruan, Wentao Ma, Zhiheng Lyu, Yifei Wang, Yi Lu, Quy Duc Do, Ziyan Jiang, Ping Nie, Wenhu Chen

Same Data, Different Schemas: Robustness of LLM-based Text-to-SQL

Large language models (LLMs) consistently achieve strong results on text-to-SQL benchmarks, but their robustness to schema variations remains poorly understood. Recent work suggests that the schema structure matters, but does not provide a…

Databases · Computer Science 2026-05-26 Nitin Kanchinadam, Aditya Menachery, Amol Deshpande

StrucText-Eval: Evaluating Large Language Model's Reasoning Ability in Structure-Rich Text

The effective utilization of structured data, integral to corporate data strategies, has been challenged by the rise of large language models (LLMs) capable of processing unstructured information. This shift prompts the question: can LLMs…

Computation and Language · Computer Science 2024-10-22 Zhouhong Gu, Haoning Ye, Xingzhou Chen, Zeyang Zhou, Hongwei Feng, Yanghua Xiao

Evaluating LLMs for Text-to-SQL Generation With Complex SQL Workload

This study presents a comparative analysis of a complex SQL benchmark, TPC-DS, with two existing text-to-SQL benchmarks, BIRD and Spider. Our findings reveal that TPC-DS queries exhibit a significantly higher level of structural…

Databases · Computer Science 2024-07-30 Limin Ma, Ken Pu, Ying Zhu

StructTest: Benchmarking LLMs' Reasoning through Compositional Structured Outputs

The rapid advancement of large language models (LLMs) demands robust, unbiased, and scalable evaluation methods. However, human annotations are costly to scale, model-based evaluations are susceptible to stylistic biases, and…

Computation and Language · Computer Science 2025-03-21 Hailin Chen, Fangkai Jiao, Mathieu Ravaut, Nawshad Farruque, Xuan Phi Nguyen, Chengwei Qin, Manan Dey, Bosheng Ding, Caiming Xiong, Shafiq Joty, Yingbo Zhou

StructText: A Synthetic Table-to-Text Approach for Benchmark Generation with Multi-Dimensional Evaluation

Extracting structured information from text, such as key-value pairs that could augment tabular data, is quite useful in many enterprise use cases. Although large language models (LLMs) have enabled numerous automated pipelines for…

Computation and Language · Computer Science 2025-07-30 Satyananda Kashyap, Sola Shirai, Nandana Mihindukulasooriya, Horst Samulowitz

SynSQL: Synthesizing Relational Databases for Robust Evaluation of Text-to-SQL Systems

Evaluating text-to-SQL systems remains largely fragile: correctness is typically judged by executing predicted and gold SQL queries on a single static database, even though the same queries may behave differently under alternative database…

Databases · Computer Science 2026-05-01 Mohammadamin Habibollah, Davood Rafiei

SQL-PaLM: Improved Large Language Model Adaptation for Text-to-SQL (extended)

Text-to-SQL, the process of translating natural language into Structured Query Language (SQL), represents a transformative application of large language models (LLMs), potentially revolutionizing how humans interact with data. This paper…

Computation and Language · Computer Science 2024-04-02 Ruoxi Sun, Sercan Ö. Arik, Alex Muzio, Lesly Miculicich, Satya Gundabathula, Pengcheng Yin, Hanjun Dai, Hootan Nakhost, Rajarishi Sinha, Zifeng Wang, Tomas Pfister

Large Language Model Enhanced Text-to-SQL Generation: A Survey

Text-to-SQL translates natural language queries into Structured Query Language (SQL) commands, enabling users to interact with databases using natural language. Essentially, the text-to-SQL task is a text generation task, and its…

Databases · Computer Science 2024-10-10 Xiaohu Zhu, Qian Li, Lizhen Cui, Yongkang Liu

SQL-to-Text Generation with Weighted-AST Few-Shot Prompting

SQL-to-Text generation aims at translating structured SQL queries into natural language descriptions, thereby facilitating comprehension of complex database operations for non-technical users. Although large language models (LLMs) have…

Databases · Computer Science 2025-11-19 Sriom Chakrabarti, Chuangtao Ma, Arijit Khan, Sebastian Link

SQLForge: Synthesizing Reliable and Diverse Data to Enhance Text-to-SQL Reasoning in LLMs

Large language models (LLMs) have demonstrated significant potential in text-to-SQL reasoning tasks, yet a substantial performance gap persists between existing open-source models and their closed-source counterparts. In this paper, we…

Computation and Language · Computer Science 2025-09-23 Yu Guo, Dong Jin, Shenghao Ye, Shuangwu Chen, Jian Yang, Xiaobin Tan

TrustSQL: Benchmarking Text-to-SQL Reliability with Penalty-Based Scoring

Text-to-SQL enables users to interact with databases using natural language, simplifying the retrieval and synthesis of information. Despite the remarkable success of large language models (LLMs) in translating natural language questions…

Artificial Intelligence · Computer Science 2024-07-03 Gyubok Lee, Woosog Chay, Seonhee Cho, Edward Choi

Text-to-SQL Empowered by Large Language Models: A Benchmark Evaluation

Large language models (LLMs) have emerged as a new paradigm for the Text-to-SQL task. However, the absence of a systematic benchmark inhibits the design of effective, efficient, and economical LLM-based Text-to-SQL solutions. To…

Databases · Computer Science 2023-11-21 Dawei Gao, Haibin Wang, Yaliang Li, Xiuyu Sun, Yichen Qian, Bolin Ding, Jingren Zhou

ErrorLLM: Modeling SQL Errors for Text-to-SQL Refinement

Despite the remarkable performance of large language models (LLMs) in text-to-SQL (SQL generation), correctly producing SQL queries remains challenging during initial generation. The SQL refinement task is subsequently introduced to correct…

Computation and Language · Computer Science 2026-03-05 Zijin Hong, Hao Chen, Zheng Yuan, Qinggang Zhang, Luyao Zhuang, Qing Liao, Feiran Huang, Yangqiu Song, Xiao Huang

Structure Guided Large Language Model for SQL Generation

Recent advancements in large language models (LLMs) have shown promise in bridging the gap between natural language queries and database management systems, enabling users to interact with databases without the background of SQL. However,…

Databases · Computer Science 2025-07-11 Qinggang Zhang, Hao Chen, Junnan Dong, Shengyuan Chen, Feiran Huang, Xiao Huang

OmniStruct: Universal Text-to-Structure Generation across Diverse Schemas

The ability of Large Language Models (LLMs) to generate structured outputs that follow arbitrary schemas is crucial to a wide range of downstream tasks that require diverse structured representations of results such as information…

Computation and Language · Computer Science 2025-11-25 James Y. Huang, Wenxuan Zhou, Nan Xu, Fei Wang, Qin Liu, Sheng Zhang, Hoifung Poon, Muhao Chen

Next-Generation Database Interfaces: A Survey of LLM-based Text-to-SQL

Generating accurate SQL from users' natural language questions (text-to-SQL) remains a long-standing challenge due to the complexities involved in user question understanding, database schema comprehension, and SQL generation. Traditional…

Computation and Language · Computer Science 2025-11-25 Zijin Hong, Zheng Yuan, Qinggang Zhang, Hao Chen, Junnan Dong, Feiran Huang, Xiao Huang

Structure-BiEval: A Self-Supervised, Dual-Track Framework for Decoupling Structure and Content in LLM Evaluation for Web Information Systems

As Large Language Models (LLMs) evolve into the core of Web-based autonomous agents and complex Web Information Systems, their ability to faithfully translate natural language into rigorous structured formats has become paramount, as this…

Computation and Language · Computer Science 2026-05-18 Boxiang Zhao, Qince Li, Zhonghao Wang, Zelin Cao, Yi Wang, Peng Cheng, Bo Lin

StructEval: Deepen and Broaden Large Language Model Assessment via Structured Evaluation

Evaluation is the baton for the development of large language models. Current evaluations typically employ a single-item assessment paradigm for each atomic test objective, which struggles to discern whether a model genuinely possesses the…

Computation and Language · Computer Science 2024-08-08 Boxi Cao, Mengjie Ren, Hongyu Lin, Xianpei Han, Feng Zhang, Junfeng Zhan, Le Sun

RSL-SQL: Robust Schema Linking in Text-to-SQL Generation

Text-to-SQL generation aims to translate natural language questions into SQL statements. In Text-to-SQL based on large language models, schema linking is a widely adopted strategy to streamline the input for LLMs by selecting only relevant…

Computation and Language · Computer Science 2024-11-27 Zhenbiao Cao, Yuanlei Zheng, Zhihao Fan, Xiaojin Zhang, Wei Chen, Xiang Bai