Related papers: Language-Integrated Recursive Queries (Full Versio…
Existing query languages for data discovery exhibit system-driven designs that emphasize database features and functionality over user needs. We propose a re-prioritization of the client through an introduction of a language-driven approach…
Traditional database queries follow a simple model: they define constraints that each tuple in the result must satisfy. This model is computationally efficient, as the database system can evaluate the query conditions on each tuple…
For decades, SQL has been the default language for composing queries, but it is increasingly used as an artifact to be read and verified rather than authored. With Large Language Models (LLMs), queries are increasingly machine-generated,…
Table learning, which lies at the intersection of machine learning and modern database systems, has recently attracted growing attention. However, existing table learning frameworks typically require explicit data export and extensive…
Language-integrated query is a powerful programming construct allowing database queries and ordinary program code to interoperate seamlessly and safely. Language-integrated query techniques rely on classical results about the nested…
The purpose of predictive modeling on relational data is to predict future or missing values in a relational database, for example, future purchases of a user, risk of readmission of the patient, or the likelihood that a financial…
Work on knowledge graphs and graph-based data management often focus either on declarative graph query languages or on frameworks for graph analytics, where there has been little work in trying to combine both approaches. However, many…
Efficient querying and analysis of large tabular datasets remain significant challenges, especially for users without expertise in programming languages like SQL. Text-to-SQL approaches have shown promising performance on benchmark data;…
Categorical Query Language is an open-source query and data integration scripting language that can be applied to common challenges in the field of computational science. We discuss how the structure-preserving nature of CQL data migrations…
The relational calculus (RC) is a concise, declarative query language. However, existing RC query evaluation approaches are inefficient and often deviate from established algorithms based on finite tables used in database management…
Translating natural language utterances to executable queries is a helpful technique in making the vast amount of data stored in relational databases accessible to a wider range of non-tech-savvy end users. Prior work in this area has…
Despite advances in large language model (LLM)-based natural language interfaces for databases, scaling to enterprise-level data catalogs remains an under-explored challenge. Prior works addressing this challenge rely on domain-specific…
Structured Query Language (SQL) remains the standard language used in Relational Database Management Systems (RDBMSs) and has found applications in healthcare (patient registries), businesses (inventories, trend analysis), military,…
We present an extension of the declarative programming language Curry to support the access to data stored in relational databases via SQL. Since Curry is statically typed, our emphasis on this SQL integration is on type safety. Our…
Text-to-SQL generation aims to translate natural language questions into SQL statements. In Text-to-SQL based on large language models, schema linking is a widely adopted strategy to streamline the input for LLMs by selecting only relevant…
Text-to-SQL enables users to interact with databases using natural language, simplifying the retrieval and synthesis of information. Despite the remarkable success of large language models (LLMs) in translating natural language questions…
A wide range of interesting program properties are intrinsically relational, i.e., they relate two or more program traces. Two prominent relational properties are secure information flow and conditional program equivalence. By showing the…
Nested relational query languages have been explored extensively, and underlie industrial language-integrated query systems such as Microsoft's LINQ. However, relational databases do not natively support nested collections in query results.…
The popularity of data science as a discipline and its importance in the emerging economy and industrial progress dictate that machine learning be democratized for the masses. This also means that the current practice of workforce training…
Recent advances in large language models (LLMs) have significantly improved the accuracy of Text-to-SQL systems. However, a critical challenge remains: the semantic mismatch between natural language questions (NLQs) and their corresponding…