Related papers: Measures in SQL
In recent years, the surge in unstructured data analysis, facilitated by advancements in Machine Learning (ML), has prompted diverse approaches for handling images, text documents, and videos. Analysts, leveraging ML models, can extract…
Data analysis often involves comparing subsets of data across many dimensions for finding unusual trends and patterns. While the comparison between subsets of data can be expressed using SQL, they tend to be complex to write, and suffer…
Within the big data tsunami, relational databases and SQL are still there and remain mandatory in most of cases for accessing data. On the one hand, SQL is easy-to-use by non specialists and allows to identify pertinent initial data at the…
Structured Query Language (SQL) has remained the standard query language for databases. SQL is highly optimized for processing structured data laid out in relations. Meanwhile, in the present application development landscape, it is highly…
Big data management aims to establish data hubs that support data in multiple models and types in an all-around way. Thus, the multi-model database system is a promising architecture for building such a multi-model data store. For an…
Semantic measures are widely used today to estimate the strength of the semantic relationship between elements of various types: units of language (e.g., words, sentences, documents), concepts or even instances semantically characterized…
The rise of large language models (LLMs) has significantly impacted various domains, including natural language processing (NLP) and image generation, by making complex computational tasks more accessible. While LLMs demonstrate impressive…
To elevate the foundational capabilities and generalization prowess of the text-to-SQL model in real-world applications, we integrate model interpretability analysis with execution-guided strategy for semantic parsing of WHERE clauses in…
We introduce SQLSpace, a human-interpretable, generalizable, compact representation for text-to-SQL examples derived with minimal human intervention. We demonstrate the utility of these representations in evaluation with three use cases:…
This paper defines a constraint-based model dedicated to multidimensional databases. The model we define represents data through a constellation of facts (subjects of analyse) associated to dimensions (axis of analyse), which are possibly…
In the context-dependent Text-to-SQL task, the generated SQL statements are refined iteratively based on the user input utterance from each interaction. The input text from each interaction can be viewed as component modifications to the…
Tests are essential in Information Retrieval and Data Mining in order to evaluate the effectiveness of a query. An automatic measure tool intended to exhibit the meaning of words in context has been developed and linked with Quantum Theory,…
This paper introduces "Semantic Scaling," a novel method for ideal point estimation from text. I leverage large language models to classify documents based on their expressed stances and extract survey-like data. I then use item response…
We present ir-measures, a new tool that makes it convenient to calculate a diverse set of evaluation measures used in information retrieval. Rather than implementing its own measure calculations, ir-measures provides a common interface to a…
To be informative, an evaluation must measure how well systems generalize to realistic unseen data. We identify limitations of and propose improvements to current evaluations of text-to-SQL systems. First, we compare human-generated and…
Structured Query Language (SQL) remains the standard language used in Relational Database Management Systems (RDBMSs) and has found applications in healthcare (patient registries), businesses (inventories, trend analysis), military,…
Computational measures of semantic similarity between geographic terms provide valuable support across geographic information retrieval, data mining, and information integration. To date, a wide variety of approaches to geo-semantic…
As information becomes increasingly sizable for organizations to maintain the challenge of organizing data still remains. More importantly, the on-going process of analysing incoming data occurs on a continual basis and organizations should…
Sentiment analysis is one of the most widely used techniques in text analysis. Recent advancements with Large Language Models have made it more accurate and accessible than ever, allowing researchers to classify text with only a plain…
SQL is the world's most popular declarative language, forming the basis of the multi-billion-dollar database industry. Although SQL has been standardized, the full standard is based on ambiguous natural language rather than formal…