Related papers: Cohort Query Processing

COHORTNEY: Non-Parametric Clustering of Event Sequences

Cohort analysis is a pervasive activity in web analytics. One divides users into groups according to specific criteria and tracks their behavior over time. Despite its extensive use, academic circles do not discuss cohort analysis to…

Machine Learning · Computer Science 2021-06-15 Vladislav Zhuzhel, Rodrigo Rivera-Castro, Nina Kaploukhaya, Liliya Mironova, Alexey Zaytsev, Evgeny Burnaev

Optimizing Relational Queries over Array-Valued Data in Columnar Systems

Modern analytical workloads increasingly combine relational data with array-valued attributes. While columnar database systems efficiently process such workloads, their ability to optimize queries that interleave relational operators with…

Databases · Computer Science 2026-04-03 Maroua Zeblah, Etienne Couritas, Sarah Chlyah, Pierre Genevès, Nils Gesbert, Nabil Layaïda

A computational model for analytic column stores

This work presents an abstract model for the computations performed by analytic column stores or columnar query processors. The model is based on circuits whose wires carry columns rather than scalar values, and whose nodes apply operators…

Databases · Computer Science 2019-11-13 Eyal Rozenberg

CohortVA: A Visual Analytic System for Interactive Exploration of Cohorts based on Historical Data

In history research, cohort analysis seeks to identify social structures and figure mobilities by studying the group-based behavior of historical figures. Prior works mainly employ automatic data mining approaches, lacking effective visual…

Human-Computer Interaction · Computer Science 2022-08-22 Wei Zhang, Jason K. Wong, Xumeng Wang, Youcheng Gong, Rongchen Zhu, Kai Liu, Zihan Yan, Siwei Tan, Huamin Qu, Siming Chen, Wei Chen

CohortNet: Empowering Cohort Discovery for Interpretable Healthcare Analytics

Cohort studies are of significant importance in the field of healthcare analysis. However, existing methods typically involve manual, labor-intensive, and expert-driven pattern definitions or rely on simplistic clustering techniques that…

Machine Learning · Computer Science 2024-06-21 Qingpeng Cai, Kaiping Zheng, H. V. Jagadish, Beng Chin Ooi, James Yip

COMPARE: Accelerating Groupwise Comparison in Relational Databases for Data Analytics

Data analysis often involves comparing subsets of data across many dimensions for finding unusual trends and patterns. While the comparison between subsets of data can be expressed using SQL, they tend to be complex to write, and suffer…

Databases · Computer Science 2021-07-28 Tarique Siddiqui, Surajit Chaudhuri, Vivek Narasayya

Leveraging Foundation Language Models (FLMs) for Automated Cohort Extraction from Large EHR Databases

A crucial step in cohort studies is to extract the required cohort from one or more study datasets. This step is time-consuming, especially when a researcher is presented with a dataset that they have not previously worked with. When the…

Machine Learning · Computer Science 2024-12-17 Purity Mugambi, Alexandra Meliou, Madalina Fiterau

Columnar Storage and List-based Processing for Graph Database Management Systems

We revisit column-oriented storage and query processing techniques in the context of contemporary graph database management systems (GDBMSs). Similar to column-oriented RDBMSs, GDBMSs support read-heavy analytical workloads that however…

Databases · Computer Science 2021-10-29 Pranjal Gupta, Amine Mhedhbi, Semih Salihoglu

Generating patient cohorts from electronic health records using two-step retrieval-augmented text-to-SQL generation

Clinical cohort definition is crucial for patient recruitment and observational studies, yet translating inclusion/exclusion criteria into SQL queries remains challenging and manual. We present an automated system utilizing large language…

Computation and Language · Computer Science 2025-10-20 Angelo Ziletti, Leonardo D'Ambrosi

Facilitating SQL Query Composition and Analysis

Formulating efficient SQL queries requires several cycles of tuning and execution, particularly for inexperienced users. We examine methods that can accelerate and improve this interaction by providing insights about SQL queries prior to…

Databases · Computer Science 2020-02-24 Zainab Zolaktaf, Mostafa Milani, Rachel Pottinger

Fast Updates on Read-Optimized Databases Using Multi-Core CPUs

Read-optimized columnar databases use differential updates to handle writes by maintaining a separate write-optimized delta partition which is periodically merged with the read-optimized and compressed main partition. This merge process…

Databases · Computer Science 2015-03-19 Jens Krueger, Changkyu Kim, Martin Grund, Nadathur Satish, David Schwalb, Jatin Chhugani, Hasso Plattner, Pradeep Dubey, Alexander Zeier

An Automated SQL Query Grading System Using An Attention-Based Convolutional Neural Network

Grading SQL queries can be a time-consuming, tedious and challenging task, especially as the number of student submissions increases. Several systems have been introduced in an attempt to mitigate these challenges, but those systems have…

Computers and Society · Computer Science 2024-06-25 Donald R. Schwartz, Pablo Rivas

Cortex: Harnessing Correlations to Boost Query Performance

Databases employ indexes to filter out irrelevant records, which reduces scan overhead and speeds up query execution. However, this optimization is only available to queries that filter on the indexed attribute. To extend these speedups to…

Databases · Computer Science 2020-12-15 Vikram Nathan, Jialin Ding, Tim Kraska, Mohammad Alizadeh

A Statistical Approach Towards Robust Progress Estimation

The need for accurate SQL progress estimation in the context of decision support administration has led to a number of techniques proposed for this task. Unfortunately, no single one of these progress estimators behaves robustly across the…

Databases · Computer Science 2012-01-04 Arnd Christian König, Bolin Ding, Surajit Chaudhuri, Vivek Narasayya

Finding a Second Wind: Speeding Up Graph Traversal Queries in RDBMSs Using Column-Oriented Processing

Recursive queries and recursive derived tables constitute an important part of the SQL standard. Their efficient processing is important for many real-life applications that rely on graph or hierarchy traversal. Position-enabled…

Databases · Computer Science 2023-08-21 Mikhail Firsov, Michael Polyntsov, Kirill Smirnov, George Chernishev

A Case for A Collaborative Query Management System

Over the past 40 years, database management systems (DBMSs) have evolved to provide a sophisticated variety of data management capabilities. At the same time, tools for managing queries over the data have remained relatively primitive. One…

Databases · Computer Science 2009-09-15 Nodira Khoussainova, Magda Balazinska, Wolfgang Gatterbauer, YongChul Kwon, Dan Suciu

An Overview of Query Processing on Crowdsourced Databases

Crowd-sourcing is a powerful solution for finding correct answers to expensive and unanswered queries in databases, including those with uncertain and incomplete data. Attempts to use crowd-sourcing to exploit human abilities to process…

Databases · Computer Science 2022-04-19 Marwa B. Swidan, Ali A. Alwan, Yonis Gulzar, Abedallah Zaid Abualkishik

Subset Queries in Relational Databases

In this paper, we motivated the need for relational database systems to support subset query processing. We defined new operators in relational algebra, and new constructs in SQL for expressing subset queries. We also illustrated the…

Databases · Computer Science 2007-05-23 Satyanarayana R Valluri, Kamalakar Karlapalem

Development of Data Evaluation Benchmark for Data Wrangling Recommendation System

CoWrangler is a data-wrangling recommender system designed to streamline data processing tasks. Recognizing that data processing is often time-consuming and complex for novice users, we aim to simplify the decision-making process regarding…

Databases · Computer Science 2024-09-18 Yuqing Wang, Anna Fariha

Processing a Trillion Cells per Mouse Click

Column-oriented database systems have been a real game changer for the industry in recent years. Highly tuned and performant systems have evolved that provide users with the possibility of answering ad hoc queries over large datasets in an…

Databases · Computer Science 2012-08-02 Alexander Hall, Olaf Bachmann, Robert Büssow, Silviu Gănceanu, Marc Nunkesser