
Related papers: Optimizing Provenance Computations

200 papers

A well-established technique for capturing database provenance as annotations on data is to instrument queries to propagate such annotations. However, even sophisticated query optimizers often fail to produce efficient execution plans for…

We study in this paper provenance information for queries with aggregation. Provenance information was studied in the context of various query languages that do not allow for aggregation, and recent work has suggested to capture provenance…

Databases · Computer Science 2015-03-17 Yael Amsterdamer, Daniel Deutch, Val Tannen

Data analytics often involves hypothetical reasoning: repeatedly modifying the data and observing the induced effect on the computation result of a data-centric application. Recent work has proposed to leverage ideas from data provenance…

Databases · Computer Science 2020-07-13 Daniel Deutch, Yuval Moskovitch, Noam Rinetzky

Data provenance has numerous applications in the context of data preparation pipelines. It can be used for debugging faulty pipelines, interpreting results, verifying fairness, and identifying data quality issues, which may affect the…

Databases · Computer Science 2025-11-06 Khalid Belhajjame, Haroun Mezrioui, Yuyan Zhao

Provenance encodes information that connects datasets, their generation workflows, and associated metadata (e.g., who executed a query, or when). As such, it is instrumental for a wide range of critical governance applications (e.g.,…

Data analytics often involves hypothetical reasoning: repeatedly modifying the data and observing the induced effect on the computation result of a data-centric application. Previous work has shown that fine-grained data provenance can help…

Databases · Computer Science 2020-07-13 Daniel Deutch, Yuval Moskovitch, Noam Rinetzky

Profile-Guided Optimization (PGO) is an excellent means to improve the performance of a compiled program. Indeed, the execution path data it provides helps the compiler to generate better code and better cacheline packing. At the time of…

Programming Languages · Computer Science 2014-11-25 Baptiste Wicht, Roberto A. Vitillo, Dehao Chen, David Levinthal

Provenance is an increasing concern due to the ongoing revolution in sharing and processing scientific data on the Web and in other computer systems. It is proposed that many computer systems will need to become provenance-aware in order to…

Programming Languages · Computer Science 2014-01-06 Umut A. Acar, Amal Ahmed, James Cheney, Roly Perera

Demand is growing for more accountability regarding the technological systems that increasingly occupy our world. However, the complexity of many of these systems - often systems-of-systems - poses accountability challenges. A key reason…

Computers and Society · Computer Science 2019-11-18 Jatinder Singh, Jennifer Cobbe, Chris Norval

Organizations of all kinds, whether public or private, profit-driven or non-profit, and across various industries and sectors, rely on dashboards for effective data visualization. However, the reliability and efficacy of these dashboards…

Human-Computer Interaction · Computer Science 2023-09-19 Johne Jarske, Jorge Rady, Lucia V. L. Filgueiras, Leandro M. Velloso, Tania L. Santos

Effective provenance tracking enhances reproducibility, governance, and data quality in array workflows. However, significant challenges arise in capturing this provenance, including: (1) rapidly evolving APIs, (2) diverse operation types,…

Databases · Computer Science 2025-06-24 Jinjin Zhao, Sanjay Krishnan

Provenance in scientific workflows is essential for understanding and reproducing processes, while in business processes, it can ensure compliance and correctness and facilitate process mining. However, the provenance of process…

Cryptography and Security · Computer Science 2025-10-08 Ludwig Stage, Mirela Riveni, Raimundas Matulevičius, Dimka Karastoyanova

Provenance plays a crucial role in scientific workflow execution, for instance by providing data for failure analysis, real-time monitoring, or statistics on resource utilization for right-sizing allocations. The workflows themselves,…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-11-12 Vasilis Bountris, Lauritz Thamsen, Ulf Leser

Data provenance, or data lineage, describes the life cycle of data. In scientific workflows on HPC systems, scientists often seek diverse provenance (e.g., origins of data products, usage patterns of datasets). Unfortunately, existing…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-08-03 Runzhou Han, Mai Zheng, Suren Byna, Houjun Tang, Bin Dong, Dong Dai, Yong Chen, Dongkyun Kim, Joseph Hassoun, David Thorsley, Matthew Wolf

Provenance is derivative journal information about the origin and activities of system data and processes. For a highly dynamic system like the cloud, provenance can be accurately detected and securely used in cloud digital forensic…

Distributed, Parallel, and Cluster Computing · Computer Science 2014-09-22 Asif Imran, Nadia Nahar, Kazi Sakib

Data provenance collects comprehensive information about the events and operations in a computer system at both application and system levels. It provides a detailed and accurate history of transactions that help delineate the data flow…

Cryptography and Security · Computer Science 2021-07-06 Md Morshed Alam, Weichao Wang

In today's data-driven ecosystems, ensuring data integrity, traceability and accountability is important. Provenance polynomials constitute a powerful formalism for tracing the origin and the derivations made to produce database query…

Databases · Computer Science 2025-08-21 Paulo Pintor, Rogério Costa, José Moreira

As the demand for large scale AI models continues to grow, the optimization of their training to balance computational efficiency, execution time, accuracy and energy consumption represents a critical multidimensional challenge. Achieving…

Machine Learning · Computer Science 2025-07-03 Gabriele Padovani, Valentine Anantharaj, Sandro Fiore

The algebraic approach for provenance tracking, originating in the semiring model of Green et al., has proven useful as an abstract way of handling metadata. Commutative Semirings were shown to be the "correct" algebraic structure for Union…

Databases · Computer Science 2020-07-13 Pierre Bourhis, Daniel Deutch, Yuval Moskovitch

Users typically interact with a database by asking queries and examining the results. We refer to the user examining the query results and asking follow-up questions as query result exploration. Our work builds on two decades of provenance…

Databases · Computer Science 2020-06-11 Murali Mani, Naveenkumar Singaraj, Zhenyan Liu