Related papers: BugDoc: Algorithms to Debug Computational Processe…

200 papers

Machine learning tasks entail the use of complex computational pipelines to reach quantitative and qualitative conclusions. If some of the activities in a pipeline produce erroneous or uninformative outputs, the pipeline may fail or produce…

Machine Learning · Computer Science 2020-02-13 Raoni Lourenço, Juliana Freire, Dennis Shasha

We study the problem of troubleshooting machine learning systems that rely on analytical pipelines of distinct components. Understanding and fixing errors that arise in such integrative systems is difficult as failures can occur at multiple…

Machine Learning · Computer Science 2016-11-28 Besmira Nushi, Ece Kamar, Eric Horvitz, Donald Kossmann

Machine learning in practice often involves complex pipelines for data cleansing, feature engineering, preprocessing, and prediction. These pipelines are composed of operators, which have to be correctly connected and whose hyperparameters…

Software Engineering · Computer Science 2023-10-03 Julian Dolby, Jason Tsay, Martin Hirzel

Data volumes and rates of research infrastructures will continue to increase in the upcoming years and impact how we interact with their final data products. Little of the processed data can be directly investigated and most of it will be…

Instrumentation and Methods for Astrophysics · Physics 2024-04-23 Michael A. C. Johnson, Hans-Rainer Klöckner, Albina Muzafarova, Kristen Lackeos, David J. Champion, Marta Dembska, Sirko Schindler, Marcus Paradies

Evaluating the computational reproducibility of data analysis pipelines has become a critical issue. It is, however, a cumbersome process for analyses that involve data from large populations of subjects, due to their computational and…

Methodology · Statistics 2018-09-28 Soudabeh Barghi, Lalet Scaria, Ali Salari, Tristan Glatard

Mapping applications onto heterogeneous platforms is a difficult challenge, even for simple application patterns such as pipeline graphs. The problem is even more complex when processors are subject to failure during the execution of the…

Distributed, Parallel, and Cluster Computing · Computer Science 2008-03-26 Anne Benoit, Veronika Rehn-Sonigo, Yves Robert

Data analysis pipelines are known to be impacted by computational conditions, presumably due to the creation and propagation of numerical errors. While this process could play a major role in the current reproducibility crisis, the precise…

Quantitative Methods · Quantitative Biology 2020-09-30 Ali Salari, Gregory Kiar, Lindsay Lewis, Alan C. Evans, Tristan Glatard

IT infrastructure is a crucial part in most of today's business operations. High availability and reliability, and short response times to outages are essential. Thus a high amount of tool support and automation in risk management is…

Artificial Intelligence · Computer Science 2015-11-19 Joerg Schoenfisch, Janno von Stulpnagel, Jens Ortmann, Christian Meilicke, Heiner Stuckenschmidt

Debugging Cyber-Physical System (CPS) models can be extremely complex. Indeed, only the detection of a failure is insufficient to know how to correct a faulty model. Faults can propagate in time and in space producing observable…

Software Engineering · Computer Science 2020-10-14 Ezio Bartocci, Niveditha Manjunath, Leonardo Mariani, Cristinel Mateis, Dejan Ničković

Context: Mining software repositories is a popular means to gain insights into a software project's evolution, monitor project health, support decisions and derive best practices. Tools supporting the mining process are commonly applied by…

Software Engineering · Computer Science 2025-11-13 Nicole Hoess, Carlos Paradis, Rick Kazman, Wolfgang Mauerer

The paper is devoted to studying the performance of a computational pipeline, the number of simultaneously executing stages of which at each time is bounded from above by a fixed number. A look at the restriction as a structural hazard…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-07-31 Ahmet A. Husainov

Data science requires time-consuming iterative manual activities. In particular, activities such as data selection, preprocessing, transformation, and mining highly depend on iterative trial-and-error processes that could be sped up…

Systematic reviews, which entail the extraction of data from large numbers of scientific documents, are an ideal avenue for the application of machine learning. They are vital to many fields of science and philanthropy, but are very…

Computation and Language · Computer Science 2020-10-12 Seraphina Goldfarb-Tarrant, Alexander Robertson, Jasmina Lazic, Theodora Tsouloufi, Louise Donnison, Karen Smyth

Data science relies on pipelines that are organized in the form of interdependent computational steps. Each step consists of various candidate algorithms that may be used for performing a particular function. Each algorithm consists of…

Computer Vision and Pattern Recognition · Computer Science 2019-03-04 Aritra Chowdhury, Malik Magdon-Ismail, Bulent Yener

Business intelligence (BI) is any knowledge derived from existing data that may be strategically applied within a business. Data mining is a technique or method for extracting BI from data using statistical data modeling. Finding…

Artificial Intelligence · Computer Science 2022-11-15 Shubham Thakar, Dhananjay Kalbande

Pipelining is a design technique for logical circuits that allows for higher throughput than circuits in which multiple computations are fed through the system one after the other. It allows for much faster computation than architectures in…

Computational Physics · Physics 2024-10-28 Ian Seet, Thomas E. Ouldridge, Jonathan P. K. Doye

Causality is a fundamental part of the scientific endeavour to understand the world. Unfortunately, causality is still taboo in much of psychology and social science. Motivated by a growing number of recommendations for the importance of…

Methodology · Statistics 2022-06-27 Matthew J. Vowels

Causal-consistent reversible debugging allows one to explore concurrent computations back and forth in order to locate the source of an error. In this setting, backward steps can be chosen freely as long as they are "causal consistent",…

Programming Languages · Computer Science 2024-06-11 Juan José González-Abril, Germán Vidal

The effectiveness of the machine learning methods for real-world tasks depends on the proper structure of the modeling pipeline. The proposed approach is aimed to automate the design of composite machine learning pipelines, which is…

We describe the integration of logical and uncertain reasoning methods to identify the likely source and location of software problems. To date, software engineers have had few tools for identifying the sources of error in complex software…

Artificial Intelligence · Computer Science 2013-03-08 Lisa J. Burnell, Eric J. Horvitz