Related papers: Problem Formulation and Fairness
Data science has employed great research efforts in developing advanced analytics, improving data models and cultivating new algorithms. However, not many authors have come across the organizational and socio-technical challenges that arise…
In response to public scrutiny of data-driven algorithms, the field of data science has adopted ethics training and principles. Although ethics can help data scientists reflect on certain normative aspects of their work, such efforts are…
Data-driven science is an emerging paradigm where scientific discoveries depend on the execution of computational AI models against rich, discipline-specific datasets. With modern machine learning frameworks, anyone can develop and execute…
Data science requires time-consuming iterative manual activities. In particular, activities such as data selection, preprocessing, transformation, and mining, highly depend on iterative trial-and-error processes that could be sped-up…
While data science has emerged as a contentious new scientific field, enormous debates and discussions have been made on it why we need data science and what makes it as a science. In reviewing hundreds of pieces of literature which include…
Classification, a heavily-studied data-driven machine learning task, drives an increasing number of prediction systems involving critical human decisions such as loan approval and criminal risk assessment. However, classifiers often…
Causal inference from observational data is the goal of many data analyses in the health and social sciences. However, academic statistics has often frowned upon data analyses with a causal objective. The introduction of the term "data…
The trustworthiness of data science systems in applied and real-world settings emerges from the resolution of specific tensions through situated, pragmatic, and ongoing forms of work. Drawing on research in CSCW, critical data studies, and…
A suite of impressive scientific discoveries have been driven by recent advances in artificial intelligence. These almost all result from training flexible algorithms to solve difficult optimization problems specified in advance by teams of…
The field of data science currently enjoys a broad definition that includes a wide array of activities which borrow from many other established fields of study. Having such a vague characterization of a field in the early stages might be…
This manuscript provides a systemic and data-centric view of what we term essential data science, as a natural ecosystem with challenges and missions stemming from the fusion of data universe with its multiple combinations of the 5D…
Data analysis requires translating higher level questions and hypotheses into computable statistical models. We present a mixed-methods study aimed at identifying the steps, considerations, and challenges involved in operationalizing…
In this paper we argue that data science is a coherent and novel approach to empirical problems that, in its most general form, does not build understanding about phenomena. Within the new type of mathematization at work in data science,…
In recent years, data science has become an indispensable part of our society. Over time, we have become reliant on this technology because of its opportunity to gain value and new insights from data in any field - business, socializing,…
Background. As artificial intelligence and AI-powered systems continue to grow, the role of data scientists has become essential in software development environments. Data scientists face challenges related to managing large volumes of data…
Data and Science has stood out in the generation of results, whether in the projects of the scientific domain or business domain. CERN Project, Scientific Institutes, companies like Walmart, Google, Apple, among others, need data to present…
Citations are the cornerstone of knowledge propagation and the primary means of assessing the quality of research, as well as directing investments in science. Science is increasingly becoming "data-intensive", where large volumes of data…
The data science revolution has led to an increased interest in the practice of data analysis. While much has been written about statistical thinking, a complementary form of thinking that appears in the practice of data analysis is design…
Society's capacity for algorithmic problem-solving has never been greater. Artificial Intelligence is now applied across more domains than ever, a consequence of powerful abstractions, abundant data, and accessible software. As capabilities…
A fundamental problem in the practice and teaching of data science is how to evaluate the quality of a given data analysis, which is different than the evaluation of the science or question underlying the data analysis. Previously, we…