Related papers: Design Principles for Data Analysis
The data revolution has led to an increased interest in the practice of data analysis. For a given problem, there can be significant or subtle differences in how a data analyst constructs or creates a data analysis, including differences in…
A challenge that data analysts face is building a data analysis that is useful for a given consumer. Previously, we defined a set of principles for describing data analyses that can be used to create a data analysis and to characterize the…
A fundamental problem in the practice and teaching of data science is how to evaluate the quality of a given data analysis, which is different from evaluating the science or question underlying the data analysis. Previously, we…
The undergraduate curriculum in statistics and data science is undergoing changes to accommodate new methods, newly interested students, and the changing role of statistics in society. Because of this, it is more important than ever that…
Data analyses are often constructed in an imperative manner, where commands representing actions taken on the data are issued sequentially. The publication of these commands, along with the data, is essential to the reproducibility of the…
Professional roles for data visualization designers are growing in popularity, and interest in relationships between the academic research and professional practice communities is gaining traction. However, despite the potential for…
The ability to make decisions based on data, with its inherent uncertainties and variability, is a complex and vital skill in the modern world. The need for such quantitative critical thinking occurs in many different contexts, and while it…
Data science is an emerging interdisciplinary field that combines elements of mathematics, statistics, computer science, and knowledge in a particular application domain for the purpose of extracting meaningful information from the…
Data-driven analysis of business processes has a long tradition in research. Recently, however, the term process mining has become the usual way to refer to data-driven process analysis. As a consequence, awareness of the many facets of…
Data visualization is becoming an increasingly popular field of design practice. Although many studies have highlighted the knowledge required for effective data visualization design, their focus has largely been on formal knowledge and…
We introduce the notion of a data-first design study which is triggered by the acquisition of real-world data instead of specific stakeholder analysis questions. We propose an adaptation of the design study methodology framework to provide…
Data science is the business of learning from data, which is traditionally the business of statistics. Data science, however, is often understood as a broader, task-driven and computationally-oriented version of statistics. Both the term…
The development and approval of new treatments generates large volumes of results, such as summaries of efficacy and safety. However, it is commonly overlooked that analyzing clinical study data also produces data in the form of results.…
A growing number of students are completing undergraduate degrees in statistics and entering the workforce as data analysts. In these positions, they are expected to understand how to utilize databases and other data warehouses, scrape data…
Causal inference from observational data is the goal of many data analyses in the health and social sciences. However, academic statistics has often frowned upon data analyses with a causal objective. The introduction of the term "data…
Data-driven methods play an increasingly important role in discovering geometric, structural, and semantic relationships between 3D shapes in collections, and applying this analysis to support intelligent modeling, editing, and…
Data analysis is a powerful tool in all experimental sciences. Statistical methods, such as sampling theory, computer technologies necessary for handling large amounts of data, skill in analysing information contained in different types of…
Decision support tools enable improved decision-making for challenging decision problems by empowering stakeholders to process, analyze, visualize, and otherwise make sense of a variety of key factors. Their intentional design is a critical…
Employees work in increasingly digital environments that enable advanced analytics. Yet, they lack oversight over the systems that process their data. That means that potential analysis errors or hidden biases are hard to uncover. Recent…
The opacity of machine learning data is a significant threat to ethical data work and intelligible systems. Previous research has addressed this issue by proposing standardized checklists to document datasets. This paper expands that field…