Related papers: Data Makes Better Data Scientists

200 papers

It is important for researchers to understand precisely how data scientists turn raw data into insights, including typical programming patterns, workflow, and methodology. This paper contributes a novel system, called DataInquirer, that…

Human-Computer Interaction · Computer Science 2024-05-29 Jinjin Zhao, Avigdor Gal, Sanjay Krishnan

By bringing together code, text, and examples, Jupyter notebooks have become one of the most popular means to produce scientific results in a productive and reproducible way. As many of the notebook authors are experts in their scientific…

Software Engineering · Computer Science 2019-06-13 Jiawei Wang, Li Li, Andreas Zeller

Interactive notebooks, such as Jupyter, have revolutionized the field of data science by providing an integrated environment for data, code, and documentation. However, their adoption by robotics researchers and model developers has been…

Computational Engineering, Finance, and Science · Computer Science 2024-05-15 Rolando Garcia

In software engineering, numerous studies have focused on the analysis of fine-grained logs, leading to significant innovations in areas such as refactoring, security, and code completion. However, no similar studies have been conducted for…

Despite the widespread adoption of computational notebooks, little is known about best practices for their usage in collaborative contexts. In this paper, we fill this gap by eliciting a catalog of best practices for collaborative data…

Human-Computer Interaction · Computer Science 2022-02-16 Luigi Quaranta, Fabio Calefato, Filippo Lanubile

Computational notebooks are intended to prioritize the needs of scientists, but little is known about how scientists interact with notebooks, what requirements drive scientists' software development processes, or what tactics scientists use…

Software Engineering · Computer Science 2025-03-18 Ruanqianqian Huang, Savitha Ravi, Michael He, Boyu Tian, Sorin Lerner, Michael Coblenz

The development of data science expertise requires tacit, process-oriented skills that are difficult to teach directly. This study addresses the resulting challenge of empirically understanding how the problem-solving processes of experts…

Computers and Society · Computer Science 2026-02-18 Manuel Valle Torre, Marcus Specht, Catharine Oertel

Background. Jupyter notebooks are one of the main tools used by data scientists. Notebooks include features (configuration scripts, markdown, images, etc.) that make them challenging to analyze compared to traditional software. As a result,…

Software Engineering · Computer Science 2025-07-28 Wenyuan Jiang, Diany Pressato, Harsh Darji, Thibaud Lutellier

Many data science students and practitioners don't see the value in making time to learn and adopt good coding practices as long as the code "works". However, code standards are an important part of modern data science practice, and they…

Computation · Statistics 2023-08-29 Randall Pruim, Maria-Cristiana Gîrjău, Nicholas J. Horton

The massive trend of integrating data-driven AI capabilities into traditional software systems is raising new, intriguing challenges. One such challenge is achieving a smooth transition from the explorative phase of Machine Learning…

Software Engineering · Computer Science 2022-05-25 Luigi Quaranta

Reproducibility of computational studies is a hallmark of scientific methodology. It enables researchers to build with confidence on the methods and findings of others, reuse and extend computational pipelines, and thereby drive scientific…

As scientific work becomes more computational and data intensive, research processes and results become more difficult to interpret and reproduce. In this poster, we show how the Jupyter notebook, a tool originally designed as a free…

Digital Libraries · Computer Science 2018-04-17 Bernadette M. Boscoe, Irene V. Pasquetto, Milena S. Golshan, Christine L. Borgman

We report a user-friendly software environment for battery data science. It is designed to streamline data management, data cleaning, and data analysis to help bridge the gap between the domain expertise of most battery scientists and the…

Systems and Control · Electrical Eng. & Systems 2022-02-04 Robert Masse, Dan Ulery, Hardik Kamdar

Notebooks provide an interactive environment for programmers to develop code, analyse data and inject interleaved visualizations in a single environment. Despite their flexibility, a major pitfall that data scientists encounter is…

Databases · Computer Science 2021-10-27 Pavle Subotić, Lazar Milikić, Milan Stojić

Machine learning developers frequently use interactive computational notebooks, such as Jupyter notebooks, to host code for data processing and model training. Jupyter notebooks provide a convenient tool for writing machine learning…

Software Engineering · Computer Science 2025-01-17 Bihui Jin, Jiayue Wang, Pengyu Nie

In this paper, we detail the integration of Python data analysis into a first-year physics laboratory course, a task accomplished without significant alterations to the existing course structure. We introduced tailored laboratory…

Physics Education · Physics 2024-05-28 Eugenio Tufino, Stefano Oss, Micol Alemani

Nowadays, numerous industries have exceptional demand for skills in data science, such as data analysis, data mining, and machine learning. The computational notebook (e.g., Jupyter Notebook) is a well-known data science tool adopted in…

In this article we describe how we successfully incorporated data analysis in Python in a first-year laboratory course without significantly altering the course structure and without overburdening students. We show how we created and used…

Physics Education · Physics 2023-09-13 Eugenio Tufino, Stefano Oss, Micol Alemani

Software developers use metrics to evaluate code quality and productivity, but these practices are still rare in programming education. This project bridges the gap by collecting real-time learning analytics from individual student and…

The transition from AI/ML models to production-ready AI-based systems is a challenge for both data scientists and software engineers. In this paper, we report the results of a workshop conducted in a consulting company to understand how…

Software Engineering · Computer Science 2021-06-01 Filippo Lanubile, Fabio Calefato, Luigi Quaranta, Maddalena Amoruso, Fabio Fumarola, Michele Filannino