Related papers: Fostering better coding practices for data scienti…
Computational methods and associated software implementations are central to every field of scientific investigation. Modern biological research, particularly within systems biology, has relied heavily on the development of software tools…
Scientists spend an increasing amount of time building and using software. However, most scientists are never taught how to do this efficiently. As a result, many are unaware of tools and practices that would allow them to write more…
Scientific software-defined as computer programs, scripts, or code used in scientific research, data analysis, modeling, or simulation-has become central to modern research. However, there is limited research on the readability and…
Like medicine, psychology, or education, data science is fundamentally an applied discipline, with most students who receive advanced degrees in the field going on to work on practical problems. Unlike these disciplines, however, data…
In the data science courses at the University of British Columbia, we define data science as the study, development and practice of reproducible and auditable processes to obtain insight from data. While reproducibility is core to our…
Coding standards and good practices are fundamental to a disciplined approach to software projects, whatever programming languages they employ. Prolog programming can benefit from such an approach, perhaps more than programming in other…
Demand for data science education is surging and traditional courses offered by statistics departments are not meeting the needs of those seeking training. This has led to a number of opinion pieces advocating for an update to the…
With the goal of identifying common practices in data science projects, this paper proposes a framework for logging and understanding incremental code executions in Jupyter notebooks. This framework aims to allow reasoning about how…
Scientific software often presents very particular requirements regarding usability, which is often completely overlooked in this setting. As computational science has emerged as its own discipline, distinct from theoretical and…
The world is becoming increasingly complex, both in terms of the rich sources of data we have access to as well as in terms of the statistical and computational methods we can use on those data. These factors create an ever-increasing risk…
Citations are the cornerstone of knowledge propagation and the primary means of assessing the quality of research, as well as directing investments in science. Science is increasingly becoming "data-intensive", where large volumes of data…
Background: Meeting the growing industry demand for Data Science requires cross-disciplinary teams that can translate machine learning research into production-ready code. Software engineering teams value adherence to coding standards as an…
A growing number of students are completing undergraduate degrees in statistics and entering the workforce as data analysts. In these positions, they are expected to understand how to utilize databases and other data warehouses, scrape data…
While AI coding tools have demonstrated potential to accelerate software development, their use in scientific computing raises critical questions about code quality and scientific validity. In this paper, we provide ten practical rules for…
Computational biology continues to spread into new fields, becoming more accessible to researchers trained in the wet lab who are eager to take advantage of growing datasets, falling costs, and novel assays that present new opportunities…
Coding is a fundamental skill required in the engineering discipline, and much work exists exploring better ways of teaching coding in the higher education context. In particular, Code Snippets (CSs) are approved to be an effective way of…
Should one teach coding in a required introductory statistics and data science class for non-major students? Many professors advise against it, considering it a distraction from the important and challenging statistical topics that need to…
Statistics students need to develop the capacity to make sense of the staggering amount of information collected in our increasingly data-centered world. Data science is an important part of modern statistics, but our introductory and…
Teaching data science presents unique challenges and opportunities that cannot be fully addressed by simply borrowing pedagogical strategies from its parent disciplines of statistics and computer science. Here, we present ten simple rules…
Statistics is running the risk of appearing irrelevant to today's undergraduate students. Today's undergraduate students are familiar with data science projects and they judge statistics against what they have seen. Statistics, especially…