Related papers: Enabling Collaborative Data Science Development wi…

200 papers

The introduction of machine learning (ML) components in software projects has created the need for software engineers to collaborate with data scientists and other specialists. While collaboration can always be challenging, ML introduces…

Software Engineering · Computer Science 2022-02-14 Nadia Nahar, Shurui Zhou, Grace Lewis, Christian Kästner

With the increased interest in computational sciences, machine learning (ML), pattern recognition (PR) and big data, governmental agencies, academia and manufacturers are overwhelmed by the constant influx of new algorithms and techniques…

Software Engineering · Computer Science 2017-07-28 André Anjos, Laurent El-Shafey, Sébastien Marcel

Developers in data science and other domains frequently use computational notebooks to create exploratory analyses and prototype models. However, they often struggle to incorporate existing software engineering tooling into these…

Human-Computer Interaction · Computer Science 2021-03-30 Micah J. Smith, Jürgen Cito, Kalyan Veeramachaneni

The recent success of machine learning (ML) has led to an explosive growth both in terms of new systems and algorithms built in industry and academia, and new applications built by an ever-growing community of data science (DS)…

Data engineering is one of the fastest-growing fields within machine learning (ML). As ML becomes more common, the appetite for data grows more ravenous. But ML requires more data than individual teams of data engineers can readily produce,…

Machine Learning · Computer Science 2021-02-24 Vijay Janapa Reddi, Greg Diamos, Pete Warden, Peter Mattson, David Kanter

Relational databases have limited support for data collaboration, where teams collaboratively curate and analyze large datasets. Inspired by software version control systems like git, we propose (a) a dataset version control system, giving…

Social science research increasingly demands data-driven insights, yet researchers often face barriers such as lack of technical expertise, inconsistent data formats, and limited access to reliable datasets. Social science research…

Databases · Computer Science 2025-12-03 Puneet Arya, Ojas Sahasrabudhe, Adwaiya Srivastav, Partha Pratim Das, Maya Ramanath

There are many science applications that require scalable task-level parallelism and support for flexible execution and coupling of ensembles of simulations. Most high-performance system software and middleware, however, are designed to…

Distributed, Parallel, and Cluster Computing · Computer Science 2016-06-29 Vivekanandan Balasubramanian, Antons Treikalis, Ole Weidner, Shantenu Jha

Incorporating Machine Learning (ML) into existing systems is a demand that has grown among several organizations. However, the development of ML-enabled systems encompasses several social and technical challenges, which must be addressed by…

Software Engineering · Computer Science 2024-07-23 Gabriel Busquim, Allysson Allex Araújo, Maria Julia Lima, Marcos Kalinowski

Everybody wants to analyse their data, but only few possess the data science expertise to do this. Motivated by this observation, we introduce a novel framework and system, VisualSynth, for human-machine collaboration in data science.…

Artificial Intelligence · Computer Science 2020-04-24 Clément Gautrais, Yann Dauxais, Stefano Teso, Samuel Kolb, Gust Verbruggen, Luc De Raedt

High-quality data has become increasingly important to software engineers in designing and implementing today's software, for example, as an input to machine-learning algorithms and visualisation- and analytics-based features. Open data -…

Software Engineering · Computer Science 2022-08-02 Johan Linåker, Per Runeson, Anneke Zuiderwijk, Amanda Brock

Data-driven science is an emerging paradigm where scientific discoveries depend on the execution of computational AI models against rich, discipline-specific datasets. With modern machine learning frameworks, anyone can develop and execute…

Machine Learning · Computer Science 2022-08-09 Seth Ockerman, John Wu, Christopher Stewart

We present the Open MatSci ML Toolkit: a flexible, self-contained, and scalable Python-based framework to apply deep learning models and methods on scientific data with a specific focus on materials science and the OpenCatalyst Dataset. Our…

Machine Learning · Computer Science 2023-09-01 Santiago Miret, Kin Long Kelvin Lee, Carmelo Gonzales, Marcel Nassar, Matthew Spellings

Data science has employed great research efforts in developing advanced analytics, improving data models and cultivating new algorithms. However, not many authors have come across the organizational and socio-technical challenges that arise…

Machine Learning · Computer Science 2022-01-17 Iñigo Martinez, Elisabeth Viles, Igor G. Olaizola

Open Science aims to foster openness and collaboration in research, leading to more significant scientific and social impact. However, practicing Open Science comes with several challenges and is currently not properly rewarded. In this…

Software Engineering · Computer Science 2024-05-21 Edson OliveiraJr, Fernanda Madeiral, Alcemir Rodrigues Santos, Christina von Flach, Sergio Soares

The advent of data-driven science in the 21st century brought about the need for well-organized structured data and associated infrastructure able to facilitate the applications of Artificial Intelligence and Machine Learning. We present an…

Databases · Computer Science 2022-03-03 Alexander Zech, Timur Bazhirov

We use the emergent field of Complex Networks to analyze the network of scientific collaborations between entities (universities, research organizations, industry-related companies, ...) which collaborate in the context of the so-called…

Data Analysis, Statistics and Probability · Physics 2009-01-23 Juan A. Almendral, Joao G. Oliveira, L. López, J. F. F. Mendes, Miguel A. F. Sanjuán

In recent years, Machine Learning (ML) components have been increasingly integrated into the core systems of organizations. Engineering such systems presents various challenges from both a theoretical and practical perspective. One of the…

Software Engineering · Computer Science 2024-02-09 Gabriel Busquim, Hugo Villamizar, Maria Julia Lima, Marcos Kalinowski

Large language models (LLMs) have achieved remarkable progress across domains and applications but face challenges such as high fine-tuning costs, inference latency, limited edge deployability, and reliability concerns. Small language…

Computation and Language · Computer Science 2025-11-06 Fali Wang, Jihai Chen, Shuhua Yang, Ali Al-Lawati, Linli Tang, Hui Liu, Suhang Wang

Data science tasks involving tabular data present complex challenges that require sophisticated problem-solving approaches. We propose AutoKaggle, a powerful and user-centric framework that assists data scientists in completing daily data…
