Related papers: Collaborative Problem Solving on a Data Platform K…
Since 2010, Kaggle has been a platform where data scientists from around the world come together to compete, collaborate, and push the boundaries of Data Science. Over these 15 years, it has grown from a purely competition-focused site into…
In recent years, rather than enclosing data within a single organization, exchanging and combining data from different domains has become an emerging practice. Many studies have discussed the economic and utility value of data and data…
Competitions play an invaluable role in the field of forecasting, as exemplified through the recent M4 competition. The competition received attention from both academics and practitioners and sparked discussions around the…
Having greater access to data leads to many benefits, from advancing science to promoting accountability in government to boosting innovation. However, merely providing data access does not make data easy to use; even when data is openly…
Research collaborations are continuously emerging, catalyzed by online platforms where people can share their code, calculations, data, and results. These virtual research platforms are innovative, community-oriented, flexible and secure as…
Machine learning competitions (MLCs) play a pivotal role in advancing artificial intelligence (AI) by fostering innovation, skill development, and practical problem-solving. This study provides a comprehensive analysis of major competition…
Data only generates value for a few organizations with expertise and resources to make data shareable, discoverable, and easy to integrate. Sharing data that is easy to discover and integrate is hard because data owners lack information…
Software developers are increasingly required to understand fundamental data science (DS) concepts. Recently, the presence of machine learning (ML) and deep learning (DL) has dramatically increased in the development of user applications,…
Data science tasks involving tabular data present complex challenges that require sophisticated problem-solving approaches. We propose AutoKaggle, a powerful and user-centric framework that assists data scientists in completing daily data…
Background: Open innovation highlights the potential benefits of external collaboration and knowledge-sharing, often exemplified through Open Source Software (OSS). The public sector has thus far mainly focused on the sharing of Open…
Collaborative learning techniques have significantly advanced in recent years, enabling private model training across multiple organizations. Despite this opportunity, firms face a dilemma when considering data sharing with competitors --…
In enterprise organizations, data-driven decision-making processes include the use of business intelligence dashboards and collaborative deliberation on communication platforms such as Slack. However, apart from those in data analyst roles,…
One of the most significant differences of M5 over previous forecasting competitions is that it was held on Kaggle, an online platform of data scientists and machine learning practitioners. Kaggle provides a gathering place, or virtual…
Data commons collate data with cloud computing infrastructure and commonly used software services, tools and applications to create biomedical resources for the large-scale management, analysis, harmonization, and sharing of biomedical…
The widespread development and adoption of open-source software have built an ecosystem for open development and collaboration. In this ecosystem, individuals and organizations collaborate to create high-quality software that can be used by…
A Data Ecosystem offers a keystone-player or alliance-driven infrastructure that enables the interaction of different stakeholders and the resolution of interoperability issues among shared data. However, despite years of research in data…
Machine learning (ML) algorithms are increasingly helping scientific communities across different disciplines and institutions to address large and diverse data problems. However, many available ML tools are…
Cloud Computing is rising fast, with its data centres growing at an unprecedented rate. However, this has come with concerns of privacy, efficiency at the expense of resilience, and environmental sustainability, because of the dependence on…
Recently, data exchange platforms have emerged in the digital economy to enable better resource allocation in a data-driven society, which requires cross-organizational data collaborations. Understanding the characteristics of the data on…
This paper studies cooperative data-sharing between competitors vying to predict a consumer's tastes. We design optimal data-sharing schemes both for when they compete only with each other, and for when they additionally compete with an…