GitHub OSS Governance File Dataset

Software Engineering 2023-04-04 v1

Abstract

Open-source software (OSS) has become a valuable resource in both industry and academia over the last few decades. Despite the innovative structures they develop to support their work, OSS projects and their communities have complex needs and face risks such as abandonment. To manage internal social dynamics and community evolution, OSS developer communities have started relying on written governance documents that assign roles and responsibilities to different community actors. To facilitate the study of the impact and effectiveness of formal governance documents on OSS projects and communities, we present a longitudinal dataset of 710 GitHub-hosted OSS projects with GOVERNANCE.MD governance files. For each project, the dataset includes all commits made to the repository, all issues and comments created on GitHub, and all revisions made to the governance file. We hope its availability will foster more research on how OSS communities govern their projects and on the impact of governance files on those communities.

Cite

@article{arxiv.2304.00460,
  title  = {GitHub OSS Governance File Dataset},
  author = {Yibo Yan and Seth Frey and Amy Zhang and Vladimir Filkov and Likang Yin},
  journal= {arXiv preprint arXiv:2304.00460},
  year   = {2023}
}

Comments

5 pages, 1 figure, 1 table, to be published in MSR 2023 Data and Tool Showcase Track