Related papers: A Tool to Extract Structured Data from GitHub

200 papers

Developers use and contribute to repositories on GitHub. Documentation present in the repositories serves as an important source by helping developers to understand, maintain and contribute to the project. Currently, documentation in a…

Software Engineering · Computer Science 2021-03-02 Akhila Sri Manasa Venigalla, Sridhar Chimalakonda

The number of open source software projects has been growing exponentially. The major online software repository host, GitHub, has accumulated tens of millions of publicly available Git version-controlled repositories. Although the research…

Software Engineering · Computer Science 2018-03-28 Vadim Markovtsev, Waren Long

GitHub is the most popular social coding platform and is widely used by developers and organizations around the world to host their open-source projects. Besides that, the platform has a web API that allows developers to collect information from…

Software Engineering · Computer Science 2025-05-27 Hudson Silva Borges, Marco Tulio Valente

GitHub is the world's largest host of source code, with more than 150M repositories. However, most of these repositories are not labeled or inadequately so, making it harder for users to find relevant projects. There have been various…

Software Engineering · Computer Science 2023-11-21 Cezar Sas, Andrea Capiluppi, Claudio Di Sipio, Juri Di Rocco, Davide Di Ruscio

Almost every Mining Software Repositories (MSR) study requires, as a first step, the selection of the subject software repositories. These repositories are usually collected from hosting services like GitHub using specific selection criteria…

Software Engineering · Computer Science 2021-03-09 Ozren Dabic, Emad Aghajani, Gabriele Bavota

Git is used as the distributed version control system for many open-source software projects. One Git-based service, GitHub, is the most common code hosting and repository service for open-source software projects. For researchers that…

Software Engineering · Computer Science 2021-01-22 Abdulkadir Şeker, Banu Diri, Halil Arslan, Mehmet Fatih Amasyalı

Software is often developed using version control software, such as Git, and hosted on centralized Web hosts, such as GitHub and GitLab. These Web-hosted software repositories are made available to users in the form of traditional HTML…

Digital Libraries · Computer Science 2025-05-22 David Calano, Michele C. Weigle, Michael L. Nelson

GitHub is the largest source code repository in the world. It provides a git-based source code management platform and also many features inspired by social networks. For example, GitHub users can show appreciation to projects by adding…

Software Engineering · Computer Science 2018-08-07 Hudson Borges, Andre Hora, Marco Tulio Valente

GitHub is the largest code hosting platform, with millions of repositories spanning multiple technologies. Despite this, little is known about the actual contents of GitHub's repositories in the wild. This paper presents an initial…

Software Engineering · Computer Science 2026-05-19 Andre Hora, João Eduardo Montandon, Diego Elias Costa

Software repository hosting services contain large amounts of open-source software, with GitHub hosting more than 100 million repositories, from new to established ones. Given this vast amount of projects, there is a pressing need for a…

Software Engineering · Computer Science 2021-03-17 Cezar Sas, Andrea Capiluppi

Open-source repositories provide a wealth of information and are increasingly being used to build artificial intelligence (AI) based systems to solve problems in software engineering. Open-source repositories could be of varying quality…

Software Engineering · Computer Science 2022-05-06 Niranjan Hasabnis

GitHub hosts millions of software repositories, enabling developers to contribute to many projects in multiple ways. Most of the information about the repositories is text-based in the form of stars, forks, commits, and so on. However,…

Software Engineering · Computer Science 2022-05-03 Akhila Sri Manasa Venigalla, Kowndinya Boyalakunta, Sridhar Chimalakonda

Besides a git-based version control system, GitHub integrates several social coding features. Particularly, GitHub users can star a repository, presumably to manifest interest or satisfaction with an open source project. However, the real…

Software Engineering · Computer Science 2019-03-20 Hudson Borges, Marco Tulio Valente

GitHub is the world's largest platform for collaborative software development, with over 100 million users. GitHub is also used extensively for open data collaboration, hosting more than 800 million open data files, totaling 142 terabytes…

Machine Learning · Computer Science 2023-06-13 Anthony Cintron Roman, Kevin Xu, Arfon Smith, Jehu Torres Vega, Caleb Robinson, Juan M Lavista Ferres

Given the vast number of repositories hosted on GitHub, project discovery and retrieval have become increasingly important for GitHub users. Repository descriptions serve as one of the first points of contact for users who are accessing a…

Software Engineering · Computer Science 2021-10-27 Jazlyn Hellman, Eunbee Jang, Christoph Treude, Chenzhun Huang, Jin L. C. Guo

GitHub projects can be easily replicated through the site's fork process or through a Git clone-push sequence. This is a problem for empirical software engineering, because it can lead to skewed results or mistrained machine learning…

Software Engineering · Computer Science 2023-12-05 Diomidis Spinellis, Zoe Kotti, Audris Mockus

GitHub has become the central online platform for much of open source, hosting most open source code repositories. With this popularity, the public digital traces of GitHub are now a valuable means to study teamwork and collaboration. In…

Computers and Society · Computer Science 2022-05-24 Milo Z. Trujillo, Laurent Hébert-Dufresne, James Bagrow

GitHub's issue reports provide developers with valuable information that is essential to the evolution of a software development project. Contributors can use these reports to perform software engineering tasks like submitting bugs,…

Software Engineering · Computer Science 2023-03-22 Nafiseh Nikeghbal, Amir Hossein Kargaran, Abbas Heydarnoori, Hinrich Schütze

In open-source software development environments, the textual, numerical, and relationship-based data generated are of interest to researchers. Various data sets are available for this data, which is frequently used in areas such as software…

Software Engineering · Computer Science 2020-10-01 Abdulkadir Şeker, Banu Diri, Halil Arslan

[Background] In large open-source software projects, development knowledge is often fragmented across multiple artefacts and contributors such that individual stakeholders are generally unaware of the full breadth of the product features.…

Software Engineering · Computer Science 2024-08-05 Tim Puhlfürß, Lloyd Montgomery, Walid Maalej