Related papers: LabelGit: A Dataset for Software Repositories Clas…

200 papers

Many platforms exploit collaborative tagging to provide their users with faster and more accurate results while searching or navigating. Tags can communicate different concepts such as the main features, technologies, functionality, and the…

Software Engineering · Computer Science 2021-06-15 Maliheh Izadi, Abbas Heydarnoori, Georgios Gousios

GitHub has become an important platform for code sharing and scientific exchange. With the massive number of repositories available, there is a pressing need for topic-based search. Even though the topic label functionality has been…

Machine Learning · Computer Science 2023-10-24 Yu Zhang, Frank F. Xu, Sha Li, Yu Meng, Xuan Wang, Qi Li, Jiawei Han

GitHub is the world's largest host of source code, with more than 150M repositories. However, most of these repositories are not labeled or inadequately so, making it harder for users to find relevant projects. There have been various…

Software Engineering · Computer Science 2023-11-21 Cezar Sas, Andrea Capiluppi, Claudio Di Sipio, Juri Di Rocco, Davide Di Ruscio

GitHub is the largest host of open source software on the Internet. This large, freely accessible database has attracted the attention of practitioners and researchers alike. But as GitHub's growth continues, it is becoming increasingly…

Software Engineering · Computer Science 2022-08-02 Francisco Zanartu, Christoph Treude, Bruno Cartaxo, Hudson Silva Borges, Pedro Moura, Markus Wagner, Gustavo Pinto

Communities on GitHub often use issue labels as a way of triaging issues by assigning them priority ratings based on how urgently they should be addressed. The labels used are determined by the repository contributors and not standardised…

Software Engineering · Computer Science 2024-05-20 James Caddy, Christoph Treude

README files play an essential role in shaping a developer's first impression of a software repository and in documenting the software project that the repository hosts. Yet, we lack a systematic understanding of the content of a typical…

Software Engineering · Computer Science 2018-07-31 Gede Artha Azriadi Prana, Christoph Treude, Ferdian Thung, Thushari Atapattu, David Lo

GitHub is the most widely used platform for software maintenance in the open-source community. Developers report issues on GitHub from time to time while facing difficulties. Having labels on those issues can help developers easily address…

Software Engineering · Computer Science 2025-07-28 Amir Hossain Raaj, Fairuz Nawer Meem, Sadia Afrin Mim

GitHub repositories consist of various detailed information about the project contributors, the number of commits and its contributors, releases, pull requests, programming languages, and issues. However, no systematic dataset of open…

Software Engineering · Computer Science 2020-12-08 Shreyansh Surana, Smit Detroja, Saurabh Tiwari

App reviews reflect various user requirements that can aid in planning maintenance tasks. Recently, proposed approaches for automatically classifying user reviews rely on machine learning algorithms. A previous study demonstrated that…

Software Engineering · Computer Science 2025-07-15 Yasaman Abedini, Abbas Heydarnoori

Developers collaboratively discuss, implement, use, and share software entities hosted on software repositories. Proper documentation plays an important role in successful software management and maintenance. Users exploit Issue Tracking…

Software Engineering · Computer Science 2021-09-29 Maliheh Izadi, Kiana Akbari, Abbas Heydarnoori

In a wave of growth, open-source projects need to modernize and change how they deal with processes, methods, and communication with their contributors. We could observe that open-source projects are constantly evolving to improve their…

Software Engineering · Computer Science 2021-10-05 Joselito Júnior, Gláucya Boechat, Ivan Machado

The number of open source software projects has been growing exponentially. The major online software repository host, GitHub, has accumulated tens of millions of publicly available Git version-controlled repositories. Although the research…

Software Engineering · Computer Science 2018-03-28 Vadim Markovtsev, Waren Long

Git is used as the distributed version control system for many open-source software projects. One Git-based service, GitHub, is the most common code hosting and repository service for open-source software projects. For researchers that…

Software Engineering · Computer Science 2021-01-22 Abdulkadir Şeker, Banu Diri, Halil Arslan, Mehmet Fatih Amasyalı

GitHub hosts millions of software repositories, facilitating developers to contribute to many projects in multiple ways. Most of the information about the repositories is text-based in the form of stars, forks, commits, and so on. However,…

Software Engineering · Computer Science 2022-05-03 Akhila Sri Manasa Venigalla, Kowndinya Boyalakunta, Sridhar Chimalakonda

Open-source repositories provide a wealth of information and are increasingly being used to build artificial intelligence (AI) based systems to solve problems in software engineering. Open-source repositories could be of varying quality…

Software Engineering · Computer Science 2022-05-06 Niranjan Hasabnis

We present stack graphs, an extension of Visser et al.'s scope graphs framework. Stack graphs power Precise Code Navigation at GitHub, allowing users to navigate name binding references both within and across repositories. Like scope…

Programming Languages · Computer Science 2023-05-10 Douglas A. Creager, Hendrik van Antwerpen

We introduce GraphSL, a new library for studying the graph source localization problem. Graph diffusion and graph source localization are inverse problems in nature: graph diffusion predicts information diffusions from information sources,…

Machine Learning · Computer Science 2024-07-30 Junxiang Wang, Liang Zhao

Relational databases have limited support for data collaboration, where teams collaboratively curate and analyze large datasets. Inspired by software version control systems like git, we propose (a) a dataset version control system, giving…

In order to understand the state and evolution of the entirety of open source software we need to get a handle on the set of distinct software projects. Most open source projects presently utilize Git, which is a distributed version…

Social and Information Networks · Computer Science 2020-04-07 Audris Mockus, Diomidis Spinellis, Zoe Kotti, Gabriel John Dusing

Modern open source software development heavily relies on the issue tracking systems to manage their feature requests, bug reports, tasks, and other similar artifacts. Together, those "issues" form a complex network with links to each…

Software Engineering · Computer Science 2021-08-11 Alexander Nicholson, Jin L. C. Guo