Related papers: How Scale Affects Structure in Java Programs

The Distribution of Program Sizes and Its Implications: An Eclipse Case Study

A large software system is often composed of many inter-related programs of different sizes. Using the public Eclipse dataset, we replicate our previous study on the distribution of program sizes. Our results confirm that the program sizes…

Software Engineering · Computer Science 2009-05-15 Hongyu Zhang, Hee Beng Kuan Tan, Michele Marchesi

An Empirical Study on Maintainable Method Size in Java

Code metrics have been widely used to estimate software maintenance effort. Metrics have generally been used to guide developer effort to reduce or avoid future maintenance burdens. Size is the simplest and most widely deployed metric. The…

Software Engineering · Computer Science 2022-05-05 Shaiful Alam Chowdhury, Gias Uddin, Reid Holmes

An Empirical Study of the Relationships between Code Readability and Software Complexity

Code readability and software complexity are important software quality metrics that impact other software metrics such as maintainability, reusability, portability and reliability. This paper presents an empirical study of the…

Software Engineering · Computer Science 2019-09-05 Duaa Alawad, Manisha Panta, Minhaz Zibran, Md Rakibul Islam

An Empirical Study on Method-Level Performance Evolution in Open-Source Java Projects

Performance is a critical quality attribute in software development, yet the impact of method-level code changes on performance evolution remains poorly understood. While developers often make intuitive assumptions about which types of…

Software Engineering · Computer Science 2025-08-12 Kaveh Shahedi, Nana Gyambrah, Heng Li, Maxime Lamothe, Foutse Khomh

Does class size matter? An in-depth assessment of the effect of class size in software defect prediction

In the past 20 years, defect prediction studies have generally acknowledged the effect of class size on software prediction performance. To quantify the relationship between object-oriented (OO) metrics and defects, modelling has to take…

Software Engineering · Computer Science 2021-06-10 Amjed Tahir, Kwabena E. Bennin, Xun Xiao, Stephen G. MacDonell

On the Nature of Code Cloning in Open-Source Java Projects

Code cloning plays a very important role in open-source software engineering. The presence of clones within a project may indicate a need for refactoring, and clones between projects are even more interesting, since code migration takes…

Software Engineering · Computer Science 2021-08-16 Yaroslav Golubev, Timofey Bryksin

Hierarchical Small Worlds in Software Architecture

In this paper, we present a complex network approach to the study of software engineering. We have found universal network patterns in a large collection of object-oriented (OO) software systems written in C++ and Java. All the systems…

Disordered Systems and Neural Networks · Physics 2009-09-29 Sergi Valverde, Ricard V. Sole

What really changes when developers intend to improve their source code: a commit-level study of static metric value and static analysis warning changes

Many software metrics are designed to measure aspects that are believed to be related to software quality. Static software metrics, e.g., size, complexity and coupling are used in defect prediction research as well as software quality…

Software Engineering · Computer Science 2022-05-31 Alexander Trautsch, Johannes Erbel, Steffen Herbold, Jens Grabowski

Metrics for Assessing The Design of Software Interfaces

Recent studies have largely investigated the detection of class design anomalies. They proposed a large set of metrics that help in detecting those anomalies and in predicting the quality of class design. While those studies and the…

Software Engineering · Computer Science 2013-02-13 Hani Abdeen, Osama Shata

Modularity Index Metrics for Java-Based Open Source Software Projects

Open Source Software (OSS) projects are gaining popularity these days, and they have become alternatives for building software systems. Despite many failures in these projects, there are some success stories with one of the identified success…

Software Engineering · Computer Science 2013-09-24 Andi Wahju Rahardjo Emanuel, Retantyo Wardoyo, Jazi Eko Istiyanto, Khabib Mustofa

An ensemble meta-estimator to predict source code testability

Unlike most other software quality attributes, testability cannot be evaluated solely based on the characteristics of the source code. The effectiveness of the test suite and the budget assigned to the test highly impact the testability of…

Software Engineering · Computer Science 2022-08-25 Morteza Zakeri-Nasrabadi, Saeed Parsa

Quantifying the Capabilities of LLMs across Scale and Precision

Scale is often attributed as one of the factors that cause an increase in the performance of LLMs, resulting in models with billion and trillion parameters. One of the limitations of such large models is the high computational requirements…

Machine Learning · Computer Science 2024-05-09 Sher Badshah, Hassan Sajjad

The Correlation among Software Complexity Metrics with Case Study

People's demand for software quality is growing, thus different scales for the software are growing fast to handle the quality of software. The software complexity metric is one of the measurements that use some of the internal…

Software Engineering · Computer Science 2014-08-21 Yahya Tashtoush, Mohammed Al-Maolegi, Bassam Arkok

Source Code Metrics for Software Defects Prediction

In current research, there are contrasting results about the applicability of software source code metrics as features for defect prediction models. The goal of the paper is to evaluate the adoption of software metrics in models for…

Software Engineering · Computer Science 2023-01-20 Dominik Arne Rebro, Bruno Rossi, Stanislav Chren

Not-Just-Scaling Laws: Towards a Better Understanding of the Downstream Impact of Language Model Design Decisions

Improvements in language model capabilities are often attributed to increasing model size or training data, but in some cases smaller models trained on curated data or with different architectural decisions can outperform larger ones…

Computation and Language · Computer Science 2026-03-03 Emmy Liu, Amanda Bertsch, Lintang Sutawika, Lindia Tjuatja, Patrick Fernandes, Lara Marinov, Michael Chen, Shreya Singhal, Carolin Lawrence, Aditi Raghunathan, Kiril Gashteovski, Graham Neubig

Making refactoring decisions in large-scale Java systems: an empirical stance

Decisions on which classes to refactor are fraught with difficulty. The problem of identifying candidate classes becomes acute when confronted with large systems comprising hundreds or thousands of classes. In this paper, we describe a…

Software Engineering · Computer Science 2007-05-23 Richard Wheeldon, Steve Counsell

Megadiff: A Dataset of 600k Java Source Code Changes Categorized by Diff Size

This paper presents Megadiff, a dataset of source code diffs. It focuses on Java, with strict inclusion criteria based on commit message and diff size. Megadiff contains 663 029 Java diffs that can be used for research on commit…

Software Engineering · Computer Science 2021-08-11 Martin Monperrus, Matias Martinez, He Ye, Fernanda Madeiral, Thomas Durieux, Zhongxing Yu

Signatures of small-world and scale-free properties in large computer programs

A large computer program is typically divided into many hundreds or even thousands of smaller units, whose logical connections define a network in a natural way. This network reflects the internal structure of the program, and defines the…

Disordered Systems and Neural Networks · Physics 2009-11-10 Alessandro P. S. de Moura, Ying-Cheng Lai, Adilson E. Motter

Empirical Evidence of Large-Scale Diversity in API Usage of Object-Oriented Software

In this paper, we study how object-oriented classes are used across thousands of software packages. We concentrate on "usage diversity", defined as the different statically observable combinations of methods called on the same object. We…

Software Engineering · Computer Science 2018-07-06 Diego Mendez, Benoit Baudry, Martin Monperrus

On the diversity and frequency of code related to mathematical formulas in real-world Java projects

In this paper, the term formula code refers to fragments of source code that implement a mathematical formula. We present empirical studies that analyze the diversity and frequency of formula code in open-source-software projects. In an…

Software Engineering · Computer Science 2020-11-30 Oliver Moseler, Felix Lemmer, Sebastian Baltes, Stephan Diehl