Related papers: Do We Need Improved Code Quality Metrics?

Software Code Quality Measurement: Implications from Metric Distributions

Software code quality is a construct with three dimensions: maintainability, reliability, and functionality. Although many firms have incorporated code quality metrics in their operations, evaluating these metrics still lacks consistent…

Software Engineering · Computer Science 2024-01-17 Siyuan Jin, Mianmian Zhang, Yekai Guo, Yuejiang He, Ziyuan Li, Bichao Chen, Bing Zhu, Yong Xia

ComplexCodeEval: A Benchmark for Evaluating Large Code Models on More Complex Code

In recent years, the application of large language models (LLMs) to code-related tasks has gained significant attention. However, existing evaluation benchmarks often focus on limited scenarios, such as code generation or completion, which…

Software Engineering · Computer Science 2024-09-17 Jia Feng, Jiachen Liu, Cuiyun Gao, Chun Yong Chong, Chaozheng Wang, Shan Gao, Xin Xia

A Pedagogical Evaluation and Discussion about the Lack of Cohesion in Method (LCOM) Metric Using Field Experiment

Chidamber and Kemerer first defined a cohesion measure for object-oriented software - the Lack of Cohesion in Methods (LCOM) metric. This paper presents a pedagogic evaluation and discussion about the LCOM metric using field data from three…

Software Engineering · Computer Science 2010-04-20 Ezekiel Okike

A Qualitative Investigation into LLM-Generated Multilingual Code Comments and Automatic Evaluation Metrics

Large Language Models are essential coding assistants, yet their training is predominantly English-centric. In this study, we evaluate the performance of code language models in non-English contexts, identifying challenges in their adoption…

Software Engineering · Computer Science 2025-05-22 Jonathan Katzy, Yongcheng Huang, Gopal-Raj Panchu, Maksym Ziemlewski, Paris Loizides, Sander Vermeulen, Arie van Deursen, Maliheh Izadi

Is Quantization a Deal-breaker? Empirical Insights from Large Code Models

The growing scale of large language models (LLMs) not only demands extensive computational resources but also raises environmental concerns due to their increasing carbon footprint. Model quantization emerges as an effective approach that…

Software Engineering · Computer Science 2025-07-15 Saima Afrin, Bowen Xu, Antonio Mastropaolo

Quality Assurance of LLM-generated Code: Addressing Non-Functional Quality Characteristics

In recent years, large language models have been widely integrated into software engineering workflows, supporting tasks like code generation. While prior evaluations focus on functional correctness, there is still a limited understanding…

Software Engineering · Computer Science 2026-04-23 Xin Sun, Daniel Ståhl, Kristian Sandahl, Christoph Kessler

COFFE: A Code Efficiency Benchmark for Code Generation

Code generation has largely improved development efficiency in the era of large language models (LLMs). With the ability to follow instructions, current LLMs can be prompted to generate code solutions given detailed descriptions in natural…

Software Engineering · Computer Science 2025-02-06 Yun Peng, Jun Wan, Yichen Li, Xiaoxue Ren

What really changes when developers intend to improve their source code: a commit-level study of static metric value and static analysis warning changes

Many software metrics are designed to measure aspects that are believed to be related to software quality. Static software metrics, e.g., size, complexity and coupling are used in defect prediction research as well as software quality…

Software Engineering · Computer Science 2022-05-31 Alexander Trautsch, Johannes Erbel, Steffen Herbold, Jens Grabowski

Assessing Consensus of Developers' Views on Code Readability

The rapid rise of Large Language Models (LLMs) has changed software development, with tools like Copilot, JetBrains AI Assistant, and others boosting developers' productivity. However, developers now spend more time reviewing code than…

Software Engineering · Computer Science 2024-07-08 Agnia Sergeyuk, Olga Lvova, Sergey Titov, Anastasiia Serova, Farid Bagirov, Timofey Bryksin

Beyond Correctness: Benchmarking Multi-dimensional Code Generation for Large Language Models

In recent years, researchers have proposed numerous benchmarks to evaluate the impressive coding capabilities of large language models (LLMs). However, current benchmarks primarily assess the accuracy of LLM-generated code, while neglecting…

Software Engineering · Computer Science 2024-10-10 Jiasheng Zheng, Boxi Cao, Zhengzhao Ma, Ruotong Pan, Hongyu Lin, Yaojie Lu, Xianpei Han, Le Sun

An Empirical Study on the Impact of Code Duplication-aware Refactoring Practices on Quality Metrics

Context: Code refactoring is widely recognized as an essential software engineering practice that improves the understandability and maintainability of source code. Several studies attempted to detect refactoring activities through mining…

Software Engineering · Computer Science 2025-02-07 Eman Abdullah AlOmar

Identifying Inaccurate Descriptions in LLM-generated Code Comments via Test Execution

Software comments are critical for human understanding of software, and as such many comment generation techniques have been proposed. However, we find that a systematic evaluation of the factual accuracy of generated comments is rare; only…

Software Engineering · Computer Science 2024-06-24 Sungmin Kang, Louis Milliken, Shin Yoo

Towards Resolving Software Quality-in-Use Measurement Challenges

Software quality-in-use comprehends the quality from user's perspectives. It has gained its importance in e-learning applications, mobile service based applications and project management tools. User's decisions on software acquisitions are…

Software Engineering · Computer Science 2015-02-02 Issa Atoum, Chih How Bong, Narayanan Kulathuramaiyer

Frustrated with Code Quality Issues? LLMs can Help!

As software projects progress, quality of code assumes paramount importance as it affects reliability, maintainability and security of software. For this reason, static analysis tools are used in developer workflows to flag code quality…

Artificial Intelligence · Computer Science 2023-09-25 Nalin Wadhwa, Jui Pradhan, Atharv Sonwane, Surya Prakash Sahu, Nagarajan Natarajan, Aditya Kanade, Suresh Parthasarathy, Sriram Rajamani

Evaluation of Software Product Quality Metrics

Computing devices and associated software govern everyday life, and form the backbone of safety critical systems in banking, healthcare, automotive and other fields. Increasing system complexity, quickly evolving technologies and paradigm…

Software Engineering · Computer Science 2020-09-04 Arthur-Jozsef Molnar, Alexandra Neamţu, Simona Motogna

The Fault in our Stars: Quality Assessment of Code Generation Benchmarks

Large Language Models (LLMs) are gaining popularity among software engineers. A crucial aspect of developing effective code generation LLMs is to evaluate these models using a robust benchmark. Evaluation benchmarks with quality issues can…

Software Engineering · Computer Science 2024-09-05 Mohammed Latif Siddiq, Simantika Dristi, Joy Saha, Joanna C. S. Santos

Rigor, Reliability, and Reproducibility Matter: A Decade-Scale Survey of 572 Code Benchmarks

Code-related benchmarks play a critical role in evaluating large language models (LLMs), yet their quality fundamentally shapes how the community interprets model capabilities. In the past few years, awareness of benchmark quality has…

Software Engineering · Computer Science 2026-02-10 Jialun Cao, Yuk-Kit Chan, Zixuan Ling, Wenxuan Wang, Shuqing Li, Mingwei Liu, Ruixi Qiao, Yuting Han, Chaozheng Wang, Boxi Yu, Pinjia He, Shuai Wang, Zibin Zheng, Michael R. Lyu, Shing-Chi Cheung

On Iterative Evaluation and Enhancement of Code Quality Using GPT-4o

This paper introduces CodeQUEST, a novel framework leveraging Large Language Models (LLMs) to iteratively evaluate and enhance code quality across multiple dimensions, including readability, maintainability, efficiency, and security. The…

Software Engineering · Computer Science 2025-02-12 Rundong Liu, Andre Frade, Amal Vaidya, Maxime Labonne, Marcus Kaiser, Bismayan Chakrabarti, Jonathan Budd, Sean Moran

Source Code Metrics for Software Defects Prediction

In current research, there are contrasting results about the applicability of software source code metrics as features for defect prediction models. The goal of the paper is to evaluate the adoption of software metrics in models for…

Software Engineering · Computer Science 2023-01-20 Dominik Arne Rebro, Bruno Rossi, Stanislav Chren

Rethinking Code Complexity Through the Lens of Large Language Models

Code complexity metrics such as cyclomatic complexity have long been used to assess software quality and maintainability. With the rapid advancement of large language models (LLMs) on coding tasks, an important yet underexplored question…

Software Engineering · Computer Science 2026-05-28 Chen Xie, Xiaodong Gu, Yuling Shi, Beijun Shen