Related papers: Optimizing Deep Learning Models to Address Class I…
Code comments provide important information for understanding the source code. They can help developers understand the overall purpose of a function or class, as well as identify bugs and technical debt. However, an overabundance of…
Code comments are a key resource for information about software artefacts. Depending on the use case, only some types of comments are useful. Thus, automatic approaches to classify these comments have been proposed. In this work, we address…
Class-imbalance is one of the major challenges in real world datasets, where a few classes (called majority classes) constitute much more data samples than the rest (called minority classes). Learning deep neural networks using such…
Imbalanced datasets pose a considerable challenge in training deep learning (DL) models for medical diagnostics, particularly for segmentation tasks. Imbalance may be associated with annotation quality limited annotated datasets, rare…
Deep learning algorithms can fare poorly when the training dataset suffers from heavy class-imbalance but the testing criterion requires good generalization on less frequent classes. We design two novel methods to improve performance in…
Small class-imbalanced datasets, common in many high-level semantic tasks like discourse analysis, present a particular challenge to current deep-learning architectures. In this work, we perform an extensive analysis on sentence-level…
Training of deep neural networks heavily depends on the data distribution. In particular, the networks easily suffer from class imbalance. The trained networks would recognize the frequent classes better than the infrequent classes. To…
Vulnerability detection is crucial to protect software security. Nowadays, deep learning (DL) is the most promising technique to automate this detection task, leveraging its superior ability to extract patterns and representations within…
Class imbalance, which is also called long-tailed distribution, is a common problem in classification tasks based on machine learning. If it happens, the minority data will be overwhelmed by the majority, which presents quite a challenge…
In this paper, we propose a balancing training method to address problems in imbalanced data learning. To this end, we derive a new loss used in the balancing training phase that alleviates the influence of samples that cause an overfitted…
Imbalanced datasets are commonplace in modern machine learning problems. The presence of under-represented classes or groups with sensitive attributes results in concerns about generalization and fairness. Such concerns are further…
Code comments serve a crucial role in software development for documenting functionality, clarifying design choices, and assisting with issue tracking. They capture developers' insights about the surrounding source code, serving as an…
Learning classifiers using skewed or imbalanced datasets can occasionally lead to classification issues; this is a serious issue. In some cases, one class contains the majority of examples while the other, which is frequently the more…
Many natural language processing (NLP) tasks are naturally imbalanced, as some target categories occur much more frequently than others in the real world. In such scenarios, current NLP models still tend to perform poorly on less frequent…
With the ever increasing data deluge and the success of deep neural networks, the research of distributed deep learning has become pronounced. Two common approaches to achieve this distributed learning is synchronous and asynchronous weight…
Improving the classification of multi-class imbalanced data is more difficult than its two-class counterpart. In this paper, we use deep neural networks to train new representations of tabular multi-class data. Unlike the typically…
Code review is a crucial practice in software development. As code review nowadays is lightweight, various issues can be identified, and sometimes, they can be trivial. Research has investigated automated approaches to classify review…
Class imbalance is a fundamental problem in computer vision applications such as semantic segmentation. Specifically, uneven class distributions in a training dataset often result in unsatisfactory performance on under-represented classes.…
Natural language comments convey key aspects of source code such as implementation, usage, and pre- and post-conditions. Failure to update comments accordingly when the corresponding code is modified introduces inconsistencies, which is…
Class imbalance in binary classification tasks remains a significant challenge in machine learning, often resulting in poor performance on minority classes. This study comprehensively evaluates three widely-used strategies for handling…