Related papers: Automatic Bug Triage using Semi-Supervised Text Cl…

Towards Training Set Reduction for Bug Triage

Bug triage is an important step in the process of bug fixing. The goal of bug triage is to assign a new-coming bug to the correct potential developer. The existing bug triage approaches are based on machine learning algorithms, which build…

Software Engineering · Computer Science 2017-03-14 Weiqin Zou, Yan Hu, Jifeng Xuan, He Jiang

A cost-reducing partial labeling estimator in text classification problem

We propose a new approach to address the text classification problems when learning with partial labels is beneficial. Instead of offering each training sample a set of candidate labels, we assign negative-oriented labels to the ambiguous…

Machine Learning · Statistics 2019-06-11 Jiangning Chen, Zhibo Dai, Juntao Duan, Qianli Hu, Ruilin Li, Heinrich Matzinger, Ionel Popescu, Haoyan Zhai

Automatic Classification of Bug Reports Based on Multiple Text Information and Reports' Intention

With the rapid growth of software scale and complexity, a large number of bug reports are submitted to the bug tracking system. In order to speed up defect repair, these reports need to be accurately classified so that they can be sent to…

Software Engineering · Computer Science 2022-08-03 Fanqi Meng, Xuesong Wang, Jingdong Wang, Peifang Wang

Reliable Semi-Supervised Learning when Labels are Missing at Random

Semi-supervised learning methods are motivated by the availability of large datasets with unlabeled features in addition to labeled data. Unlabeled data is, however, not guaranteed to improve classification performance and has in fact been…

Machine Learning · Statistics 2019-10-25 Xiuming Liu, Dave Zachariah, Johan Wågberg, Thomas B. Schön

Semi-supervised Learning with Sparse Autoencoders in Phone Classification

We propose the application of a semi-supervised learning method to improve the performance of acoustic modelling for automatic speech recognition based on deep neural networks. As opposed to unsupervised initialisation followed by…

Machine Learning · Statistics 2016-10-04 Akash Kumar Dhaka, Giampiero Salvi

Semi-supervised Classification for Natural Language Processing

Semi-supervised classification is an interesting idea where classification models are learned from both labeled and unlabeled data. It has several advantages over supervised classification in natural language processing domain. For…

Computation and Language · Computer Science 2014-09-29 Rushdi Shams

Document Classification Using Expectation Maximization with Semi Supervised Learning

As the number of online documents increases, the demand for document classification to aid the analysis and management of documents is increasing. Text is cheap, but information, in the form of knowing what classes a document belongs to, is…

Information Retrieval · Computer Science 2011-12-12 Bhawna Nigam, Poorvi Ahirwal, Sonal Salve, Swati Vamney

Semi-Supervised Classification Based on Classification from Positive and Unlabeled Data

Most of the semi-supervised classification methods developed so far use unlabeled data for regularization purposes under particular distributional assumptions such as the cluster assumption. In contrast, recently developed methods of…

Machine Learning · Computer Science 2017-06-19 Tomoya Sakai, Marthinus Christoffel du Plessis, Gang Niu, Masashi Sugiyama

Semi-supervised Classification: Cluster and Label Approach using Particle Swarm Optimization

Classification predicts classes of objects using the knowledge learned during the training phase. This process requires learning from labeled samples. However, labeled samples are usually limited. The annotation process is annoying, tedious,…

Machine Learning · Computer Science 2017-06-06 Shahira Shaaban Azab, Mohamed Farouk Abdel Hady, Hesham Ahmed Hefny

Predicting Bugs' Components via Mining Bug Reports

The number of bug reports in complex software increases dramatically. Bugs are currently triaged manually, and bug triage or assignment is a labor-intensive and time-consuming task. Without knowledge about the structure of the software, testers often…

Software Engineering · Computer Science 2012-06-07 Deqing Wang, Hui Zhang, Rui Liu, Mengxiang Lin, Wenjun Wu, Hongping Hu

Semi-supervised Wrapper Feature Selection by Modeling Imperfect Labels

In this paper, we propose a new wrapper feature selection approach with partially labeled training examples where unlabeled observations are pseudo-labeled using the predictions of an initial classifier trained on the labeled training set.…

Machine Learning · Computer Science 2020-03-11 Vasilii Feofanov, Emilie Devijver, Massih-Reza Amini

Likelihood-based semi-supervised model selection with applications to speech processing

In conventional supervised pattern recognition tasks, model selection is typically accomplished by minimizing the classification error rate on a set of so-called development data, subject to ground-truth labeling by human experts or some…

Machine Learning · Statistics 2011-08-25 Christopher M. White, Sanjeev P. Khudanpur, Patrick J. Wolfe

Unsupervised Label Refinement Improves Dataless Text Classification

Dataless text classification is capable of classifying documents into previously unseen labels by assigning a score to any document paired with a label description. While promising, it crucially relies on accurate descriptions of the label…

Computation and Language · Computer Science 2020-12-09 Zewei Chu, Karl Stratos, Kevin Gimpel

How to Achieve High Classification Accuracy with Just a Few Labels: A Semi-supervised Approach Using Sampled Packets

Network traffic classification, which has numerous applications from security to billing and network provisioning, has become a cornerstone of today's computer networks. Previous studies have developed traffic classification techniques…

Networking and Internet Architecture · Computer Science 2020-05-19 Shahbaz Rezaei, Xin Liu

Improved Naive Bayes with Mislabeled Data

Labeling mistakes are frequently encountered in real-world applications. If not treated well, the labeling mistakes can deteriorate the classification performances of a model seriously. To address this issue, we propose an improved Naive…

Machine Learning · Computer Science 2023-04-14 Qianhan Zeng, Yingqiu Zhu, Xuening Zhu, Feifei Wang, Weichen Zhao, Shuning Sun, Meng Su, Hansheng Wang

Semi-Supervised Text Classification via Self-Pretraining

We present a neural semi-supervised learning model termed Self-Pretraining. Our model is inspired by the classic self-training algorithm. However, as opposed to self-training, Self-Pretraining is threshold-free, it can potentially update…

Computation and Language · Computer Science 2021-10-01 Payam Karisani, Negin Karisani

Deep Positive-Unlabeled Anomaly Detection for Contaminated Unlabeled Data

Semi-supervised anomaly detection, which aims to improve the anomaly detection performance by using a small amount of labeled anomaly data in addition to unlabeled data, has attracted attention. Existing semi-supervised approaches assume…

Machine Learning · Statistics 2025-02-11 Hiroshi Takahashi, Tomoharu Iwata, Atsutoshi Kumagai, Yuuki Yamanaka

Self-Contrastive Learning based Semi-Supervised Radio Modulation Classification

This paper presents a semi-supervised learning framework that is new in being designed for automatic modulation classification (AMC). By carefully utilizing unlabeled signal data with a self-supervised contrastive-learning pre-training…

Machine Learning · Computer Science 2022-03-31 Dongxin Liu, Peng Wang, Tianshi Wang, Tarek Abdelzaher

Semi-supervised Predictive Clustering Trees for (Hierarchical) Multi-label Classification

Semi-supervised learning (SSL) is a common approach to learning predictive models using not only labeled examples, but also unlabeled examples. While SSL for the simple tasks of classification and regression has received a lot of attention…

Machine Learning · Computer Science 2024-04-02 Jurica Levatić, Michelangelo Ceci, Dragi Kocev, Sašo Džeroski

Bayesian Semi-supervised Multi-category Classification under Nonparanormality

Semi-supervised learning is a model training method that uses both labeled and unlabeled data. This paper proposes a fully Bayes semi-supervised learning algorithm that can be applied to any multi-category classification problem. We assume…

Machine Learning · Statistics 2024-07-22 Rui Zhu, Shuvrarghya Ghosh, Subhashis Ghosal