Related papers: Boosting Classifiers with Noisy Inference
Boosting is known to be sensitive to label noise. We studied two approaches to improve AdaBoost's robustness against labelling errors. One is to employ a label-noise robust classifier as a base learner, while the other is to modify the…
In boosting, we aim to leverage multiple weak learners to produce a strong learner. At the center of this paradigm lies the concept of building the strong learner as a voting classifier, which outputs a weighted majority vote of the weak…
Boosting is a general method of generating many simple classification rules and combining them into a single, highly accurate rule. In this talk, I will review the AdaBoost boosting algorithm and some of its underlying theory, and then look…
Well-known for its simplicity and effectiveness in classification, AdaBoost, however, suffers from overfitting when class-conditional distributions have significant overlap. Moreover, it is very sensitive to noise that appears in the…
Boosting is an extremely successful idea, allowing one to combine multiple low accuracy classifiers into a much more accurate voting classifier. In this work, we present a new and surprisingly simple Boosting algorithm that obtains a…
Gradient boosting remains a strong and widely used method for tabular data learning, but its performance often degrades when training labels are noisy. This behavior is largely related to the way boosting algorithms emphasize samples with…
Boosting is a key method in statistical learning, allowing for converting weak learners into strong ones. While well studied in the realizable case, the statistical properties of weak-to-strong learning remain less understood in the…
Given a learning task where the data is distributed among several parties, communication is one of the fundamental resources which the parties would like to minimize. We present a distributed boosting algorithm which is resilient to a…
A lot of approaches, each following a different strategy, have been proposed in the literature to provide AdaBoost with cost-sensitive properties. In the first part of this series of two papers, we have presented these algorithms in a…
Boosting algorithms have been widely used to tackle a plethora of problems. In the last few years, a lot of approaches have been proposed to provide standard AdaBoost with cost-sensitive capabilities, each with a different focus. However,…
We explore contemporary robust classification algorithms for overcoming class-dependant labelling noise: Forward, Importance Re-weighting and T-revision. The classifiers are trained and evaluated on class-conditional random label noise data…
Vote-boosting is a sequential ensemble learning method in which the individual classifiers are built on different weighted versions of the training data. To build a new classifier, the weight of each training instance is determined in terms…
Suppose we have a weak learning algorithm $\mathcal{A}$ for a Boolean-valued problem: $\mathcal{A}$ produces hypotheses whose bias $\gamma$ is small, only slightly better than random guessing (this could, for instance, be due to…
We offer a novel view of AdaBoost in a statistical setting. We propose a Bayesian model for binary classification in which label noise is modeled hierarchically. Using variational inference to optimize a dynamic evidence lower bound, we…
Large datasets in NLP suffer from noisy labels, due to erroneous automatic and human annotation procedures. We study the problem of text classification with label noise, and aim to capture this noise through an auxiliary noise model over…
Noisy labels in large E-commerce product data (i.e., product items are placed into incorrect categories) are a critical issue for product categorization task because they are unavoidable, non-trivial to remove and degrade prediction…
The principle of boosting in supervised learning involves combining multiple weak classifiers to obtain a stronger classifier. AdaBoost has the reputation to be a perfect example of this approach. This study analyzes the (two classes)…
We consider the problem of learning from distributed data in the agnostic setting, i.e., in the presence of arbitrary forms of noise. Our main contribution is a general distributed boosting-based procedure for learning an arbitrary concept…
AdaBoost is a classic boosting algorithm for combining multiple inaccurate classifiers produced by a weak learner, to produce a strong learner with arbitrarily high accuracy when given enough training data. Determining the optimal number of…
Manually labeled corpora are expensive to create and often not available for low-resource languages or domains. Automatic labeling approaches are an alternative way to obtain labeled data in a quicker and cheaper way. However, these labels…