Related papers: Active Data Discovery: Mining Unknown Data using S…
During recent years, active learning has evolved into a popular paradigm for utilizing user's feedback to improve accuracy of learning algorithms. Active learning works by selecting the most informative sample among unlabeled data and…
The great success that deep models have achieved in the past is mainly owed to large amounts of labeled training data. However, the acquisition of labeled data for new tasks aside from existing benchmarks is both challenging and costly.…
Active learning has proven to be useful for minimizing labeling costs by selecting the most informative samples. However, existing active learning methods do not work well in realistic scenarios such as imbalance or rare classes,…
Many active learning methods belong to the retraining-based approaches, which select one unlabeled instance, add it to the training set with its possible labels, retrain the classification model, and evaluate the criteria that we base our…
State-of-the-art machine learning models require access to significant amount of annotated data in order to achieve the desired level of performance. While unlabelled data can be largely available and even abundant, annotation process can…
In critical machine learning applications, ensuring fairness is essential to avoid perpetuating social inequities. In this work, we address the challenges of reducing bias and improving accuracy in data-scarce environments, where the cost…
Generating labeled training datasets has become a major bottleneck in Machine Learning (ML) pipelines. Active ML aims to address this issue by designing learning algorithms that automatically and adaptively select the most informative…
Active learning aims to develop label-efficient algorithms by querying the most representative samples to be labeled by a human annotator. Current active learning techniques either rely on model uncertainty to select the most uncertain…
Supervised machine learning relies on the availability of good labelled data for model training. Labelled data is acquired by human annotation, which is a cumbersome and costly process, often requiring subject matter experts. Active…
In supervised learning, acquiring labeled training data for a predictive model can be very costly, but acquiring a large amount of unlabeled data is often quite easy. Active learning is a method of obtaining predictive models with high…
The availability of large labeled datasets is the key component for the success of deep learning. However, annotating labels on large datasets is generally time-consuming and expensive. Active learning is a research area that addresses the…
Anomalies are intuitively easy for human experts to understand, but they are hard to define mathematically. Therefore, in order to have performance guarantees in unsupervised anomaly detection, priors need to be assumed on what the…
We consider the problem of wisely using a limited budget to label a small subset of a large unlabeled dataset. We are motivated by the NLP problem of word sense disambiguation. For any word, we have a set of candidate labels from a…
Active learning is of great interest for many practical applications, especially in industry and the physical sciences, where there is a strong need to minimize the number of costly experiments necessary to train predictive models. However,…
Supervised machine learning based state-of-the-art computer vision techniques are in general data hungry and pose the challenges of not having adequate computing resources and of high costs involved in human labeling efforts. Training data…
Active learning focuses on choosing a subset of unlabeled data to be labeled. However, most such methods assume that a large subset of the data can be annotated. We are interested in low-budget active learning where only a small subset…
Classification is an important task in many fields including biomedical research and machine learning. Traditionally, a classification rule is constructed based a bunch of labeled data. Recently, due to technological innovation and…
Active learning emerged as an alternative to alleviate the effort to label huge amount of data for data hungry applications (such as image/video indexing and retrieval, autonomous driving, etc.). The goal of active learning is to…
Active learning typically focuses on training a model on few labeled examples alone, while unlabeled ones are only used for acquisition. In this work we depart from this setting by using both labeled and unlabeled data during model training…
Subspace clustering is a growing field of unsupervised learning that has gained much popularity in the computer vision community. Applications can be found in areas such as motion segmentation and face clustering. It assumes that data…