Related papers: Learning from Rules Generalizing Labeled Exemplars

Exploiting Class Learnability in Noisy Data

In many domains, collecting sufficient labeled training data for supervised machine learning requires easily accessible but noisy sources, such as crowdsourcing services or tagged Web data. Noisy labels occur frequently in data sets…

Machine Learning · Computer Science 2018-11-16 Matthew Klawonn, Eric Heim, James Hendler

Robust Learning Under Label Noise With Iterative Noise-Filtering

We consider the problem of training a model under the presence of label noise. Current approaches identify samples with potentially incorrect labels and reduce their influence on the learning process by either assigning lower weights to…

Machine Learning · Computer Science 2019-06-04 Duc Tam Nguyen, Thi-Phuong-Nhung Ngo, Zhongyu Lou, Michael Klar, Laura Beggel, Thomas Brox

Denoising Multi-Source Weak Supervision for Neural Text Classification

We study the problem of learning neural text classifiers without using any labeled data, but only easy-to-provide rules as multiple weak supervision sources. This problem is challenging because rule-induced weak labels are often noisy and…

Computation and Language · Computer Science 2021-03-12 Wendi Ren, Yinghao Li, Hanting Su, David Kartchner, Cassie Mitchell, Chao Zhang

Label Selection Approach to Learning from Crowds

Supervised learning, especially supervised deep learning, requires large amounts of labeled data. One approach to collect large amounts of labeled data is by using a crowdsourcing platform where numerous workers perform the annotation…

Machine Learning · Computer Science 2023-08-22 Kosuke Yoshimura, Hisashi Kashima

Unsupervised Feature Learning in Remote Sensing

The need for labeled data is among the most common and well-known practical obstacles to deploying deep learning algorithms to solve real-world problems. The current generation of learning algorithms requires a large volume of data labeled…

Computer Vision and Pattern Recognition · Computer Science 2019-09-24 Aaron Reite, Scott Kangas, Zackery Steck, Steven Goley, Jonathan Von Stroh, Steven Forsyth

Label Denoising through Cross-Model Agreement

Learning from corrupted labels is very common in real-world machine-learning applications. Memorizing such noisy labels could affect the learning of the model, leading to sub-optimal performances. In this work, we propose a novel framework…

Machine Learning · Computer Science 2023-12-20 Yu Wang, Xin Xin, Zaiqiao Meng, Joemon Jose, Fuli Feng

How to Train Text Summarization Model with Weak Supervisions

Currently, machine learning techniques have seen significant success across various applications. Most of these techniques rely on supervision from human-generated labels or a mixture of noisy and imprecise labels from multiple sources.…

Computation and Language · Computer Science 2024-09-04 Yanbo Wang, Wenyu Chen, Shimin Shan

Uncover the Ground-Truth Relations in Distant Supervision: A Neural Expectation-Maximization Framework

Distant supervision for relation extraction enables one to effectively acquire structured relations out of very large text corpora with less human efforts. Nevertheless, most of the prior-art models for such tasks assume that the given text…

Computation and Language · Computer Science 2019-09-13 Junfan Chen, Richong Zhang, Yongyi Mao, Hongyu Guo, Jie Xu

Efficient learning of large sets of locally optimal classification rules

Conventional rule learning algorithms aim at finding a set of simple rules, where each rule covers as many examples as possible. In this paper, we argue that the rules found in this way may not be the optimal explanations for each of the…

Machine Learning · Computer Science 2023-01-27 Van Quoc Phuong Huynh, Johannes Fürnkranz, Florian Beck

Deep Explainable Learning with Graph Based Data Assessing and Rule Reasoning

Learning an explainable classifier often results in low accuracy model or ends up with a huge rule set, while learning a deep model is usually more capable of handling noisy data at scale, but with the cost of hard to explain the result and…

Artificial Intelligence · Computer Science 2022-11-11 Yuanlong Li, Gaopan Huang, Min Zhou, Chuan Fu, Honglin Qiao, Yan He

Learning to Impute: A General Framework for Semi-supervised Learning

Recent semi-supervised learning methods have shown to achieve comparable results to their supervised counterparts while using only a small portion of labels in image classification tasks thanks to their regularization strategies. In this…

Machine Learning · Computer Science 2020-09-25 Wei-Hong Li, Chuan-Sheng Foo, Hakan Bilen

Learning by Association - A versatile semi-supervised training method for neural networks

In many real-world scenarios, labeled data for a specific machine learning task is costly to obtain. Semi-supervised training methods make use of abundantly available unlabeled data and a smaller number of labeled examples. We propose a new…

Computer Vision and Pattern Recognition · Computer Science 2017-06-06 Philip Häusser, Alexander Mordvintsev, Daniel Cremers

Learning to Learn in a Semi-Supervised Fashion

To address semi-supervised learning from both labeled and unlabeled data, we present a novel meta-learning scheme. We particularly consider that labeled and unlabeled data share disjoint ground truth label sets, which can be seen tasks like…

Computer Vision and Pattern Recognition · Computer Science 2020-08-26 Yun-Chun Chen, Chao-Te Chou, Yu-Chiang Frank Wang

Distilling Rule-based Knowledge into Large Language Models

Large language models (LLMs) have shown incredible performance in completing various real-world tasks. The current paradigm of knowledge learning for LLMs is mainly based on learning from examples, in which LLMs learn the internal rule…

Computation and Language · Computer Science 2024-12-17 Wenkai Yang, Yankai Lin, Jie Zhou, Ji-Rong Wen

Probabilistic Decoupling of Labels in Classification

We investigate probabilistic decoupling of labels supplied for training, from the underlying classes for prediction. Decoupling enables an inference scheme general enough to implement many classification problems, including supervised,…

Machine Learning · Computer Science 2019-05-30 Jeppe Nørregaard, Lars Kai Hansen

Learning to Denoise Distantly-Labeled Data for Entity Typing

Distantly-labeled data can be used to scale up training of statistical models, but it is typically noisy and that noise can vary with the distant labeling technique. In this work, we propose a two-stage procedure for handling this type of…

Computation and Language · Computer Science 2019-05-07 Yasumasa Onoe, Greg Durrett

Learning From Noisy Singly-labeled Data

Supervised learning depends on annotated examples, which are taken to be the ground truth. But these labels often come from noisy crowdsourcing platforms, like Amazon Mechanical Turk. Practitioners typically collect multiple labels…

Machine Learning · Computer Science 2018-05-22 Ashish Khetan, Zachary C. Lipton, Anima Anandkumar

On information captured by neural networks: connections with memorization and generalization

Despite the popularity and success of deep learning, there is limited understanding of when, how, and why neural networks generalize to unseen examples. Since learning can be seen as extracting information from data, we formally study…

Machine Learning · Computer Science 2023-06-29 Hrayr Harutyunyan

Learning with Neighbor Consistency for Noisy Labels

Recent advances in deep learning have relied on large, labelled datasets to train high-capacity models. However, collecting large datasets in a time- and cost-efficient manner often results in label noise. We present a method for learning…

Computer Vision and Pattern Recognition · Computer Science 2022-07-07 Ahmet Iscen, Jack Valmadre, Anurag Arnab, Cordelia Schmid

Data Consistency for Weakly Supervised Learning

In many applications, training machine learning models involves using large amounts of human-annotated data. Obtaining precise labels for the data is expensive. Instead, training with weak supervision provides a low-cost alternative. We…

Machine Learning · Computer Science 2022-02-09 Chidubem Arachie, Bert Huang