Related papers: Binary Classification with XOR Queries: Fundamenta…

200 papers

This paper models the crowdsourced labeling/classification problem as a sparsely encoded source coding problem, where each query answer, regarded as a code bit, is the XOR of a small number of labels, as source information bits. In this…

Machine Learning · Statistics 2020-02-03 James Chin-Jen Pang, Hessam Mahdavifar, S. Sandeep Pradhan

We study the problem of clustering a set of items from binary user feedback. Such a problem arises in crowdsourcing platforms solving large-scale labeling tasks with minimal effort put on the users. For example, in some of the recent…

Machine Learning · Statistics 2024-12-20 Kaito Ariu, Jungseul Ok, Alexandre Proutiere, Se-Young Yun

The inability to linearly classify XOR has motivated much of deep learning. We revisit this age-old problem and show that linear classification of XOR is indeed possible. Instead of separating data between halfspaces, we propose a slightly…

Machine Learning · Computer Science 2024-06-21 Matthew Lau, Ismaila Seck, Athanasios P Meliopoulos, Wenke Lee, Eugene Ndiaye

Consider a query-based data acquisition problem that aims to recover the values of $k$ binary variables from parity (XOR) measurements of chosen subsets of the variables. Assume the response model where only a randomly selected subset of…

Information Theory · Computer Science 2019-11-11 Hye Won Chung, Ji Oon Lee, Doyeon Kim, Alfred O. Hero

We investigate the problem of classification in the presence of unknown class-conditional label noise in which the labels observed by the learner have been corrupted with some unknown class dependent probability. In order to obtain finite…

Machine Learning · Statistics 2019-06-11 Henry W J Reeve, Ata Kaban

Noisy $k$-XOR is a basic average-case inference problem in which one observes random noisy $k$-ary parity constraints and seeks to recover, or more weakly, detect, a hidden Boolean assignment. A central question is to characterize the…

Computational Complexity · Computer Science 2026-04-14 Songtao Mao

Emerging applications of sensor networks for detection sometimes suggest that classical problems ought to be revisited under new assumptions. This is the case of binary hypothesis testing with independent - but not necessarily identically…

Information Theory · Computer Science 2019-03-27 Stefano Marano, Peter Willett

We consider the problem of cost-optimal utilization of a crowdsourcing platform for binary, unsupervised classification of a collection of items, given a prescribed error threshold. Workers on the crowdsourcing platform are assumed to be…

Machine Learning · Computer Science 2022-07-06 Yashvardhan Didwania, Jayakrishnan Nair, N. Hemachandra

Learning with label dependent label noise has been extensively explored in both theory and practice; however, dealing with instance (i.e., feature) and label dependent label noise continues to be a challenging task. The difficulty arises…

Machine Learning · Statistics 2023-06-07 Hyungki Im, Paul Grigas

We consider the problem of estimating how well a model class is capable of fitting a distribution of labeled data. We show that it is often possible to accurately estimate this "learnability" even when given an amount of data that is too…

Machine Learning · Computer Science 2019-03-26 Weihao Kong, Gregory Valiant

Crowdsourcing systems have emerged as an effective platform for labeling data at relatively low cost by using non-expert workers. Inferring correct labels from multiple noisy answers on data, however, has been a challenging problem, since…

Human-Computer Interaction · Computer Science 2023-09-14 Doyeon Kim, Jeonghwan Lee, Hye Won Chung

Image classification systems recently made a giant leap with the advancement of deep neural networks. However, these systems require an excessive amount of labeled data to be adequately trained. Gathering a correctly annotated dataset is…

Machine Learning · Computer Science 2021-01-19 Görkem Algan, Ilkay Ulusoy

We consider the unsupervised learning problem of assigning labels to unlabeled data. A naive approach is to use clustering methods, but this works well only when data is properly clustered and each cluster corresponds to an underlying…

Machine Learning · Computer Science 2013-05-02 Marthinus Christoffel du Plessis, Masashi Sugiyama

To collect large scale annotated data, it is inevitable to introduce label noise, i.e., incorrect class labels. To be robust against label noise, many successful methods rely on the noisy classifiers (i.e., models trained on the noisy…

Computer Vision and Pattern Recognition · Computer Science 2020-11-23 Songzhu Zheng, Pengxiang Wu, Aman Goswami, Mayank Goswami, Dimitris Metaxas, Chao Chen

With the explosion of massive, widely available unlabeled data in the past years, finding label and time efficient, robust learning algorithms has become ever more important in theory and in practice. We study the paradigm of active…

Machine Learning · Computer Science 2020-01-17 Max Hopkins, Daniel Kane, Shachar Lovett, Gaurav Mahajan

The evaluation of noisy binary classifiers on unlabeled data is treated as a streaming task: given a data sketch of the decisions by an ensemble, estimate the true prevalence of the labels as well as each classifier's accuracy on them. Two…

Machine Learning · Statistics 2023-09-11 Andrés Corrada-Emmanuel

Deep Metric Learning (DML) plays a critical role in various machine learning tasks. However, most existing deep metric learning methods with binary similarity are sensitive to noisy labels, which are widely present in real-world data. Since…

Computer Vision and Pattern Recognition · Computer Science 2021-11-02 Jiexi Yan, Lei Luo, Cheng Deng, Heng Huang

We consider crowdsourced labeling under a $d$-type worker-task specialization model, where each worker and task is associated with one particular type among a finite set of types and a worker provides a more reliable answer to tasks of the…

Human-Computer Interaction · Computer Science 2021-06-10 Doyeon Kim, Hye Won Chung

A common approach in positive-unlabeled learning is to train a classification model between labeled and unlabeled data. This strategy is in fact known to give an optimal classifier under mild conditions; however, it results in biased…

Machine Learning · Statistics 2017-02-03 Shantanu Jain, Martha White, Predrag Radivojac

This work considers the problem of the noisy binary search in a sorted array. The noise is modeled by a parameter $p$ that dictates that a comparison can be incorrect with probability $p$, independently of other queries. We state two types…

Data Structures and Algorithms · Computer Science 2025-02-27 Dariusz Dereniowski, Aleksander Łukasiewicz, Przemysław Uznański