Related papers: Reference Distance Estimator

200 papers

Recent advancements in semi-supervised deep learning have introduced effective strategies for leveraging both labeled and unlabeled data to improve classification performance. This work proposes a semi-supervised framework that utilizes a…

Machine Learning · Computer Science 2025-05-21 Aydin Abedinia, Shima Tabakhi, Vahid Seydi

In this paper, we study statistical properties of semi-supervised learning, which is considered as an important problem in the community of machine learning. In the standard supervised learning, only the labeled data is observed. The…

Machine Learning · Statistics 2012-04-19 Masanori Kawakita, Takafumi Kanamori

Most existing distance metric learning approaches use fully labeled data to learn the sample similarities in an embedding space. We present a self-training framework, SLADE, to improve retrieval performance by leveraging additional…

Computer Vision and Pattern Recognition · Computer Science 2021-03-31 Jiali Duan, Yen-Liang Lin, Son Tran, Larry S. Davis, C.-C. Jay Kuo

Semi-supervised learning methods are motivated by the availability of large datasets with unlabeled features in addition to labeled data. Unlabeled data is, however, not guaranteed to improve classification performance and has in fact been…

Machine Learning · Statistics 2019-10-25 Xiuming Liu, Dave Zachariah, Johan Wågberg, Thomas B. Schön

Semisupervised methods inevitably invoke some assumption that links the marginal distribution of the features to the regression function of the label. Most commonly, the cluster or manifold assumptions are used which imply that the…

Statistics Theory · Mathematics 2011-12-02 Martin Azizyan, Aarti Singh, Larry Wasserman

How many labeled examples are needed to estimate a classifier's performance on a new dataset? We study the case where data is plentiful, but labels are expensive. We show that by making a few reasonable assumptions on the structure of the…

Machine Learning · Computer Science 2012-10-09 Peter Welinder, Max Welling, Pietro Perona

Distant supervision provides a means to create a large number of weakly labeled data at low cost for relation classification. However, the resulting labeled instances are very noisy, containing data with wrong labels. Many approaches have…

Computation and Language · Computer Science 2020-10-27 Zhenzhen Li, Jian-Yun Nie, Benyou Wang, Pan Du, Yuhan Zhang, Lixin Zou, Dongsheng Li

In this paper, we investigate the problem of classifying feature vectors with mutually independent but non-identically distributed elements. First, we show the importance of this problem. Next, we propose a classifier and derive an…

Machine Learning · Computer Science 2021-09-01 Farzad Shahrivari, Nikola Zlatanov

In this work we consider the task of relaxing the i.i.d. assumption in pattern recognition (or classification), aiming to make existing learning algorithms applicable to a wider range of tasks. Pattern recognition is guessing a discrete…

Machine Learning · Computer Science 2012-02-28 Daniil Ryabko

Distance-based unsupervised text classification is a method within text classification that leverages the semantic similarity between a label and a text to determine label relevance. This method provides numerous benefits, including fast…

Computation and Language · Computer Science 2025-10-14 Jens Van Nooten, Andriy Kosar, Guy De Pauw, Walter Daelemans

It remains difficult to evaluate machine learning classifiers in the absence of a large, labeled dataset. While labeled data can be prohibitively expensive or impossible to obtain, unlabeled data is plentiful. Here, we introduce…

Machine Learning · Computer Science 2025-10-15 Divya Shanmugam, Shuvom Sadhuka, Manish Raghavan, John Guttag, Bonnie Berger, Emma Pierson

The task of node classification is to infer unknown node labels, given the labels for some of the nodes along with the network structure and other node attributes. Typically, approaches for this task assume homophily, whereby neighboring…

Social and Information Networks · Computer Science 2021-09-15 Arpit Merchant, Michael Mathioudakis

Learning to rank -- producing a ranked list of items specific to a query and with respect to a set of supervisory items -- is a problem of general interest. The setting we consider is one in which no analytic description of what constitutes…

Compared to supervised learning, semi-supervised learning reduces the dependence of deep learning on a large number of labeled samples. In this work, we use a small number of labeled samples and perform data augmentation on unlabeled…

Machine Learning · Computer Science 2020-01-14 Qiuyu Zhu, Tiantian Li

Unsupervised text embedding methods, such as Skip-gram and Paragraph Vector, have been attracting increasing attention due to their simplicity, scalability, and effectiveness. However, compared to sophisticated deep learning architectures…

Computation and Language · Computer Science 2015-08-04 Jian Tang, Meng Qu, Qiaozhu Mei

In various situations one is given only the predictions of multiple classifiers over a large unlabeled test data. This scenario raises the following questions: Without any labeled data and without any a-priori knowledge about the…

Machine Learning · Statistics 2014-10-31 Ariel Jaffe, Boaz Nadler, Yuval Kluger

A straightforward application of semi-supervised machine learning to the problem of treatment effect estimation would be to consider data as "unlabeled" if treatment assignment and covariates are observed but outcomes are unobserved.…

Methodology · Statistics 2020-09-15 Andrew Herren, P. Richard Hahn

Multi-label classification is a type of supervised learning where an instance may belong to multiple labels simultaneously. Predicting each label independently has been criticized for not exploiting any correlation between labels. In this…

Machine Learning · Statistics 2023-10-25 Hyukjun Gweon, Matthias Schonlau, Stefan Steiner

Semi-supervised learning is a setting in which one has labeled and unlabeled data available. In this survey we explore different types of theoretical results when one uses unlabeled data in classification and regression tasks. Most methods…

Machine Learning · Computer Science 2020-07-31 Alexander Mey, Marco Loog

Existing semi-supervised learning (SSL) algorithms use a single weight to balance the loss of labeled and unlabeled examples, i.e., all unlabeled examples are equally weighted. But not all unlabeled data are equal. In this paper we study…

Machine Learning · Computer Science 2020-10-30 Zhongzheng Ren, Raymond A. Yeh, Alexander G. Schwing