Related papers: A Benchmark Generative Probabilistic Model for Wea…

200 papers

Reward models (RM) capture the values and preferences of humans and play a central role in Reinforcement Learning with Human Feedback (RLHF) to align pretrained large language models (LLMs). Traditionally, training these models relies on…

Machine Learning · Computer Science 2024-09-12 Yifei He, Haoxiang Wang, Ziyan Jiang, Alexandros Papangelis, Han Zhao

The lack of labeled data is a common challenge in speech classification tasks, particularly those requiring extensive subjective assessment, such as cognitive state classification. In this work, we propose a Semi-Supervised Learning (SSL)…

Audio and Speech Processing · Electrical Eng. & Systems 2025-05-01 Yuanchao Li, Zixing Zhang, Jing Han, Peter Bell, Catherine Lai

Semi- and weakly-supervised learning have recently attracted considerable attention in the object detection literature since they can alleviate the cost of annotation needed to successfully train deep learning models. State-of-art…

Computer Vision and Pattern Recognition · Computer Science 2022-06-20 Akhil Meethal, Marco Pedersoli, Zhongwen Zhu, Francisco Perdigon Romero, Eric Granger

Weak supervision is a popular framework for overcoming the labeled data bottleneck: the need to obtain labels for training data. In weak supervision, multiple noisy-but-cheap sources are used to provide guesses of the label and are…

Machine Learning · Computer Science 2025-02-19 Tzu-Heng Huang, Catherine Cao, Spencer Schoenberg, Harit Vishwakarma, Nicholas Roberts, Frederic Sala

Weakly supervised learning with scribble annotations uses sparse user-drawn strokes to indicate segmentation labels on a small subset of pixels. This annotation reduces the cost of dense pixel-wise labeling, but suffers inherently from…

Computer Vision and Pattern Recognition · Computer Science 2026-02-13 Yeva Gabrielyan, Varduhi Yeghiazaryan, Irina Voiculescu

Real-world data is frequently noisy and ambiguous. In crowdsourcing, for example, human annotators may assign conflicting class labels to the same instances. Partial-label learning (PLL) addresses this challenge by training classifiers when…

Machine Learning · Computer Science 2026-01-12 Tobias Fuchs, Nadja Klein

Semi-supervised multi-label learning (SSMLL) is a powerful framework for leveraging unlabeled data to reduce the expensive cost of collecting precise multi-label annotations. Unlike semi-supervised learning, one cannot select the most…

Machine Learning · Computer Science 2024-12-30 Jia-Hao Xiao, Ming-Kun Xie, Heng-Bo Fan, Gang Niu, Masashi Sugiyama, Sheng-Jun Huang

In supervised machine learning, models are typically trained using data with hard labels, i.e., definite assignments of class membership. This traditional approach, however, does not take the inherent uncertainty in these labels into…

Machine Learning · Computer Science 2024-09-25 Sjoerd de Vries, Dirk Thierens

Pseudo-labels are confident predictions made on unlabeled target data by a classifier trained on labeled source data. They are widely used for adapting a model to unlabeled data, e.g., in a semi-supervised learning setting. Our key insight…

Machine Learning · Computer Science 2022-04-22 Xudong Wang, Zhirong Wu, Long Lian, Stella X. Yu

The lightweight semi-supervised learning (LSL) strategy provides an effective approach of conserving labeled samples and minimizing model inference costs. Prior research has effectively applied knowledge transfer learning and co-training…

Semi-supervised few-shot learning (SSFSL) formulates real-world applications like "auto-annotation", as it aims to learn a model over a few labeled and abundant unlabeled examples to annotate the unlabeled ones. Despite the availability…

Computer Vision and Pattern Recognition · Computer Science 2025-12-12 Tian Liu, Anwesha Basu, James Caverlee, Shu Kong

Partial label learning (PLL) is a class of weakly supervised learning where each training instance consists of a data point and a set of candidate labels containing a unique ground truth label. To tackle this problem, a majority of current…

Machine Learning · Computer Science 2021-02-09 Junghoon Seo, Joon Suk Huh

Labeling training data is one of the most costly bottlenecks in developing machine learning-based applications. We present a first-of-its-kind study showing how existing knowledge resources from across an organization can be used as weak…

Software vulnerability detection has emerged as a significant concern in the field of software security recently, capturing the attention of numerous researchers and developers. Most previous approaches focus on coarse-grained vulnerability…

Software Engineering · Computer Science 2025-09-16 Wenchao Gu, Yupan Chen, Yanlin Wang, Hongyu Zhang, Cuiyun Gao, Michael R. Lyu

We present a general methodology for using unlabeled data to design semi-supervised learning (SSL) variants of the Empirical Risk Minimization (ERM) learning process. Focusing on generalized linear regression, we analyze of the…

Machine Learning · Statistics 2022-03-08 Oren Yuval, Saharon Rosset

Effective document reranking is essential for improving search relevance across diverse applications. While Large Language Models (LLMs) excel at reranking due to their deep semantic understanding and reasoning, their high computational…

Computation and Language · Computer Science 2025-10-03 Dimitar Peshevski, Kiril Blazhevski, Martin Popovski, Gjorgji Madjarov

We propose a novel semi-supervised learning (SSL) method that adopts selective training with pseudo labels. In our method, we generate hard pseudo-labels and also estimate their confidence, which represents how likely each pseudo-label is…

Machine Learning · Computer Science 2021-03-16 Masato Ishii

The past few years have witnessed a remarkable advance in deep learning for EEG-based sleep stage classification (SSC). However, the success of these models is attributed to possessing a massive amount of labeled data for training, limiting…

Signal Processing · Electrical Eng. & Systems 2022-10-14 Emadeldeen Eldele, Mohamed Ragab, Zhenghua Chen, Min Wu, Chee-Keong Kwoh, Xiaoli Li

State-of-the-art deep neural networks require large-scale labeled training data that is often expensive to obtain or not available for many tasks. Weak supervision in the form of domain-specific rules has been shown to be useful in such…

Computation and Language · Computer Science 2021-04-13 Giannis Karamanolakis, Subhabrata Mukherjee, Guoqing Zheng, Ahmed Hassan Awadallah

NLP benchmarks rely on standardized datasets for training and evaluating models and are crucial for advancing the field. Traditionally, expert annotations ensure high-quality labels; however, the cost of expert annotation does not scale…

Computation and Language · Computer Science 2025-09-15 Omer Nahum, Nitay Calderon, Orgad Keller, Idan Szpektor, Roi Reichart