Related papers: Smoothing Advantage Learning

Robust Action Gap Increasing with Clipped Advantage Learning

Advantage Learning (AL) seeks to increase the action gap between the optimal action and its competitors, so as to improve the robustness to estimation errors. However, the method becomes problematic when the optimal action induced by the…

Machine Learning · Computer Science 2022-03-23 Zhe Zhang, Yaozhong Gan, Xiaoyang Tan

Smooth Sailing: Improving Active Learning for Pre-trained Language Models with Representation Smoothness Analysis

Developed to alleviate prohibitive labeling costs, active learning (AL) methods aim to reduce label complexity in supervised learning. While recent work has demonstrated the benefit of using AL in combination with large pre-trained language…

Machine Learning · Computer Science 2023-10-24 Josip Jukić, Jan Šnajder

Self-Imitation Advantage Learning

Self-imitation learning is a Reinforcement Learning (RL) method that encourages actions whose returns were higher than expected, which helps in hard exploration and sparse reward problems. It was shown to improve the performance of…

Machine Learning · Computer Science 2020-12-23 Johan Ferret, Olivier Pietquin, Matthieu Geist

Practical Obstacles to Deploying Active Learning

Active learning (AL) is a widely-used training strategy for maximizing predictive performance subject to a fixed annotation budget. In AL one iteratively selects training examples for annotation, often those for which the current model is…

Machine Learning · Computer Science 2019-11-05 David Lowell, Zachary C. Lipton, Byron C. Wallace

Stable Adversarial Learning under Distributional Shifts

Machine learning algorithms with empirical risk minimization are vulnerable under distributional shifts due to the greedy adoption of all the correlations found in training data. Recently, there are robust learning methods aiming at this…

Machine Learning · Computer Science 2021-05-12 Jiashuo Liu, Zheyan Shen, Peng Cui, Linjun Zhou, Kun Kuang, Bo Li, Yishi Lin

SALR: Sharpness-aware Learning Rate Scheduler for Improved Generalization

In an effort to improve generalization in deep learning and automate the process of learning rate scheduling, we propose SALR: a sharpness-aware learning rate update technique designed to recover flat minimizers. Our method dynamically…

Machine Learning · Computer Science 2023-07-04 Xubo Yue, Maher Nouiehed, Raed Al Kontar

Smoothed Action Value Functions for Learning Gaussian Policies

State-action value functions (i.e., Q-values) are ubiquitous in reinforcement learning (RL), giving rise to popular algorithms such as SARSA and Q-learning. We propose a new notion of action value defined by a Gaussian smoothed version of…

Machine Learning · Computer Science 2018-07-26 Ofir Nachum, Mohammad Norouzi, George Tucker, Dale Schuurmans

Active Learning with Effective Scoring Functions for Semi-Supervised Temporal Action Localization

Temporal Action Localization (TAL) aims to predict both action category and temporal boundary of action instances in untrimmed videos, i.e., start and end time. Fully-supervised solutions are usually adopted in most existing works, and…

Computer Vision and Pattern Recognition · Computer Science 2022-12-02 Ding Li, Xuebing Yang, Yongqiang Tang, Chenyang Zhang, Wensheng Zhang

Support-weighted Adversarial Imitation Learning

Adversarial Imitation Learning (AIL) is a broad family of imitation learning methods designed to mimic expert behaviors from demonstrations. While AIL has shown state-of-the-art performance on imitation learning with only a small number of…

Machine Learning · Computer Science 2020-02-21 Ruohan Wang, Carlo Ciliberto, Pierluigi Amadori, Yiannis Demiris

Distributionally Robust Learning with Stable Adversarial Training

Machine learning algorithms with empirical risk minimization are vulnerable under distributional shifts due to the greedy adoption of all the correlations found in training data. There is an emerging literature on tackling this problem by…

Machine Learning · Computer Science 2022-11-22 Jiashuo Liu, Zheyan Shen, Peng Cui, Linjun Zhou, Kun Kuang, Bo Li

Active Learning for Regression by Inverse Distance Weighting

This paper proposes an active learning (AL) algorithm to solve regression problems based on inverse-distance weighting functions for selecting the feature vectors to query. The algorithm has the following features: (i) supports both…

Machine Learning · Computer Science 2022-12-15 Alberto Bemporad

Adversarial Sampling for Active Learning

This paper proposes ASAL, a new GAN-based active learning method that generates high-entropy samples. Instead of directly annotating the synthetic samples, ASAL searches for similar samples in the pool and includes them for training. Hence,…

Machine Learning · Computer Science 2019-12-24 Christoph Mayer, Radu Timofte

SAL: Selective Adaptive Learning for Backpropagation-Free Training with Sparsification

Standard deep learning relies on Backpropagation (BP), which is constrained by biologically implausible weight symmetry and suffers from significant gradient interference within dense representations. To mitigate these bottlenecks, we…

Machine Learning · Computer Science 2026-01-30 Fanping Liu, Hua Yang, Jiasi Zou

Adaptive Mixing of Auxiliary Losses in Supervised Learning

In several supervised learning scenarios, auxiliary losses are used in order to introduce additional information or constraints into the supervised learning objective. For instance, knowledge distillation aims to mimic outputs of a powerful…

Machine Learning · Computer Science 2022-12-08 Durga Sivasubramanian, Ayush Maheshwari, Pradeep Shenoy, Prathosh AP, Ganesh Ramakrishnan

Class Adaptive Network Calibration

Recent studies have revealed that, beyond conventional accuracy, calibration should also be considered for training modern deep neural networks. To address miscalibration during learning, some methods have explored different penalty…

Computer Vision and Pattern Recognition · Computer Science 2023-04-13 Bingyuan Liu, Jérôme Rony, Adrian Galdran, Jose Dolz, Ismail Ben Ayed

SALT: Step-level Advantage Assignment for Long-horizon Agents via Trajectory Graph

Large Language Models (LLMs) have demonstrated remarkable capabilities, enabling language agents to excel at single-turn tasks. However, their application to complex, multi-step, and long-horizon tasks remains challenging. While…

Machine Learning · Computer Science 2025-10-24 Jiazheng Li, Yawei Wang, David Yan, Yijun Tian, Zhichao Xu, Huan Song, Panpan Xu, Lin Lee Cheong

Successive Over Relaxation Q-Learning

In a discounted reward Markov Decision Process (MDP), the objective is to find the optimal value function, i.e., the value function corresponding to an optimal policy. This problem reduces to solving a functional equation known as the…

Machine Learning · Computer Science 2019-06-17 Chandramouli Kamanchi, Raghuram Bharadwaj Diddigi, Shalabh Bhatnagar

Amortized Active Learning for Nonparametric Functions

Active learning (AL) is a sequential learning scheme aiming to select the most informative data. AL reduces data consumption and avoids the cost of labeling large amounts of data. However, AL trains the model and solves an acquisition…

Machine Learning · Computer Science 2025-01-13 Cen-You Li, Marc Toussaint, Barbara Rakitsch, Christoph Zimmer

SBEED: Convergent Reinforcement Learning with Nonlinear Function Approximation

When function approximation is used, solving the Bellman optimality equation with stability guarantees has remained a major open problem in reinforcement learning for decades. The fundamental difficulty is that the Bellman operator may…

Machine Learning · Computer Science 2018-06-07 Bo Dai, Albert Shaw, Lihong Li, Lin Xiao, Niao He, Zhen Liu, Jianshu Chen, Le Song

ARM: Advantage Reward Modeling for Long-Horizon Manipulation

Long-horizon robotic manipulation remains challenging for reinforcement learning (RL) because sparse rewards provide limited guidance for credit assignment. Practical policy improvement thus relies on richer intermediate supervision, such…

Robotics · Computer Science 2026-04-22 Yiming Mao, Zixi Yu, Weixin Mao, Yinhao Li, Qirui Hu, Zihan Lan, Minzhao Zhu, Hua Chen