Related papers: Nested-Wasserstein Self-Imitation Learning for Seq…

200 papers

Sequence generation with reinforcement learning (RL) has received significant attention recently. However, a challenge with such methods is the sparse-reward problem in the RL training process, in which a scalar guiding signal is often only…

Computation and Language · Computer Science 2018-11-05 Ruiyi Zhang, Changyou Chen, Zhe Gan, Wenlin Wang, Liqun Chen, Dinghan Shen, Guoyin Wang, Lawrence Carin

Applying Reinforcement Learning (RL) to sequence generation models enables the direct optimization of long-term rewards (e.g., BLEU and human feedback), but typically requires large-scale sampling over a space of action sequences.…

Computation and Language · Computer Science 2023-08-07 Chenglong Wang, Hang Zhou, Yimin Hu, Yifu Huo, Bei Li, Tongran Liu, Tong Xiao, Jingbo Zhu

Self-paced reinforcement learning (RL) aims to improve the data efficiency of learning by automatically creating sequences, namely curricula, of probability distributions over contexts. However, existing techniques for self-paced RL fail in…

Machine Learning · Computer Science 2023-05-29 Cevahir Koprulu, Ufuk Topcu

Despite the success of sequence-to-sequence approaches in automatic speech recognition (ASR) systems, the models still suffer from several problems, mainly due to the mismatch between the training and inference conditions. In the…

Computation and Language · Computer Science 2018-03-01 Andros Tjandra, Sakriani Sakti, Satoshi Nakamura

Reinforcement learning (RL) has shown its strength in challenging sequential decision-making problems. The reward function in RL is crucial to the learning performance, as it serves as a measure of the task completion degree. In real-world…

Machine Learning · Computer Science 2024-02-13 Siyuan Li, Shijie Han, Yingnan Zhao, By Liang, Peng Liu

A novel optimization approach is proposed for application to policy gradient methods and evolution strategies for reinforcement learning (RL). The procedure uses a computationally efficient Wasserstein natural gradient (WNG) descent that…

Machine Learning · Computer Science 2021-03-19 Ted Moskovitz, Michael Arbel, Ferenc Huszar, Arthur Gretton

Offline reinforcement learning (RL) aims to learn an optimal policy from a static dataset, making it particularly valuable in scenarios where data collection is costly, such as robotics. A major challenge in offline RL is distributional…

Machine Learning · Computer Science 2025-07-16 Motoki Omura, Yusuke Mukuta, Kazuki Ota, Takayuki Osa, Tatsuya Harada

The empirical success of distributional reinforcement learning (RL) highly relies on the choice of distribution divergence equipped with an appropriate distribution representation. In this paper, we propose Sinkhorn distributional…

Machine Learning · Computer Science 2024-10-16 Ke Sun, Yingnan Zhao, Wulong Liu, Bei Jiang, Linglong Kong

Reinforcement Learning (RL) is a computational approach to reward-driven learning in sequential decision problems. It implements the discovery of optimal actions by learning from an agent interacting with an environment rather than from…

Methodology · Statistics 2022-10-06 Mauricio Tec, Yunshan Duan, Peter Müller

Recent advancements in reinforcement learning (RL) have achieved great success in fine-tuning diffusion-based generative models. However, fine-tuning continuous flow-based generative models to align with arbitrary user-defined reward…

Machine Learning · Computer Science 2025-02-11 Jiajun Fan, Shuaike Shen, Chaoran Cheng, Yuxin Chen, Chumeng Liang, Ge Liu

Attention-based sequential recommendation methods have shown promise in accurately capturing users' evolving interests from their past interactions. Recent research has also explored the integration of reinforcement learning (RL) into these…

Machine Learning · Computer Science 2024-04-19 Melissa Mozifian, Tristan Sylvain, Dave Evans, Lili Meng

Sequence-to-sequence models are commonly trained via maximum likelihood estimation (MLE). However, standard MLE training considers a word-level objective, predicting the next word given the previous ground-truth partial sentence. This…

Computation and Language · Computer Science 2019-01-21 Liqun Chen, Yizhe Zhang, Ruiyi Zhang, Chenyang Tao, Zhe Gan, Haichao Zhang, Bai Li, Dinghan Shen, Changyou Chen, Lawrence Carin

Maximum likelihood estimation (MLE) is the predominant algorithm for training text generation models. This paradigm relies on direct supervision examples, which is not applicable to many emerging applications, such as generating adversarial…

Computation and Language · Computer Science 2022-10-25 Han Guo, Bowen Tan, Zhengzhong Liu, Eric P. Xing, Zhiting Hu

Recent advances have demonstrated the effectiveness of Reinforcement Learning (RL) in improving the reasoning capabilities of Large Language Models (LLMs). However, existing works inevitably rely on high-quality instructions and verifiable…

Computation and Language · Computer Science 2026-01-27 Wenkai Fang, Shunyu Liu, Yang Zhou, Kongcheng Zhang, Tongya Zheng, Kaixuan Chen, Mingli Song, Dacheng Tao

Reinforcement learning (RL) has been effective for post-training autoregressive (AR) language models, but extending these methods to diffusion language models (DLMs) is challenging due to intractable sequence-level likelihoods. Existing…

Learning with an objective to minimize the mismatch with a reference distribution has been shown to be useful for generative modeling and imitation learning. In this paper, we investigate whether one such objective, the Wasserstein-1…

Machine Learning · Computer Science 2021-10-29 Ishan Durugkar, Mauricio Tec, Scott Niekum, Peter Stone

Reinforcement learning (RL) is an effective approach to learn an optimal dialog policy for task-oriented visual dialog systems. A common practice is to apply RL on a neural sequence-to-sequence (seq2seq) framework with the action space…

Computation and Language · Computer Science 2019-10-30 Mingyang Zhou, Josh Arnold, Zhou Yu

Named Entity Recognition (NER) is a well-studied task in natural language processing. Recently, nested NER has attracted more attention because of its practicality and difficulty. Existing works for nested NER ignore the…

Computation and Language · Computer Science 2023-05-15 Yawen Yang, Xuming Hu, Fukun Ma, Shu'ang Li, Aiwei Liu, Lijie Wen, Philip S. Yu

The majority of language model training builds on imitation learning. It covers pretraining, supervised fine-tuning, and affects the starting conditions for reinforcement learning from human feedback (RLHF). The simplicity and scalability…

Optimal Transport has sparked vivid interest in recent years, in particular thanks to the Wasserstein distance, which provides a geometrically sensible and intuitive way of comparing probability measures. For computational reasons, the…

Machine Learning · Computer Science 2024-03-19 Eloi Tanguy