Related papers: ConstrainedSQL: Training LLMs for Text2SQL via Con…

Reasoning-SQL: Reinforcement Learning with SQL Tailored Partial Rewards for Reasoning-Enhanced Text-to-SQL

Text-to-SQL is a challenging task involving multiple reasoning-intensive subtasks, including natural language understanding, database schema comprehension, and precise SQL query formulation. Existing approaches often rely on handcrafted…

Machine Learning · Computer Science 2025-04-02 Mohammadreza Pourreza , Shayan Talaei , Ruoxi Sun , Xingchen Wan , Hailong Li , Azalia Mirhoseini , Amin Saberi , Sercan Ö. Arik

Arctic-Text2SQL-R1: Simple Rewards, Strong Reasoning in Text-to-SQL

Translating natural language into SQL (Text2SQL) is a longstanding challenge at the intersection of natural language understanding and structured data access. While large language models (LLMs) have significantly improved fluency in SQL…

Computation and Language · Computer Science 2026-01-14 Zhewei Yao , Guoheng Sun , Lukasz Borchmann , Gaurav Nuti , Zheyu Shen , Minghang Deng , Bohan Zhai , Hao Zhang , Ang Li , Yuxiong He

Reinforcement Learning with $\omega$-Regular Objectives and Constraints

Reinforcement learning (RL) commonly relies on scalar rewards with limited ability to express temporal, conditional, or safety-critical goals, and can lead to reward hacking. Temporal logic expressible via the more general class of…

Artificial Intelligence · Computer Science 2025-11-26 Dominik Wagner , Leon Witzman , Luke Ong

A Simple Reward-free Approach to Constrained Reinforcement Learning

In constrained reinforcement learning (RL), a learning agent seeks to not only optimize the overall reward but also satisfy the additional safety, diversity, or budget constraints. Consequently, existing constrained RL solutions require…

Machine Learning · Computer Science 2021-07-13 Sobhan Miryoosefi , Chi Jin

SQL-R1: Training Natural Language to SQL Reasoning Model By Reinforcement Learning

Natural Language to SQL (NL2SQL) enables intuitive interactions with databases by transforming natural language queries into structured SQL statements. Despite recent advancements in enhancing human-computer interaction within database…

Databases · Computer Science 2025-10-10 Peixian Ma , Xialie Zhuang , Chengjin Xu , Xuhui Jiang , Ran Chen , Jian Guo

Reinforcement Learning With Temporal Logic Rewards

Reinforcement learning (RL) depends critically on the choice of reward functions used to capture the desired behavior and constraints of a robot. Usually, these are handcrafted by an expert designer and represent heuristics for relatively…

Artificial Intelligence · Computer Science 2017-03-03 Xiao Li , Cristian-Ioan Vasile , Calin Belta

Constrained Meta Reinforcement Learning with Provable Test-Time Safety

Meta reinforcement learning (RL) allows agents to leverage experience across a distribution of tasks on which the agent can train at will, enabling faster learning of optimal policies on new test tasks. Despite its success in improving…

Machine Learning · Computer Science 2026-05-27 Tingting Ni , Maryam Kamgarpour

Reinforcement Learning to Rank Using Coarse-grained Rewards

Learning to rank (LTR) plays a crucial role in various Information Retrieval (IR) tasks. Although supervised LTR methods based on fine-grained relevance labels (e.g., document-level annotations) have achieved significant success, their…

Information Retrieval · Computer Science 2025-08-21 Yiteng Tu , Zhichao Xu , Tao Yang , Weihang Su , Yujia Zhou , Yiqun Liu , Fen Lin , Qin Liu , Qingyao Ai

Think2SQL: Reinforce LLM Reasoning Capabilities for Text2SQL

Large Language Models (LLMs) can translate natural language into SQL, but small models struggle with multi-table and complex queries in Zero-Shot Learning (ZSL) settings. While Supervised Fine-Tuning (SFT) helps, it falls short for harder…

Machine Learning · Computer Science 2026-05-05 Simone Papicchio , Simone Rossi , Luca Cagliero , Paolo Papotti

ConfClip: Confidence-Weighted and Clipped Reward for Reinforcement Learning in LLMs

Reinforcement learning (RL) has become a standard paradigm for refining large language models (LLMs) beyond pre-training and instruction tuning. A prominent line of work is RL with verifiable rewards (RLVR), which leverages automatically…

Machine Learning · Computer Science 2025-09-23 Bonan Zhang , Zhongqi Chen , Bowen Song , Qinya Li , Fan Wu , Guihai Chen

Probabilistic Satisfaction of Temporal Logic Constraints in Reinforcement Learning via Adaptive Policy-Switching

Constrained Reinforcement Learning (CRL) is a subset of machine learning that introduces constraints into the traditional reinforcement learning (RL) framework. Unlike conventional RL which aims solely to maximize cumulative rewards, CRL…

Artificial Intelligence · Computer Science 2024-12-02 Xiaoshan Lin , Sadık Bera Yüksel , Yasin Yazıcıoğlu , Derya Aksaray

Teacher Forcing Recovers Reward Functions for Text Generation

Reinforcement learning (RL) has been widely used in text generation to alleviate the exposure bias issue or to utilize non-parallel datasets. The reward function plays an important role in making RL training successful. However, previous…

Machine Learning · Computer Science 2023-01-19 Yongchang Hao , Yuxin Liu , Lili Mou

Reinforcement Learning Agent Training with Goals for Real World Tasks

Reinforcement Learning (RL) is a promising approach for solving various control, optimization, and sequential decision making tasks. However, designing reward functions for complex tasks (e.g., with multiple objectives and safety…

Artificial Intelligence · Computer Science 2021-07-23 Xuan Zhao , Marcos Campos

Reward Constrained Interactive Recommendation with Natural Language Feedback

Text-based interactive recommendation provides richer user feedback and has demonstrated advantages over traditional interactive recommender systems. However, recommendations can easily violate preferences of users from their past…

Computation and Language · Computer Science 2020-05-05 Ruiyi Zhang , Tong Yu , Yilin Shen , Hongxia Jin , Changyou Chen , Lawrence Carin

TableGPT-R1: Advancing Tabular Reasoning Through Reinforcement Learning

Tabular data serves as the backbone of modern data analysis and scientific research. While Large Language Models (LLMs) fine-tuned via Supervised Fine-Tuning (SFT) have significantly improved natural language interaction with such…

Machine Learning · Computer Science 2025-12-29 Saisai Yang , Qingyi Huang , Jing Yuan , Liangyu Zha , Kai Tang , Yuhang Yang , Ning Wang , Yucheng Wei , Liyao Li , Wentao Ye , Hao Chen , Tao Zhang , Junlin Zhou , Haobo Wang , Gang Chen , Junbo Zhao

Robust Constrained Reinforcement Learning

Constrained reinforcement learning is to maximize the expected reward subject to constraints on utilities/costs. However, the training environment may not be the same as the test one, due to, e.g., modeling error, adversarial attack,…

Machine Learning · Computer Science 2022-09-16 Yue Wang , Fei Miao , Shaofeng Zou

Reinforcement Learning Enhanced LLMs: A Survey

Reinforcement learning (RL) enhanced large language models (LLMs), particularly exemplified by DeepSeek-R1, have exhibited outstanding performance. Despite the effectiveness in improving LLM capabilities, its implementation remains highly…

Computation and Language · Computer Science 2025-02-25 Shuhe Wang , Shengyu Zhang , Jie Zhang , Runyi Hu , Xiaoya Li , Tianwei Zhang , Jiwei Li , Fei Wu , Guoyin Wang , Eduard Hovy

Text2Reward: Reward Shaping with Language Models for Reinforcement Learning

Designing reward functions is a longstanding challenge in reinforcement learning (RL); it requires specialized knowledge or domain data, leading to high costs for development. To address this, we introduce Text2Reward, a data-free framework…

Machine Learning · Computer Science 2024-05-28 Tianbao Xie , Siheng Zhao , Chen Henry Wu , Yitao Liu , Qian Luo , Victor Zhong , Yanchao Yang , Tao Yu

An Empirical Study on Reinforcement Learning for Reasoning-Search Interleaved LLM Agents

Reinforcement learning (RL) has demonstrated strong potential in training large language models (LLMs) capable of complex reasoning for real-world problem solving. More recently, RL has been leveraged to create sophisticated LLM-based…

Computation and Language · Computer Science 2025-05-22 Bowen Jin , Jinsung Yoon , Priyanka Kargupta , Sercan O. Arik , Jiawei Han

Reshaping Reasoning in LLMs: A Theoretical Analysis of RL Training Dynamics through Pattern Selection

While reinforcement learning (RL) demonstrated remarkable success in enhancing the reasoning capabilities of language models, the training dynamics of RL in LLMs remain unclear. In this work, we provide an explanation of the RL training…

Machine Learning · Computer Science 2025-09-30 Xingwu Chen , Tianle Li , Difan Zou