Related papers: Deep Policy Iteration with Integer Programming for…

Adaptive Inventory Strategies using Deep Reinforcement Learning for Dynamic Agri-Food Supply Chains

Agricultural products are often subject to seasonal fluctuations in production and demand. Predicting and managing inventory levels in response to these variations can be challenging, leading to either excess inventory or stockouts.…

Artificial Intelligence · Computer Science · 2025-07-23 · Amandeep Kaur, Gyan Prakash

Program Machine Policy: Addressing Long-Horizon Tasks by Integrating Program Synthesis and State Machines

Deep reinforcement learning (deep RL) excels in various domains but lacks generalizability and interpretability. On the other hand, programmatic RL methods (Trivedi et al., 2021; Liu et al., 2023) reformulate RL tasks as synthesizing…

Machine Learning · Computer Science · 2024-02-12 · Yu-An Lin, Chen-Tao Lee, Guan-Ting Liu, Pu-Jen Cheng, Shao-Hua Sun

Structure-Informed Deep Reinforcement Learning for Inventory Management

This paper investigates the application of Deep Reinforcement Learning (DRL) to classical inventory management problems, with a focus on practical implementation considerations. We apply a DRL algorithm based on DirectBackprop to several…

Machine Learning · Computer Science · 2025-07-30 · Alvaro Maggiar, Sohrab Andaz, Akhil Bagaria, Carson Eisenach, Dean Foster, Omer Gottesman, Dominique Perrault-Joncas

Deep RL Dual Sourcing Inventory Management with Supply and Capacity Risk Awareness

In this work, we study how to efficiently apply reinforcement learning (RL) for solving large-scale stochastic optimization problems by leveraging intervention models. The key of the proposed methodology is to better explore the solution…

Machine Learning · Computer Science · 2026-01-13 · Defeng Liu, Ying Liu, Carson Eisenach

DeepStock: Reinforcement Learning with Policy Regularizations for Inventory Management

Deep Reinforcement Learning (DRL) provides a general-purpose methodology for training inventory policies that can leverage big data and compute. However, off-the-shelf implementations of DRL have seen mixed success, often plagued by high…

Machine Learning · Computer Science · 2026-03-23 · Yaqi Xie, Xinru Hao, Jiaxi Liu, Will Ma, Linwei Xin, Lei Cao, Yidong Zhang

Deep Reinforcement Learning for Inventory Networks: Toward Reliable Policy Optimization

We argue that inventory management presents unique opportunities for the reliable application of deep reinforcement learning (DRL). To enable this, we emphasize and test two complementary techniques. The first is Hindsight Differentiable…

Machine Learning · Computer Science · 2025-09-12 · Matias Alvo, Daniel Russo, Yash Kanoria, Minuk Lee

Learning Large Neighborhood Search Policy for Integer Programming

We propose a deep reinforcement learning (RL) method to learn large neighborhood search (LNS) policy for integer programming (IP). The RL policy is trained as the destroy operator to select a subset of variables at each step, which is…

Artificial Intelligence · Computer Science · 2021-11-08 · Yaoxin Wu, Wen Song, Zhiguang Cao, Jie Zhang

Predictable Reinforcement Learning Dynamics through Entropy Rate Minimization

In Reinforcement Learning (RL), agents have no incentive to exhibit predictable behaviors, and are often pushed (through e.g. policy entropy regularisation) to randomise their actions in favor of exploration. This often makes it challenging…

Machine Learning · Computer Science · 2025-06-04 · Daniel Jarne Ornia, Giannis Delimpaltadakis, Jens Kober, Javier Alonso-Mora

Deep Controlled Learning for Inventory Control

The application of Deep Reinforcement Learning (DRL) to inventory management is an emerging field. However, traditional DRL algorithms, originally developed for diverse domains such as game-playing and robotics, may not be well-suited for…

Machine Learning · Computer Science · 2025-06-04 · Tarkan Temizöz, Christina Imdahl, Remco Dijkman, Douniel Lamghari-Idrissi, Willem van Jaarsveld

Learning Vehicle Routing Problems using Policy Optimisation

Deep reinforcement learning (DRL) has been used to learn effective heuristics for solving complex combinatorial optimisation problems via policy networks and has demonstrated promising performance. Existing works have focused on solving…

Machine Learning · Computer Science · 2020-12-25 · Nasrin Sultana, Jeffrey Chan, A. K. Qin, Tabinda Sarwar

Iterative Multi-Agent Reinforcement Learning: A Novel Approach Toward Real-World Multi-Echelon Inventory Optimization

Multi-echelon inventory optimization (MEIO) is critical for effective supply chain management, but its inherent complexity can pose significant challenges. Heuristics are commonly used to address this complexity, yet they often face…

Machine Learning · Computer Science · 2025-03-25 · Georg Ziegner, Michael Choi, Hung Mac Chan Le, Sahil Sakhuja, Arash Sarmadi

A Rollout-Based Algorithm and Reward Function for Resource Allocation in Business Processes

Resource allocation plays a critical role in minimizing cycle time and improving the efficiency of business processes. Recently, Deep Reinforcement Learning (DRL) has emerged as a powerful technique to optimize resource allocation policies…

Machine Learning · Computer Science · 2025-09-03 · Jeroen Middelhuis, Zaharah Bukhsh, Ivo Adan, Remco Dijkman

Improving RL Exploration for LLM Reasoning through Retrospective Replay

Reinforcement learning (RL) has increasingly become a pivotal technique in the post-training of large language models (LLMs). The effective exploration of the output space is essential for the success of RL. We observe that for complex…

Machine Learning · Computer Science · 2025-07-08 · Shihan Dou, Muling Wu, Jingwen Xu, Rui Zheng, Tao Gui, Qi Zhang, Xuanjing Huang

Relative Entropy Regularized Policy Iteration

We present an off-policy actor-critic algorithm for Reinforcement Learning (RL) that combines ideas from gradient-free optimization via stochastic search with learned action-value function. The result is a simple procedure consisting of…

Machine Learning · Computer Science · 2018-12-07 · Abbas Abdolmaleki, Jost Tobias Springenberg, Jonas Degrave, Steven Bohez, Yuval Tassa, Dan Belov, Nicolas Heess, Martin Riedmiller

Deep Reinforcement Learning for Solving Management Problems: Towards A Large Management Mode

We introduce a deep reinforcement learning (DRL) approach for solving management problems including inventory management, dynamic pricing, and recommendation. This DRL approach has the potential to lead to a large management model based on…

Artificial Intelligence · Computer Science · 2024-03-04 · Jinyang Jiang, Xiaotian Liu, Tao Ren, Qinghao Wang, Yi Zheng, Yufu Du, Yijie Peng, Cheng Zhang

Classical and Deep Reinforcement Learning Inventory Control Policies for Pharmaceutical Supply Chains with Perishability and Non-Stationarity

We study inventory control policies for pharmaceutical supply chains, addressing challenges such as perishability, yield uncertainty, and non-stationary demand, combined with batching constraints, lead times, and lost sales. Collaborating…

Artificial Intelligence · Computer Science · 2025-01-22 · Francesco Stranieri, Chaaben Kouki, Willem van Jaarsveld, Fabio Stella

Hybrid Information-driven Multi-agent Reinforcement Learning

Information theoretic sensor management approaches are an ideal solution to state estimation problems when considering the optimal control of multi-agent systems, however they are too computationally intensive for large state spaces,…

Multiagent Systems · Computer Science · 2021-02-02 · William A. Dawson, Ruben Glatt, Edward Rusu, Braden C. Soper, Ryan A. Goldhahn

Programmatically Interpretable Reinforcement Learning

We present a reinforcement learning framework, called Programmatically Interpretable Reinforcement Learning (PIRL), that is designed to generate interpretable and verifiable agent policies. Unlike the popular Deep Reinforcement Learning…

Machine Learning · Computer Science · 2019-04-11 · Abhinav Verma, Vijayaraghavan Murali, Rishabh Singh, Pushmeet Kohli, Swarat Chaudhuri

Training and Evaluation of Deep Policies using Reinforcement Learning and Generative Models

We present a data-efficient framework for solving sequential decision-making problems which exploits the combination of reinforcement learning (RL) and latent variable generative models. The framework, called GenRL, trains deep policies by…

Machine Learning · Computer Science · 2022-04-20 · Ali Ghadirzadeh, Petra Poklukar, Karol Arndt, Chelsea Finn, Ville Kyrki, Danica Kragic, Mårten Björkman

PARL: A Unified Framework for Policy Alignment in Reinforcement Learning from Human Feedback

We present a novel unified bilevel optimization-based framework, PARL, formulated to address the recently highlighted critical issue of policy alignment in reinforcement learning using utility or preference-based feedback. We…

Machine Learning · Computer Science · 2024-05-02 · Souradip Chakraborty, Amrit Singh Bedi, Alec Koppel, Dinesh Manocha, Huazheng Wang, Mengdi Wang, Furong Huang