Related papers: DYNAMIX: RL-based Adaptive Batch Size Optimization…

200 papers

In distributed optimization, the practical problem-solving performance is essentially sensitive to algorithm selection, parameter setting, problem type and data pattern. Thus, it is often laborious to acquire a highly efficient method for a…

Optimization and Control · Mathematics 2024-01-04 Daokuan Zhu, Tianqi Xu, Jie Lu

Dynamic distribution network reconfiguration (DNR) algorithms perform hourly status changes of remotely controllable switches to improve distribution system performance. The problem is typically solved by physical model-based control…

Systems and Control · Electrical Eng. & Systems 2020-06-24 Yuanqi Gao, Wei Wang, Jie Shi, Nanpeng Yu

In many real-world scenarios, Reinforcement Learning (RL) algorithms are trained on data with dynamics shift, i.e., with different underlying environment dynamics. A majority of current methods address such issue by training context…

Machine Learning · Computer Science 2024-02-23 Zhenghai Xue, Qingpeng Cai, Shuchang Liu, Dong Zheng, Peng Jiang, Kun Gai, Bo An

Reinforcement learning (RL) struggles to scale to large, combinatorial action spaces common in many real-world problems. This paper introduces a novel framework for training discrete diffusion models as highly effective policies in these…

Machine Learning · Computer Science 2026-05-21 Haitong Ma, Ofir Nabati, Aviv Rosenberg, Bo Dai, Oran Lang, Craig Boutilier, Na Li, Shie Mannor, Lior Shani, Guy Tenneholtz

Decision-making under distribution shift is a central challenge in reinforcement learning (RL), where training and deployment environments differ. We study this problem through the lens of robust Markov decision processes (RMDPs), which…

Machine Learning · Computer Science 2025-10-17 Jingwen Gu, Yiting He, Zhishuai Liu, Pan Xu

Instability and slowness are two main problems in deep reinforcement learning. Even if proximal policy optimization (PPO) is the state of the art, it still suffers from these two problems. We introduce an improved algorithm based on…

Machine Learning · Computer Science 2019-10-01 Zhenyu Zhang, Xiangfeng Luo, Tong Liu, Shaorong Xie, Jianshu Wang, Wei Wang, Yang Li, Yan Peng

This work studies reinforcement learning (RL) in the context of multi-period supply chains subject to constraints, e.g., on production and inventory. We introduce Distributional Constrained Policy Optimization (DCPO), a novel approach for…

Machine Learning · Computer Science 2023-02-06 Jaime Sabal Bermúdez, Antonio del Rio Chanona, Calvin Tsay

Multi-dimensional data streams, prevalent in applications like IoT, financial markets, and real-time analytics, pose significant challenges due to their high velocity, unbounded nature, and complex inter-dimensional dependencies. Sliding…

Machine Learning · Computer Science 2025-07-10 Abolfazl Zarghani, Sadegh Abedi

In dynamic programming (DP) and reinforcement learning (RL), an agent learns to act optimally in terms of expected long-term return by sequentially interacting with its environment modeled by a Markov decision process (MDP). More generally…

Machine Learning · Computer Science 2022-01-03 Mastane Achab, Gergely Neu

Reinforcement learning (RL) is currently one of the most prominent methods for optimizing dynamical systems, with breakthrough results across various fields. The framework is based on the concept of a Markov decision process (MDP), leading…

Optimization and Control · Mathematics 2025-11-17 Rene Carmona, Mathieu Lauriere

Increasingly large imitation learning datasets are being collected with the goal of training foundation models for robotics. However, despite the fact that data selection has been of utmost importance in vision and natural language…

Robotics · Computer Science 2025-02-24 Joey Hejna, Chethan Bhateja, Yichen Jiang, Karl Pertsch, Dorsa Sadigh

This paper presents Post-Decision Proximal Policy Optimization (PDPPO), a novel variation of the leading deep reinforcement learning method, Proximal Policy Optimization (PPO). The PDPPO state transition process is divided into two steps: a…

Distributed training in deep learning (DL) is common practice as data and models grow. The current practice for distributed training of deep neural networks faces the challenges of communication bottlenecks when operating at scale, and…

Machine Learning · Computer Science 2020-12-21 Shubhankar Gahlot, Junqi Yin, Mallikarjun Shankar

The exponential growth of data-intensive applications has placed unprecedented demands on modern storage systems, necessitating dynamic and efficient optimization strategies. Traditional heuristics employed for storage performance…

Operating Systems · Computer Science 2025-08-25 Chiyu Cheng, Chang Zhou, Yang Zhao

Preference-based reinforcement learning (RL) is a key paradigm for aligning policies with human judgments, yet its theoretical behavior in distributed settings where preference data are fragmented across heterogeneous users remains poorly…

Machine Learning · Computer Science 2026-05-21 Zhanhong Jiang

Federated Learning (FL) is a distributed learning paradigm that can coordinate heterogeneous edge devices to perform model training without sharing private data. While prior works have focused on analyzing FL convergence with respect to…

Machine Learning · Computer Science 2025-09-09 Weijie Liu, Xiaoxi Zhang, Jingpu Duan, Carlee Joe-Wong, Zhi Zhou, Xu Chen

Synchronous strategies with data parallelism, such as the Synchronous Stochastic Gradient Descent (S-SGD) and the model averaging methods, are widely utilized in distributed training of Deep Neural Networks (DNNs), largely owing to its easy…

Machine Learning · Computer Science 2022-11-04 Qing Ye, Yuhao Zhou, Mingjia Shi, Yanan Sun, Jiancheng Lv

Policy-based algorithms are among the most widely adopted techniques in model-free RL, thanks to their strong theoretical groundings and good properties in continuous action spaces. Unfortunately, these methods require precise and…

Machine Learning · Computer Science 2023-06-14 Luca Sabbioni, Francesco Corda, Marcello Restelli

Dynamic decisions are pivotal to economic policy making. We show how existing evidence from randomized control trials can be utilized to guide personalized decisions in challenging dynamic environments with budget and capacity constraints.…

Econometrics · Economics 2024-11-26 Karun Adusumilli, Friedrich Geiecke, Claudio Schilter

Distributionally robust offline reinforcement learning (RL), which seeks robust policy training against environment perturbation by modeling dynamics uncertainty, calls for function approximations when facing large state-action spaces.…

Machine Learning · Computer Science 2025-11-03 Zhishuai Liu, Pan Xu