Related papers: Dynamic Optimization and Learning for Renewal Syst…
This paper considers optimization over multiple renewal systems coupled by time average constraints. These systems act asynchronously over variable length frames. For each system, at the beginning of each renewal frame, it chooses an action…
This paper considers an opportunistic scheduling problem over a renewal system. A controller observes a random event at the beginning of each renewal frame and then chooses an action in response to the event, which affects the duration of…
A renewal system divides the slotted timeline into back to back time periods called renewal frames. At the beginning of each frame, it chooses a policy from a set of options for that frame. The policy determines the duration of the frame,…
This paper considers online optimization for a system that performs a sequence of back-to-back tasks. Each task can be processed in one of multiple processing modes that affect the duration of the task, the reward earned, and an additional…
This paper considers time-average optimization, where a decision vector is chosen every time step within a (possibly non-convex) set, and the goal is to minimize a convex function of the time averages subject to convex constraints on these…
We develop an optimization framework centered around a core idea: once a (parametric) policy is specified, control authority is transferred to the policy, resulting in an autonomous dynamical system. Thus we should be able to optimize…
This paper considers online optimization of a renewal-reward system. A controller performs a sequence of tasks back-to-back. Each task has a random vector of parameters, called the task type vector, that affects the task processing options…
Evolutionary algorithms, such as Differential Evolution, excel in solving real-parameter optimization challenges. However, the effectiveness of a single algorithm varies across different problem instances, necessitating considerable efforts…
Optimization plays a central role in intelligent systems and cyber-physical technologies, where speed and reliability of convergence directly impact performance. In control theory, optimization-centric methods are standard: controllers are…
Finding optimal policies which maximize long term rewards of Markov Decision Processes requires the use of dynamic programming and backward induction to solve the Bellman optimality equation. However, many real-world problems require…
In this paper, a distributed optimization problem with general differentiable convex objective functions is studied for single-integrator and double-integrator multi-agent systems. Two distributed adaptive optimization algorithm is…
Determining the most appropriate features for machine learning predictive models is challenging regarding performance and feature acquisition costs. In particular, global feature choice is limited given that some features will only benefit…
This paper considers the problem of designing a dynamical system to solve constrained optimization problems in a distributed way and in an anytime fashion (i.e., such that the feasible set is forward invariant). For problems with separable…
Constructing an accurate system model for formal model verification can be both resource demanding and time-consuming. To alleviate this shortcoming, algorithms have been proposed for automatically learning system models based on observed…
We study continuous action reinforcement learning problems in which it is crucial that the agent interacts with the environment only through safe policies, i.e.,~policies that do not take the agent to undesirable situations. We formulate…
We consider the classical problem of sequential resource allocation where a decision maker must repeatedly divide a budget between several resources, each with diminishing returns. This can be recast as a specific stochastic optimization…
In many engineered systems, optimization is used for decision making at time-scales ranging from real-time operation to long-term planning. This process often involves solving similar optimization problems over and over again with slightly…
We consider the problem of dynamic buying and selling of shares from a collection of $N$ stocks with random price fluctuations. To limit investment risk, we place an upper bound on the total number of shares kept at any time. Assuming that…
In current model-free reinforcement learning (RL) algorithms, stability criteria based on sampling methods are commonly utilized to guide policy optimization. However, these criteria only guarantee the infinite-time convergence of the…
This paper considers time-average stochastic optimization, where a time average decision vector, an average of decision vectors chosen in every time step from a time-varying (possibly non-convex) set, minimizes a convex objective function…