Optimization and Control
We study unconstrained optimization problems of nonsmooth, nonconvex Lipschitz functions, using only noisy pairwise comparisons governed by a known link function. Our goal is to compute a $(\delta,\varepsilon)$-Goldstein stationary point.…
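For reference, the stationarity notion named in this abstract is usually defined as below. This is the standard textbook form, not a quotation from the truncated abstract; the symbols $\partial f$ (Clarke subdifferential), $\partial_\delta f$, and $B_\delta(x)$ (closed ball of radius $\delta$ around $x$) are assumed notation.

```latex
\partial_\delta f(x) := \operatorname{conv}\Big( \bigcup_{y \in B_\delta(x)} \partial f(y) \Big),
\qquad
x \text{ is } (\delta,\varepsilon)\text{-Goldstein stationary if }
\operatorname{dist}\big(0,\ \partial_\delta f(x)\big) \le \varepsilon .
```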
Policy-gradient methods are widely used in reinforcement learning, yet training often becomes unstable or slows down as learning progresses. We study this phenomenon through the noise-to-signal ratio (NSR) of a policy-gradient estimator,…
This work proposes A$^2$GD, a novel adaptive accelerated gradient descent method for convex and composite optimization. Smoothness and convexity constants are updated via Lyapunov analysis. Inspired by stability analysis in ODE solvers, the…
In convex geometry, the Shapley-Folkman Lemma asserts that the nonconvexity of a Minkowski sum of $n$-dimensional bounded nonconvex sets does not accumulate once the number of summands exceeds the dimension $n$, and thus the sum becomes…
A recent breakthrough in nonconvex optimization is the online-to-nonconvex conversion framework of [Cutkosky et al., 2023], which reformulates the task of finding an $\varepsilon$-first-order stationary point as an online learning problem.…
This paper develops a novel COllaborative-Online-Learning (COOL)-enabled motion control framework for multi-robot systems to avoid collision amid randomly moving obstacles whose motion distributions are partially observable through…
The rise of parallel computing hardware has made it increasingly important to understand which nonlinear state space models can be efficiently parallelized. Recent advances like DEER (arXiv:2309.12252) and DeepPCR (arXiv:2309.16318) recast…
We present a graphon mean-field logit dynamic, a stationary mean-field game based on logit interactions. This dynamic emerges from a stochastic control problem involving a continuum of nonexchangeable and interacting agents and reduces to…
Sample average approximation--based stochastic dynamic programming (SDP) and model predictive control (MPC) are two different methods for approaching multistage stochastic optimization. In this paper we investigate the conditions under…
Circumcentered techniques have been shown to significantly accelerate projection-based methods for convex feasibility problems. Motivated by this success, we propose two direct methods with circumcenter acceleration for solving variational…
We propose a class of discrete state sampling algorithms based on Nesterov's accelerated gradient method, which extends the classical Metropolis-Hastings (MH) algorithm. The evolution of the discrete states probability distribution governed…
This paper presents a novel approach for optimizing the scheduling and control of Pan-Tilt-Zoom (PTZ) cameras in dynamic surveillance environments. The proposed method integrates Kalman filters for motion prediction with a dynamic network…
While there is an extensive body of research on the analysis of Value Iteration (VI) for discounted cumulative-reward MDPs, prior work on analyzing VI for (undiscounted) average-reward MDPs has been limited, and most prior results focus on…
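As background for this abstract, here is a minimal sketch of classical relative value iteration for an average-reward MDP. It is not the analysis or algorithm of the paper itself. It assumes a finite unichain, aperiodic MDP; the function name, the array layout (`P` of shape `(A, S, S)`, `r` of shape `(S, A)`), and the choice of reference state are illustrative assumptions.

```python
import numpy as np


def relative_value_iteration(P, r, ref_state=0, tol=1e-8, max_iter=10_000):
    """Classical relative value iteration (illustrative sketch).

    P: array of shape (A, S, S), P[a, s, t] = Pr(next state t | state s, action a).
    r: array of shape (S, A), one-step rewards.
    Returns (gain estimate, bias vector normalised so that h[ref_state] = 0).
    Assumes a unichain, aperiodic MDP so that the iteration converges.
    """
    n_states = r.shape[0]
    h = np.zeros(n_states)
    gain = 0.0
    for _ in range(max_iter):
        # q[s, a] = r[s, a] + sum_t P[a, s, t] * h[t]  (no discount factor)
        q = r + np.einsum("ast,t->sa", P, h)
        th = q.max(axis=1)           # undiscounted Bellman operator applied to h
        gain = th[ref_state]         # current estimate of the optimal average reward
        h_new = th - gain            # subtract the reference value to keep h bounded
        if np.max(np.abs(h_new - h)) < tol:
            h = h_new
            break
        h = h_new
    return gain, h
```

Subtracting the value at a reference state is what distinguishes this from discounted VI: without it the iterates grow linearly with the iteration count, at a rate equal to the average reward.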
Two nested classes of discrete-time linear time-invariant systems, which differ by the set of periodic signals that they leave invariant, are studied. The first class preserves the property of periodic monotonicity (period-wise…
This paper addresses the exact controllability of trajectories in the one-dimensional Fisher-Stefan problem, a reaction-diffusion equation that models the spatial propagation of biological, chemical, or physical populations within a…
Given a max-plus linear system and a semimodule, the problem of computing the maximal controlled invariant subsemimodule is still open to this day. In this paper, we consider this problem for the specific class of fully actuated systems and…
Handling model mismatch is a common challenge in model predictive control (MPC). While robust MPC is effective, its conservatism often makes it less desirable. Certainty-equivalence MPC (CE-MPC), which uses a nominal model, offers an…
The property of linear discrete-time time-invariant system operators mapping inputs with at most $k-1$ sign changes to outputs with at most $k-1$ sign changes is investigated. We show that this property is tractable via the notion of…
This paper considers the robust phase retrieval problem, which can be cast as a nonsmooth and nonconvex composite optimization problem. We propose two first-order algorithms with adaptive step sizes: the subgradient algorithm (AdaSubGrad) and the…
We propose a robust Q-learning algorithm for Markov decision processes under model uncertainty when each state-action pair is associated with a finite ambiguity set of candidate transition kernels. This finite-measure framework enables…