Ruichen Jiang — Scifaro

Adaptive Matrix Online Learning through Smoothing with Guarantees for Nonsmooth Nonconvex Optimization

We study online linear optimization with matrix variables constrained by the operator norm, a setting where the geometry renders designing data-dependent and efficient adaptive algorithms challenging. The best-known adaptive regret bounds…

Optimization and Control · Mathematics 2026-02-10 Ruichen Jiang, Zakaria Mhammedi, Mehryar Mohri, Aryan Mokhtari

Improving Online-to-Nonconvex Conversion for Smooth Optimization via Double Optimism

A recent breakthrough in nonconvex optimization is the online-to-nonconvex conversion framework of [Cutkosky et al., 2023], which reformulates the task of finding an $\varepsilon$-first-order stationary point as an online learning problem.…

Optimization and Control · Mathematics 2026-02-10 Francisco Patitucci, Ruichen Jiang, Aryan Mokhtari

On the Complexity of Finding Stationary Points in Nonconvex Simple Bilevel Optimization

In this paper, we study the problem of solving a simple bilevel optimization problem, where the upper-level objective is minimized over the solution set of the lower-level problem. We focus on the general setting in which both the upper-…

Optimization and Control · Mathematics 2025-08-01 Jincheng Cao, Ruichen Jiang, Erfan Yazdandoost Hamedani, Aryan Mokhtari

Non-asymptotic Global Convergence Rates of BFGS with Exact Line Search

In this paper, we explore the non-asymptotic global convergence rates of the Broyden-Fletcher-Goldfarb-Shanno (BFGS) method implemented with exact line search. Notably, due to Dixon's equivalence result, our findings are also applicable to…

Optimization and Control · Mathematics 2025-07-16 Qiujiang Jin, Ruichen Jiang, Aryan Mokhtari

Online Learning-guided Learning Rate Adaptation via Gradient Alignment

The performance of an optimizer on large-scale deep learning models depends critically on fine-tuning the learning rate, often requiring an extensive grid search over base learning rates, schedules, and other hyperparameters. In this paper,…

Machine Learning · Computer Science 2025-06-11 Ruichen Jiang, Ali Kavis, Aryan Mokhtari

Provable Complexity Improvement of AdaGrad over SGD: Upper and Lower Bounds in Stochastic Non-Convex Optimization

Adaptive gradient methods, such as AdaGrad, are among the most successful optimization algorithms for neural network training. While these methods are known to achieve better dimensional dependence than stochastic gradient descent (SGD) for…

Optimization and Control · Mathematics 2025-06-09 Ruichen Jiang, Devyani Maladkar, Aryan Mokhtari

Non-asymptotic Global Convergence Analysis of BFGS with the Armijo-Wolfe Line Search

In this paper, we present the first explicit and non-asymptotic global convergence rates of the BFGS method when implemented with an inexact line search scheme satisfying the Armijo-Wolfe conditions. We show that BFGS achieves a global…

Optimization and Control · Mathematics 2025-01-09 Qiujiang Jin, Ruichen Jiang, Aryan Mokhtari

Improved Complexity for Smooth Nonconvex Optimization: A Two-Level Online Learning Approach with Quasi-Newton Methods

We study the problem of finding an $\epsilon$-first-order stationary point (FOSP) of a smooth function, given access only to gradient information. The best-known gradient query complexity for this task, assuming both the gradient and…

Optimization and Control · Mathematics 2024-12-04 Ruichen Jiang, Aryan Mokhtari, Francisco Patitucci

Adaptive and Optimal Second-order Optimistic Methods for Minimax Optimization

We propose adaptive, line search-free second-order methods with optimal rate of convergence for solving convex-concave min-max problems. By means of an adaptive step size, our algorithms feature a simple update rule that requires solving…

Optimization and Control · Mathematics 2024-11-12 Ruichen Jiang, Ali Kavis, Qiujiang Jin, Sujay Sanghavi, Aryan Mokhtari

Stochastic Newton Proximal Extragradient Method

Stochastic second-order methods achieve fast local convergence in strongly convex optimization by using noisy Hessian estimates to precondition the gradient. However, these methods typically reach superlinear convergence only when the…

Optimization and Control · Mathematics 2024-11-12 Ruichen Jiang, Michał Dereziński, Aryan Mokhtari

Online Learning Guided Quasi-Newton Methods with Global Non-Asymptotic Convergence

In this paper, we propose a quasi-Newton method for solving smooth and monotone nonlinear equations, including unconstrained minimization and minimax optimization as special cases. For the strongly monotone setting, we establish two global…

Optimization and Control · Mathematics 2024-10-04 Ruichen Jiang, Aryan Mokhtari

An Accelerated Gradient Method for Convex Smooth Simple Bilevel Optimization

In this paper, we focus on simple bilevel optimization problems, where we minimize a convex smooth objective function over the optimal solution set of another convex smooth constrained optimization problem. We present a novel bilevel…

Optimization and Control · Mathematics 2024-06-03 Jincheng Cao, Ruichen Jiang, Erfan Yazdandoost Hamedani, Aryan Mokhtari

An Inexact Conditional Gradient Method for Constrained Bilevel Optimization

Bilevel optimization is an important class of optimization problems where one optimization problem is nested within another. While various methods have emerged to address unconstrained general bilevel optimization problems, there has been a…

Optimization and Control · Mathematics 2024-03-15 Nazanin Abolfazli, Ruichen Jiang, Aryan Mokhtari, Erfan Yazdandoost Hamedani

Generalized Optimistic Methods for Convex-Concave Saddle Point Problems

The optimistic gradient method has seen increasing popularity for solving convex-concave saddle point problems. To analyze its iteration complexity, a recent work [arXiv:1906.01115] proposed an interesting perspective that interprets this…

Optimization and Control · Mathematics 2024-01-11 Ruichen Jiang, Aryan Mokhtari

Krylov Cubic Regularized Newton: A Subspace Second-Order Method with Dimension-Free Convergence Rate

Second-order optimization methods, such as cubic regularized Newton methods, are known for their rapid convergence rates; nevertheless, they become impractical in high-dimensional problems due to their substantial memory requirements and…

Optimization and Control · Mathematics 2024-01-09 Ruichen Jiang, Parameswaran Raman, Shoham Sabach, Aryan Mokhtari, Mingyi Hong, Volkan Cevher

Projection-Free Methods for Stochastic Simple Bilevel Optimization with Convex Lower-level Problem

In this paper, we study a class of stochastic bilevel optimization problems, also known as stochastic simple bilevel optimization, where we minimize a smooth stochastic objective function over the optimal solution set of another stochastic…

Optimization and Control · Mathematics 2023-08-16 Jincheng Cao, Ruichen Jiang, Nazanin Abolfazli, Erfan Yazdandoost Hamedani, Aryan Mokhtari

Online Learning Guided Curvature Approximation: A Quasi-Newton Method with Global Non-Asymptotic Superlinear Convergence

Quasi-Newton algorithms are among the most popular iterative methods for solving unconstrained minimization problems, largely due to their favorable superlinear convergence property. However, existing results for these algorithms are…

Optimization and Control · Mathematics 2023-07-26 Ruichen Jiang, Qiujiang Jin, Aryan Mokhtari

Accelerated Quasi-Newton Proximal Extragradient: Faster Rate for Smooth Convex Optimization

In this paper, we propose an accelerated quasi-Newton proximal extragradient (A-QPNE) method for solving unconstrained smooth convex optimization problems. With access only to the gradients of the objective, we prove that our method can…

Optimization and Control · Mathematics 2023-06-06 Ruichen Jiang, Aryan Mokhtari

A Conditional Gradient-based Method for Simple Bilevel Optimization with Convex Lower-level Problem

In this paper, we study a class of bilevel optimization problems, also known as simple bilevel optimization, where we minimize a smooth objective function over the optimal solution set of another convex constrained optimization problem.…

Optimization and Control · Mathematics 2023-04-25 Ruichen Jiang, Nazanin Abolfazli, Aryan Mokhtari, Erfan Yazdandoost Hamedani

Future Gradient Descent for Adapting the Temporal Shifting Data Distribution in Online Recommendation Systems

One of the key challenges of learning an online recommendation model is the temporal domain shift, which causes a mismatch between the training and testing data distributions and hence domain generalization error. To overcome this, we propose…

Machine Learning · Computer Science 2022-09-05 Mao Ye, Ruichen Jiang, Haoxiang Wang, Dhruv Choudhary, Xiaocong Du, Bhargav Bhushanam, Aryan Mokhtari, Arun Kejariwal, Qiang Liu