Zero-Order Sharpness-Aware Minimization

Statistics Theory 2025-12-30 v2

Abstract

Prompt learning has become a key method for adapting large language models to specific tasks with limited data. However, traditional gradient-based optimization methods for tuning prompts are computationally intensive, posing challenges for efficiency. We introduce ZOSA (Zero-Order Sharpness-Aware Minimization), a novel optimization framework that integrates zero-order optimization with sharpness-aware minimization to enhance prompt tuning. ZOSA employs Rademacher perturbation vectors to estimate gradients without requiring backpropagation. By incorporating sharpness-aware principles, it targets flat minima in the loss landscape, improving generalization. An adaptive learning rate, guided by loss variability, further ensures stable convergence. Experiments on few-shot learning tasks, such as text classification and natural language inference, show that ZOSA significantly outperforms existing methods. With its theoretical foundation and computational efficiency, ZOSA offers a practical solution for prompt-based learning in resource-limited settings.
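
The core idea in the abstract, estimating a gradient from loss evaluations along a Rademacher direction and then taking a sharpness-aware step, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`zo_gradient`, `zo_sam_step`), the two-point central-difference estimator, the toy quadratic loss, and all hyperparameter values are assumptions chosen for illustration. The adaptive learning rate mentioned above is omitted, because the abstract does not specify its rule; a fixed step size is used instead.

```python
import math
import random


def rademacher(dim, rng):
    """Draw a vector with independent entries in {-1, +1}."""
    return [rng.choice((-1.0, 1.0)) for _ in range(dim)]


def zo_gradient(loss, x, mu, rng):
    """Two-point zero-order gradient estimate along one Rademacher direction.

    Only forward evaluations of `loss` are used; no backpropagation.
    (Assumed estimator form, not taken from the paper.)
    """
    u = rademacher(len(x), rng)
    plus = loss([xi + mu * ui for xi, ui in zip(x, u)])
    minus = loss([xi - mu * ui for xi, ui in zip(x, u)])
    scale = (plus - minus) / (2.0 * mu)
    return [scale * ui for ui in u]


def zo_sam_step(loss, x, lr, rho, mu, rng):
    """One sharpness-aware step built from zero-order estimates.

    1. Estimate the gradient at x.
    2. Move to a perturbed point at distance rho along that estimate.
    3. Estimate the gradient at the perturbed point and use it to update x.
    """
    g = zo_gradient(loss, x, mu, rng)
    norm = math.sqrt(sum(gi * gi for gi in g))
    if norm > 0.0:
        x_adv = [xi + rho * gi / norm for xi, gi in zip(x, g)]
    else:
        x_adv = list(x)  # zero estimate: skip the ascent step
    g_adv = zo_gradient(loss, x_adv, mu, rng)
    return [xi - lr * gi for xi, gi in zip(x, g_adv)]


def quadratic(x):
    """Toy loss standing in for a prompt-tuning objective (hypothetical)."""
    return sum(xi * xi for xi in x)


rng = random.Random(0)
x = [1.0, -2.0, 0.5]
initial = quadratic(x)
for _ in range(200):
    x = zo_sam_step(quadratic, x, lr=0.05, rho=0.05, mu=1e-3, rng=rng)
final = quadratic(x)
print(f"loss: {initial:.4f} -> {final:.6f}")
```

Each step in this sketch costs four loss evaluations and no backward pass, which is the source of the efficiency argument for zero-order methods. Because the update uses the gradient estimated at the perturbed point, the iterate settles in a small neighborhood of the minimizer whose size depends on `rho`, rather than converging to it exactly.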

Cite

@article{arxiv.2511.09156,
  title  = {Zero-Order Sharpness-Aware Minimization},
  author = {Yao Fu and Yihang Jin and Chunxia Zhang and Junmin Liu and Guang Dai and Haishan Ye},
  journal= {arXiv preprint arXiv:2511.09156},
  year   = {2025}
}