Constrained Thompson Sampling for Wireless Link Optimization

Machine Learning 2020-04-21 v2 Networking and Internet Architecture Machine Learning

Abstract

Wireless communication systems operate in complex time-varying environments. Therefore, selecting the optimal configuration parameters in these systems is a challenging problem. For wireless links, \emph{rate selection} is used to select the optimal data transmission rate that maximizes the link throughput subject to an application-defined latency constraint. We model rate selection as a stochastic multi-armed bandit (MAB) problem, where each of a finite set of transmission rates is modeled as an independent bandit arm. For this setup, we propose Con-TS, a novel constrained version of the Thompson sampling algorithm, where the latency requirement is modeled by a high-probability linear constraint. We show that for Con-TS, the expected number of constraint violations over T transmission intervals is upper bounded by O(\sqrt{KT}), where K is the number of available rates. Further, the expected loss in cumulative throughput compared to the optimal rate selection scheme (i.e., the regret) is also upper bounded by O(\sqrt{KT \log K}). Through numerical simulations, we demonstrate that Con-TS significantly outperforms state-of-the-art bandit schemes for rate selection.
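
To make the bandit framing concrete, the following is a minimal sketch of Thompson sampling with a reliability constraint. It is not the Con-TS algorithm itself, whose details the abstract does not give. As an assumption, it replaces the high-probability linear constraint with a simple filter: only rates whose sampled success probability meets a threshold are eligible. The names `select_rate`, `simulate`, `min_success_prob`, the Beta-Bernoulli packet-success model, and the numeric values in the usage note are all hypothetical.

```python
import numpy as np


def select_rate(successes, failures, rates, min_success_prob, rng):
    """Pick a rate index by Thompson sampling with a sampled-feasibility filter.

    Assumption: each rate's packet success is Bernoulli with a Beta(1, 1) prior.
    """
    # Draw one plausible success probability per rate from its Beta posterior.
    theta = rng.beta(successes + 1.0, failures + 1.0)
    feasible = theta >= min_success_prob
    if not feasible.any():
        # No rate looks feasible under this draw: fall back to the most reliable.
        return int(np.argmax(theta))
    # Among feasible rates, maximize the sampled expected throughput.
    throughput = np.where(feasible, rates * theta, -np.inf)
    return int(np.argmax(throughput))


def simulate(true_success_prob, rates, min_success_prob, horizon, seed=0):
    """Run the sketch against fixed (hypothetical) per-rate success probabilities."""
    rng = np.random.default_rng(seed)
    k = len(rates)
    successes = np.zeros(k)
    failures = np.zeros(k)
    pulls = np.zeros(k, dtype=int)
    for _ in range(horizon):
        arm = select_rate(successes, failures, rates, min_success_prob, rng)
        ok = rng.random() < true_success_prob[arm]
        successes[arm] += ok
        failures[arm] += not ok
        pulls[arm] += 1
    return pulls
```

For example, `simulate(np.array([0.95, 0.8, 0.3]), np.array([1.0, 2.0, 4.0]), 0.7, 2000)` returns the number of times each of three hypothetical rates was selected. In this toy setting the middle rate is the best one that satisfies the threshold, so one would expect it to receive most of the selections as the posteriors concentrate.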

Cite

@article{arxiv.1902.11102,
  title  = {Constrained Thompson Sampling for Wireless Link Optimization},
  author = {Vidit Saxena and Joseph E. Gonzalez and Ion Stoica and Hugo Tullberg and Joakim Jaldén},
  journal= {arXiv preprint arXiv:1902.11102},
  year   = {2020}
}

Comments

11 pages, 2 figures. Revised version containing theoretical performance bounds
