Related papers: Learning Algorithms for Minimizing Queue Length Re…

Minimizing Queue Length Regret for Arbitrarily Varying Channels

We consider an online channel scheduling problem for a single transmitter-receiver pair equipped with $N$ arbitrarily varying wireless channels. The transmission rates of the channels might be non-stationary and could be controlled by an…

Information Theory · Computer Science 2025-01-24 G Krishnakumar, Abhishek Sinha

Learning Unknown Service Rates in Queues: A Multi-Armed Bandit Approach

Consider a queueing system consisting of multiple servers. Jobs arrive over time and enter a queue for service; the goal is to minimize the size of this queue. At each opportunity for service, at most one server can be chosen, and at most…

Systems and Control · Computer Science 2019-11-25 Subhashini Krishnasamy, Rajat Sen, Ramesh Johari, Sanjay Shakkottai

Queue Length Regret Bounds for Contextual Queueing Bandits

We introduce contextual queueing bandits, a new context-aware framework for scheduling while simultaneously learning unknown service rates. Individual jobs carry heterogeneous contextual features, based on which the agent chooses a job and…

Machine Learning · Computer Science 2026-05-19 Seoungbin Bae, Garyeong Kang, Dabeen Lee

An Online Approach to Dynamic Channel Access and Transmission Scheduling

Making judicious channel access and transmission scheduling decisions is essential for improving performance as well as energy and spectral efficiency in multichannel wireless systems. This problem has been a subject of extensive study in…

Machine Learning · Computer Science 2015-04-07 Yang Liu, Mingyan Liu

Aging Bandits: Regret Analysis and Order-Optimal Learning Algorithm for Wireless Networks with Stochastic Arrivals

We consider a single-hop wireless network with sources transmitting time-sensitive information to the destination over multiple unreliable channels. Packets from each source are generated according to a stochastic process with known…

Systems and Control · Electrical Eng. & Systems 2020-12-23 Eray Unsal Atay, Igor Kadota, Eytan Modiano

Individual Regret in Cooperative Nonstochastic Multi-Armed Bandits

We study agents communicating over an underlying network by exchanging messages, in order to optimize their individual regret in a common nonstochastic multi-armed bandit problem. We derive regret minimization algorithms that guarantee for…

Machine Learning · Computer Science 2019-11-19 Yogev Bar-On, Yishay Mansour

Drift Plus Optimistic Penalty: A Learning Framework for Stochastic Network Optimization with Improved Regret Bounds

We consider the problem of joint routing and scheduling in queueing networks, where the edge transmission costs are unknown. At each time-slot, the network controller receives noisy observations of transmission costs only for those edges it…

Networking and Internet Architecture · Computer Science 2025-11-05 Sathwik Chadaga, Eytan Modiano

Learning payoffs while routing in skill-based queues

Motivated by applications in service systems, we consider queueing systems where each customer must be handled by a server with the right skill set. We focus on optimizing the routing of customers to servers in order to maximize the total…

Machine Learning · Computer Science 2024-12-16 Sanne van Kempen, Jaron Sanders, Fiona Sloothaak, Maarten G. Wolf

Distributed Algorithms for Learning and Cognitive Medium Access with Logarithmic Regret

The problem of distributed learning and channel access is considered in a cognitive network with multiple secondary users. The availability statistics of the channels are initially unknown to the secondary users and are estimated using…

Networking and Internet Architecture · Computer Science 2016-11-17 Animashree Anandkumar, Nithin Michael, Ao Kevin Tang, Ananthram Swami

Distributed Bandit Learning: Near-Optimal Regret with Efficient Communication

We study the problem of regret minimization for distributed bandit learning, in which $M$ agents work collaboratively to minimize their total regret under the coordination of a central server. Our goal is to design communication protocols…

Machine Learning · Computer Science 2019-05-30 Yuanhao Wang, Jiachen Hu, Xiaoyu Chen, Liwei Wang

Regret-Minimization Algorithms for Multi-Agent Cooperative Learning Systems

A Multi-Agent Cooperative Learning (MACL) system is an artificial intelligence (AI) system where multiple learning agents work together to complete a common task. Recent empirical success of MACL systems in various domains (e.g. traffic…

Machine Learning · Computer Science 2023-10-31 Jialin Yi

Decentralized Learning in Online Queuing Systems

Motivated by packet routing in computer networks, online queuing systems are composed of queues receiving packets at different rates. Repeatedly, they send packets to servers, each of which handles at most one packet at a time. In the…

Machine Learning · Statistics 2021-11-05 Flore Sentenac, Etienne Boursier, Vianney Perchet

A Sharp Memory-Regret Trade-Off for Multi-Pass Streaming Bandits

The stochastic $K$-armed bandit problem has been studied extensively due to its applications in various domains ranging from online advertising to clinical trials. In practice, however, the number of arms can be very large, resulting in large…

Machine Learning · Computer Science 2022-05-03 Arpit Agarwal, Sanjeev Khanna, Prathamesh Patil

Learning to Cache and Caching to Learn: Regret Analysis of Caching Algorithms

Crucial performance metrics of a caching algorithm include its ability to quickly and accurately learn a popularity distribution of requests. However, a majority of work on analytical performance analysis focuses on hit probability after an…

Networking and Internet Architecture · Computer Science 2020-04-02 Archana Bura, Desik Rengarajan, Dileep Kalathil, Srinivas Shakkottai, Jean-Francois Chamberland-Tremblay

Sequential Transfer in Multi-armed Bandit with Finite Set of Models

Learning from prior tasks and transferring that experience to improve future performance is critical for building lifelong learning agents. Although results in supervised and reinforcement learning show that transfer may significantly…

Machine Learning · Statistics 2013-07-29 Mohammad Gheshlaghi Azar, Alessandro Lazaric, Emma Brunskill

Learning-based Optimal Admission Control in a Single Server Queuing System

We consider a long-term average profit maximizing admission control problem in an M/M/1 queuing system with unknown service and arrival rates. With a fixed reward collected upon service completion and a cost per unit of time enforced on…

Optimization and Control · Mathematics 2023-11-27 Asaf Cohen, Vijay G. Subramanian, Yili Zhang

Regret Bounds for Batched Bandits

We present simple and efficient algorithms for the batched stochastic multi-armed bandit and batched stochastic linear bandit problems. We prove bounds for their expected regrets that improve over the best-known regret bounds for any number…

Data Structures and Algorithms · Computer Science 2020-02-19 Hossein Esfandiari, Amin Karbasi, Abbas Mehrabian, Vahab Mirrokni

Distributed Linear Bandits under Communication Constraints

We consider distributed linear bandits where $M$ agents learn collaboratively to minimize the overall cumulative regret incurred by all agents. Information exchange is facilitated by a central server, and both the uplink and downlink…

Machine Learning · Computer Science 2025-11-17 Sudeep Salgia, Qing Zhao

Regret Balancing for Bandit and RL Model Selection

We consider model selection in stochastic bandit and reinforcement learning problems. Given a set of base learning algorithms, an effective model selection strategy adapts to the best learning algorithm in an online fashion. We show that by…

Machine Learning · Computer Science 2020-06-11 Yasin Abbasi-Yadkori, Aldo Pacchiano, My Phan

Near-Optimal Regret-Queue Length Tradeoff in Online Learning for Two-Sided Markets

We study a two-sided market wherein price-sensitive heterogeneous customers and servers arrive and join their respective queues. A compatible customer-server pair can then be matched by the platform, at which point they leave the system.…

Machine Learning · Computer Science 2025-10-17 Zixian Yang, Sushil Mahavir Varma, Lei Ying