Related papers: Adaptive Policy Learning Under Unknown Network Int…

200 papers

Many interventions, such as vaccines in clinical trials or coupons in online marketplaces, must be assigned sequentially without full knowledge of their effects. Multi-armed bandit algorithms have proven successful in such settings.…

Machine Learning · Statistics 2026-05-07 Aidan Gleich, Eric Laber, Alexander Volfovsky

This paper studies adaptive targeting under network interference in a bandit setting, where treatments applied to one individual may affect others through spillover effects. We consider a linear model in a sparse regime, where each…

Machine Learning · Statistics 2026-05-28 Xiaomeng Wang, Hamsa Bastani, Osbert Bastani, Zhimei Ren

This paper considers the use of a simple posterior sampling algorithm to balance between exploration and exploitation when learning to optimize actions such as in multi-armed bandit problems. The algorithm, also known as Thompson Sampling,…

Machine Learning · Computer Science 2014-02-04 Daniel Russo, Benjamin Van Roy

We consider the problem of controlling an unknown linear quadratic Gaussian (LQG) system consisting of multiple subsystems connected over a network. Our goal is to minimize and quantify the regret (i.e. loss in performance) of our strategy…

Systems and Control · Electrical Engineering and Systems Science 2021-08-19 Sagar Sudhakara, Aditya Mahajan, Ashutosh Nayyar, Yi Ouyang

This paper studies the problem of optimally allocating treatments in the presence of spillover effects, using information from a (quasi-)experiment. I introduce a method that maximizes the sample analog of average social welfare when…

Econometrics · Economics 2024-04-09 Davide Viviano

Causal effect estimation in networked systems is central to data-driven decision making. In such settings, interventions on one unit can spill over to others, and in complex physical or social systems, the interaction pathways driving these…

Machine Learning · Statistics 2025-11-27 Sadegh Shirani, Mohsen Bayati

We consider settings where an allocation has to be chosen repeatedly, returns are unknown but can be learned, and decisions are subject to constraints. Our model covers two-sided and one-sided matching, even with complex constraints. We…

Econometrics · Economics 2020-11-05 Maximilian Kasy, Alexander Teytelboym

Network interference has attracted significant attention in the field of causal inference, encapsulating various sociological behaviors where the treatment assigned to one individual within a network may affect the outcomes of others, such…

Machine Learning · Computer Science 2025-02-11 Zhiheng Zhang, Zichen Wang

Restless bandit problems assume time-varying reward distributions of the arms, which adds flexibility to the model but makes the analysis more challenging. We study learning algorithms over the unknown reward distributions and prove a…

Machine Learning · Computer Science 2019-10-15 Young Hun Jung, Marc Abeille, Ambuj Tewari

Although there is now a large literature on policy evaluation and learning, much of the prior work assumes that the treatment assignment of one unit does not affect the outcome of another unit. Unfortunately, ignoring interference can lead…

Methodology · Statistics 2025-04-02 Yi Zhang, Kosuke Imai

Performance of adaptive control policies is assessed through the regret with respect to the optimal regulator, which reflects the increase in the operating cost due to uncertainty about the dynamics parameters. However, available results in…

Systems and Control · Computer Science 2020-03-24 Mohamad Kazem Shirani Faradonbeh, Ambuj Tewari, George Michailidis

We address online combinatorial optimization when the player has a prior over the adversary's sequence of losses. In this framework, Russo and Van Roy proposed an information-theoretic analysis of Thompson Sampling based on the information…

Machine Learning · Computer Science 2022-04-05 Sébastien Bubeck, Mark Sellke

In model-based solution approaches to the problem of learning in an unknown environment, exploring to learn the model parameters takes a toll on the regret. The optimal performance with respect to regret or PAC bounds is achievable, if the…

Machine Learning · Computer Science 2015-10-13 P. Prasanna, Sarath Chandar, Balaraman Ravindran

Practitioners conducting adaptive experiments often encounter two competing priorities: maximizing total welfare (or `reward') through effective treatment assignment and swiftly concluding experiments to implement population-wide…

Machine Learning · Computer Science 2024-07-31 Chao Qin, Daniel Russo

Online experimentation with interference is a common challenge in modern applications such as e-commerce and adaptive clinical trials in medicine. For example, in online marketplaces, the revenue of a good depends on discounts applied to…

Machine Learning · Computer Science 2024-05-30 Abhineet Agarwal, Anish Agarwal, Lorenzo Masoero, Justin Whitehouse

Thompson sampling has been shown to be an effective policy across a variety of online learning tasks. Many works have analyzed the finite time performance of Thompson sampling, and proved that it achieves a sub-linear regret under a broad…

Machine Learning · Computer Science 2020-11-10 Cem Kalkanli, Ayfer Ozgur

Online machine learning systems need to adapt to domain shifts. Meanwhile, acquiring label at every timestep is expensive. We propose a surprisingly simple algorithm that adaptively balances its regret and its number of label queries in…

Machine Learning · Computer Science 2021-03-01 Yining Chen, Haipeng Luo, Tengyu Ma, Chicheng Zhang

Randomized experiments are widely used to estimate the causal effects of a proposed treatment in many areas of science, from medicine and healthcare to the physical and biological sciences, from the social sciences to engineering, to public…

Methodology · Statistics 2022-11-30 Christina Lee Yu, Edoardo M Airoldi, Christian Borgs, Jennifer T Chayes

A common challenge for decision makers is selecting actions whose rewards are unknown and evolve over time based on prior policies. For instance, repeated use may reduce an action's effectiveness (habituation), while inactivity may restore…

Machine Learning · Computer Science 2025-11-06 Fengxu Li, Stephanie M. Carpenter, Matthew P. Buman, Yonatan Mintz

Randomized experiments are the gold standard for estimating treatment effects, yet network interference challenges the validity of traditional estimators by violating the stable unit treatment value assumption and introducing bias. While…

Methodology · Statistics 2024-09-02 Xin Lu, Hongzi Li, Hanzhong Liu