Benjamin Heymann
We study the problem of learning to bid when the bidder's value is dynamic, i.e., when the current value depends on past outcomes. Specifically, we consider a bidder participating in repeated second-price auctions whose value depends on the…
LLM recommendation agents increasingly produce structured recommendation reports: sets of items accompanied by natural-language justifications. Yet existing evaluations often reduce this setting to reranking small shortlisted candidate sets…
Semivalue-based data valuation uses cooperative-game theory intuitions to assign each data point a value reflecting its contribution to a downstream task. Still, those values depend on the practitioner's choice of utility, raising the…
Data is a critical asset for training large language models (LLMs), alongside compute resources and skilled workers. While some training data is publicly available, substantial investment is required to generate proprietary datasets, such…
We model a procurement scenario in which two \textit{imperfect} bidders act simultaneously on behalf of a single buyer, a configuration common in display advertising and referred to as \textit{side-by-side bidding} but largely unexplored in…
In production systems, contextual bandit approaches often rely on direct reward models that take both action and context as input. However, these models can suffer from confounding, making it difficult to isolate the effect of the action…
Unobserved confounding arises when an unmeasured feature influences both the treatment and the outcome, leading to biased causal effect estimates. This issue undermines observational studies in fields like economics, medicine, ecology or…
We consider large-scale systems influenced by burnout variables - state variables that start active, shape dynamics, and irreversibly deactivate once certain conditions are met. Simulating what-if scenarios in such systems is…
We consider the problem of directly optimizing a non-linear function of an outcome, where this outcome itself is the sum of many small contributions. The non-linearity of the function means that the problem is not equivalent to the…
When learning to play an imperfect information game, it is often easier to first start with the basic mechanics of the game rules. For example, one can play several example rounds with private cards revealed to all players to better…
AI alignment, the challenge of ensuring AI systems act in accordance with human values, has emerged as a critical problem in the development of systems such as foundation models and recommender systems. Still, the current dominant approach,…
We consider the dataset valuation problem, that is, the problem of quantifying the incremental gain, to some relevant pre-defined utility of a machine learning task, of aggregating an individual dataset to others. The Shapley value is a…
We present a recommender system based on the Random Utility Model. Online shoppers are modeled as rational decision makers with limited information, and the recommendation task is formulated as the problem of optimally enriching the…
Online advertising banners are sold in real-time through auctions.Typically, the more banners a user is shown, the smaller the marginalvalue of the next banner for this user is. This fact can be detected bybasic ML models, that can be used…
Optimally solving a multi-armed bandit problem suffers the curse of dimensionality. Indeed, resorting to dynamic programming leads to an exponential growth of computing time, as the number of arms and the horizon increase. We introduce a…
The concept of d-separation holds a pivotal role in causality theory, serving as a fundamental tool for deriving conditional independence properties from causal graphs. Pearl defined the d-separation of two subsets conditionally on a third…
The effectiveness of advertising in e-commerce largely depends on the ability of merchants to bid on and win impressions for their targeted users. The bidding procedure is highly complex due to various factors such as market competition,…
We introduce a class of Bayesian bidding games for which we prove that the set of pure Nash equilibria is a (non-empty) sublattice and we give a sufficient condition for uniqueness that is often verified in the context of markets with…
Problem definition: Most of the display advertising inventory is sold through real-time auctions. The participants of these auctions are typically bidders (Google, Criteo, RTB House, Trade Desk for instance) who participate on behalf of…
We consider a repeated auction where the buyer's utility for an item depends on the time that elapsed since his last purchase. We present an algorithm to build the optimal bidding policy, and then, because optimal might be impractical, we…