Yanjun Han — Scifaro

The (Marginal) Value of a Search Ad: An Online Causal Framework for Repeated Second-price Auctions

Existing auto-bidding algorithms in digital advertising often treat the value of an ad opportunity as the revenue obtained when an ad is shown and/or clicked, and bid accordingly. This can lead to wasteful spending because the true value is…

Computer Science and Game Theory · Computer Science 2026-05-05 Yuxiao Wen, Zihao Hu, Yanjun Han, Yuan Yao, Zhengyuan Zhou

Joint Value Estimation and Bidding in Repeated First-Price Auctions

We study regret minimization in repeated first-price auctions (FPAs), where a bidder observes only the realized outcome after each auction -- win or loss. This setup reflects practical scenarios in online display advertising where the…

Machine Learning · Computer Science 2026-03-19 Yuxiao Wen, Yanjun Han, Zhengyuan Zhou

An Empirical Bayes Perspective on Heteroskedastic Mean Estimation

Towards understanding the fundamental limits of estimation from data of varied quality, we study the problem of estimating a mean parameter from heteroskedastic Gaussian observations where the variances are unknown and may vary arbitrarily…

Statistics Theory · Mathematics 2026-03-17 Yanjun Han, Abhishek Shetty, Jacob Shkrob

Interactive Learning of Single-Index Models via Stochastic Gradient Descent

Stochastic gradient descent (SGD) is a cornerstone algorithm for high-dimensional optimization, renowned for its empirical successes. Recent theoretical advances have provided a deep understanding of how SGD enables feature learning in…

Machine Learning · Statistics 2026-02-23 Nived Rajaraman, Yanjun Han

PETS: A Principled Framework Towards Optimal Trajectory Allocation for Efficient Test-Time Self-Consistency

Test-time scaling can improve model performance by aggregating stochastic reasoning trajectories. However, achieving sample-efficient test-time self-consistency under a limited budget remains an open challenge. We introduce PETS (Principled…

Machine Learning · Computer Science 2026-02-20 Zhangyi Liu, Huaizhi Qu, Xiaowei Yin, He Sun, Yanjun Han, Tianlong Chen, Zhun Deng

Universal priors: solving empirical Bayes via Bayesian inference and pretraining

We theoretically justify the recent empirical finding of [Teh et al., 2025] that a transformer pretrained on synthetically generated data achieves strong performance on empirical Bayes (EB) problems. We take an indirect approach to this…

Machine Learning · Statistics 2026-02-18 Nick Cannella, Anzo Teh, Yanjun Han, Yury Polyanskiy

Minimax optimal testing by classification

This paper considers an ML inspired approach to hypothesis testing known as classifier/classification-accuracy testing ($\mathsf{CAT}$). In $\mathsf{CAT}$, one first trains a classifier by feeding it labeled synthetic samples generated by…

Statistics Theory · Mathematics 2025-11-25 Patrik Róbert Gerber, Yanjun Han, Yury Polyanskiy

Optimal Arm Elimination Algorithms for Combinatorial Bandits

Combinatorial bandits extend the classical bandit framework to settings where the learner selects multiple arms in each round, motivated by applications such as online recommendation and assortment optimization. While extensions of upper…

Machine Learning · Computer Science 2025-10-29 Yuxiao Wen, Yanjun Han, Zhengyuan Zhou

Evolution of Information in Interactive Decision Making: A Case Study for Multi-Armed Bandits

We study the evolution of information in interactive decision making through the lens of a stochastic multi-armed bandit problem. Focusing on a fundamental example where a unique optimal arm outperforms the rest by a fixed margin, we…

Machine Learning · Statistics 2025-10-23 Yuzhou Gu, Yanjun Han, Jian Qian

Learning to Bid Optimally and Efficiently in Adversarial First-price Auctions

First-price auctions have very recently swept the online advertising industry, replacing second-price auctions as the predominant auction mechanism on many platforms. This shift has brought forth important challenges for a bidder: how…

Machine Learning · Computer Science 2025-09-26 Yanjun Han, Zhengyuan Zhou, Aaron Flores, Erik Ordentlich, Tsachy Weissman

Sharp mean-field analysis of permutation mixtures and permutation-invariant decisions

We develop sharp bounds on the statistical distance between high-dimensional permutation mixtures and their i.i.d. counterparts. Our approach establishes a new geometric link between the spectrum of a complex channel overlap matrix and the…

Statistics Theory · Mathematics 2025-09-17 Yiguo Liang, Yanjun Han

Approximate independence of permutation mixtures

We prove bounds on statistical distances between high-dimensional exchangeable mixture distributions (which we call permutation mixtures) and their i.i.d. counterparts. Our results are based on a novel method for controlling $\chi^2$…

Statistics Theory · Mathematics 2025-09-17 Yanjun Han, Jonathan Niles-Weed

Besting Good–Turing: Optimality of Non-Parametric Maximum Likelihood for Distribution Estimation

When faced with a small sample from a large universe of possible outcomes, scientists often turn to the venerable Good–Turing estimator. Despite its pedigree, however, this estimator comes with considerable drawbacks, such as the need to…

Statistics Theory · Mathematics 2025-09-10 Yanjun Han, Jonathan Niles-Weed, Yandi Shen, Yihong Wu

Reconfigurable nonlinear optical computing device for retina-inspired computing

Optical neural networks are at the forefront of computational innovation, utilizing photons as the primary carriers of information and employing optical components for computation. However, the fundamental nonlinear optical device in the…

Optics · Physics 2025-02-11 Xiayang Hua, Jiyuan Zheng, Peiyuan Zhao, Hualong Ren, Xiangwei Zeng, Zhibiao Hao, Changzheng Sun, Bing Xiong, Yanjun Han, Jian Wang, Hongtao Li, Lin Gan, Yi Luo, Lai Wang

D-band MUTC Photodiode Module for Ultra-Wideband 160 Gbps Photonics-Assisted Fiber-THz Integrated Communication System

Current wireless communication systems are increasingly constrained by insufficient bandwidth and limited power output, impeding the achievement of ultra-high-speed data transmission. The terahertz (THz) range offers greater bandwidth, but…

Optics · Physics 2024-12-13 Yuxin Tian, Yaxuan Li, Bing Xiong, Junwen Zhang, Changzheng Sun, Zhibiao Hao, Jian Wang, Lai Wang, Yanjun Han, Hongtao Li, Lin Gan, Nan Chi, Yi Luo

Assouad, Fano, and Le Cam with Interaction: A Unifying Lower Bound Framework and Characterization for Bandit Learnability

We develop a unifying framework for information-theoretic lower bounds in statistical estimation and interactive decision making. Classical lower bound techniques -- such as Fano's method, Le Cam's method, and Assouad's lemma -- are central…

Machine Learning · Computer Science 2024-12-10 Fan Chen, Dylan J. Foster, Yanjun Han, Jian Qian, Alexander Rakhlin, Yunbei Xu

Ultra-High-Efficiency Dual-Band Thin-Film Lithium Niobate Modulator Incorporating Low-k Underfill with 220 GHz Extrapolated Bandwidth for 390 Gbit/s PAM8 Transmission

High-performance electro-optic modulators play a critical role in modern telecommunication networks and intra-datacenter interconnects. Low driving voltage, large electro-optic bandwidth, compact device size, and multi-band operation…

Optics · Physics 2024-11-25 Hao Liu, Yutong He, Bing Xiong, Changzheng Sun, Zhibiao Hao, Lai Wang, Jian Wang, Yanjun Han, Hongtao Li, Lin Gan, Yi Luo

Stochastic contextual bandits with graph feedback: from independence number to MAS number

We consider contextual bandits with graph feedback, a class of interactive learning problems with richer structures than vanilla contextual bandits, where taking an action reveals the rewards for all neighboring actions in the feedback…

Machine Learning · Computer Science 2024-11-08 Yuxiao Wen, Yanjun Han, Zhengyuan Zhou

On the Statistical Complexity of Sample Amplification

The "sample amplification" problem formalizes the following question: Given $n$ i.i.d. samples drawn from an unknown distribution $P$, when is it possible to produce a larger set of $n+m$ samples which cannot be distinguished from $n+m$…

Statistics Theory · Mathematics 2024-09-19 Brian Axelrod, Shivam Garg, Yanjun Han, Vatsal Sharan, Gregory Valiant

Causal Inference with High-dimensional Discrete Covariates

When estimating causal effects from observational studies, researchers often need to adjust for many covariates to deconfound the non-causal relationship between exposure and outcome, among which many covariates are discrete. The behavior…

Statistics Theory · Mathematics 2024-05-07 Zhenghao Zeng, Sivaraman Balakrishnan, Yanjun Han, Edward H. Kennedy