Adam Block — Scifaro

From Curiosity to Caution: Mitigating Reward Hacking for Best-of-N with Pessimism

Inference-time compute scaling has emerged as a powerful paradigm for improving language model performance on a wide range of tasks, but the question of how best to use the additional compute remains open. A popular approach is BoN…

Machine Learning · Computer Science 2026-04-07 Zhuohao Yu, Zhiwei Steven Wu, Adam Block

Revisiting the (Sub)Optimality of Best-of-N for Inference-Time Alignment

Best-of-N (BoN) sampling is a widely used inference-time alignment method for language models, whereby N candidate responses are sampled from a reference model and the one with the highest predicted reward according to a learned reward…

Machine Learning · Computer Science 2026-03-09 Ved Sriraman, Adam Block

Partition Function Estimation under Bounded f-Divergence

We study the statistical complexity of estimating partition functions given sample access to a proposal distribution and an unnormalized density ratio for a target distribution. While partition function estimation is a classical problem,…

Machine Learning · Statistics 2026-03-02 Adam Block, Abhishek Shetty

MarkTune: Improving the Quality-Detectability Trade-off in Open-Weight LLM Watermarking

Watermarking aims to embed hidden signals in generated text that can be reliably detected when given access to a secret key. Open-weight language models pose acute challenges for such watermarking schemes because the inference-time…

Machine Learning · Computer Science 2025-12-04 Yizhou Zhao, Zhiwei Steven Wu, Adam Block

The Coverage Principle: How Pre-Training Enables Post-Training

Language models demonstrate remarkable abilities when pre-trained on large text corpora and fine-tuned for specific tasks, but how and why pre-training shapes the success of the final model remains poorly understood. Notably, although…

Machine Learning · Statistics 2025-10-23 Fan Chen, Audrey Huang, Noah Golowich, Sadhika Malladi, Adam Block, Jordan T. Ash, Akshay Krishnamurthy, Dylan J. Foster

Stellar tidal streams around nearby spiral galaxies with deep imaging from amateur telescopes

Tidal interactions between massive galaxies and their satellites are fundamental processes in a Universe with Λ-Cold Dark Matter cosmology, redistributing material into faint features that preserve records of past galactic interactions.…

Astrophysics of Galaxies · Physics 2025-09-17 David Martinez-Delgado, Michael Stein, Joanna D. Sakowska, M. Maurice Weigelt, Javier Roman, Giuseppe Donatiello, Santi Roca-Fabrega, Mischa Schirmer, Eva K. Grebel, Teymoor Saifollahi, Jeff Kanipe, M. Angeles Gomez-Flechoso, Mohammad Akhlaghi, Behnam Javanmardi, Gang Wu, Sepideh Eskandarlou, Dominik J. Bomans, Cristian Henkel, Adam Block, Mark Hanson, Johannes Schedler, Karel Teuwen, R. Jay GaBany, Alvaro Ibañez Perez, Ken Crawford, Wolfgang Promper, Manuel Jimenez, Silvia Farras-Aloy, Juan Miro-Carretero

A Theory of Learning with Autoregressive Chain of Thought

For a given base class of sequence-to-next-token generators, we consider learning prompt-to-answer mappings obtained by iterating a fixed, time-invariant generator for multiple steps, thus generating a chain-of-thought, and then taking the…

Machine Learning · Statistics 2025-08-12 Nirmit Joshi, Gal Vardi, Adam Block, Surbhi Goel, Zhiyuan Li, Theodor Misiakiewicz, Nathan Srebro

EMA Without the Lag: Bias-Corrected Iterate Averaging Schemes

Stochasticity in language model fine-tuning, often caused by the small batch sizes typically used in this regime, can destabilize training by introducing large oscillations in generation quality. A popular approach to mitigating this…

Machine Learning · Computer Science 2025-08-04 Adam Block, Cyril Zhang

Is Best-of-N the Best of Them? Coverage, Scaling, and Optimality in Inference-Time Alignment

Inference-time computation offers a powerful axis for scaling the performance of language models. However, naively increasing computation in techniques like Best-of-N sampling can lead to performance degradation due to reward hacking.…

Artificial Intelligence · Computer Science 2025-04-09 Audrey Huang, Adam Block, Qinghua Liu, Nan Jiang, Akshay Krishnamurthy, Dylan J. Foster

Computational-Statistical Tradeoffs at the Next-Token Prediction Barrier: Autoregressive and Imitation Learning under Misspecification

Next-token prediction with the logarithmic loss is a cornerstone of autoregressive sequence modeling, but, in practice, suffers from error amplification, where errors in the model compound and generation quality degrades as sequence length…

Machine Learning · Computer Science 2025-02-19 Dhruv Rohatgi, Adam Block, Audrey Huang, Akshay Krishnamurthy, Dylan J. Foster

Small Loss Bounds for Online Learning Separated Function Classes: A Gaussian Process Perspective

In order to develop practical and efficient algorithms while circumventing overly pessimistic computational lower bounds, recent work has been interested in developing oracle-efficient algorithms in a variety of learning settings. Two such…

Machine Learning · Computer Science 2025-02-17 Adam Block, Abhishek Shetty

Rate of convergence of the smoothed empirical Wasserstein distance

Consider an empirical measure $\mathbb{P}_n$ induced by $n$ iid samples from a $d$-dimensional $K$-subgaussian distribution $\mathbb{P}$ and let $\gamma = N(0,\sigma^2 I_d)$ be the isotropic Gaussian measure. We study the speed of…

Probability · Mathematics 2025-02-11 Adam Block, Zeyu Jia, Yury Polyanskiy, Alexander Rakhlin

GaussMark: A Practical Approach for Structural Watermarking of Language Models

Recent advances in Large Language Models (LLMs) have led to significant improvements in natural language processing tasks, but their ability to generate human-quality text raises significant ethical and operational concerns in settings…

Cryptography and Security · Computer Science 2025-01-27 Adam Block, Ayush Sekhari, Alexander Rakhlin

Self-Improvement in Language Models: The Sharpening Mechanism

Recent work in language modeling has raised the possibility of self-improvement, where a language model evaluates and refines its own generations to achieve higher performance without external feedback. It is impossible for this…

Artificial Intelligence · Computer Science 2024-12-05 Audrey Huang, Adam Block, Dylan J. Foster, Dhruv Rohatgi, Cyril Zhang, Max Simchowitz, Jordan T. Ash, Akshay Krishnamurthy

Is Behavior Cloning All You Need? Understanding Horizon in Imitation Learning

Imitation learning (IL) aims to mimic the behavior of an expert in a sequential decision making task by learning from demonstrations, and has been widely applied to robotics, autonomous driving, and autoregressive text generation. The…

Machine Learning · Computer Science 2024-12-03 Dylan J. Foster, Adam Block, Dipendra Misra

Oracle-Efficient Smoothed Online Learning for Piecewise Continuous Decision Making

Smoothed online learning has emerged as a popular framework to mitigate the substantial loss in statistical and computational complexity that arises when one moves from classical to adversarial learning. Unfortunately, for some spaces, it…

Machine Learning · Statistics 2024-03-20 Adam Block, Alexander Rakhlin, Max Simchowitz

Smoothed Online Learning for Prediction in Piecewise Affine Systems

The problem of piecewise affine (PWA) regression and planning is of foundational importance to the study of online learning, control, and robotics, where it provides a theoretically and empirically tractable setting to study systems…

Machine Learning · Statistics 2024-03-20 Adam Block, Max Simchowitz, Russ Tedrake

Efficient Model-Free Exploration in Low-Rank MDPs

A major challenge in reinforcement learning is to develop practical, sample-efficient algorithms for exploration in high-dimensional domains where generalization and function approximation are required. Low-Rank Markov Decision Processes --…

Machine Learning · Computer Science 2024-03-01 Zakaria Mhammedi, Adam Block, Dylan J. Foster, Alexander Rakhlin

The Sample Complexity of Approximate Rejection Sampling with Applications to Smoothed Online Learning

Suppose we are given access to $n$ independent samples from distribution $\mu$ and we wish to output one of them with the goal of making the output distributed as close as possible to a target distribution $\nu$. In this work we show that…

Machine Learning · Statistics 2024-02-27 Adam Block, Yury Polyanskiy

On the Performance of Empirical Risk Minimization with Smoothed Data

In order to circumvent statistical and computational hardness results in sequential decision-making, recent work has considered smoothed online learning, where the distribution of data at each time is assumed to have bounded likelihood…

Machine Learning · Statistics 2024-02-26 Adam Block, Alexander Rakhlin, Abhishek Shetty