Machine Learning
Diffusion models have achieved remarkable success across diverse domains, but they remain vulnerable to memorization -- reproducing training data rather than generating novel outputs. This not only limits their creative potential but also…
Predictions from statistical physics postulate that recovery of the communities in the Stochastic Block Model (SBM) is possible in polynomial time above, and only above, the Kesten-Stigum (KS) threshold. This conjecture has given rise to a rich…
This work advances distribution recovery from sparse data and ensemble classification through three main contributions. First, we introduce a closed-form estimator that reconstructs scaled beta distributions from limited statistics…
Structure-agnostic causal inference studies how well one can estimate a treatment effect given black-box machine learning estimates of nuisance functions (like the impact of confounders on treatment and outcomes). Here, we find that the…
Gaussian mixture models (GMMs) are ubiquitous in statistical learning, particularly for unsupervised problems. While full GMMs suffer from the overparameterization of their covariance matrices in high-dimensional spaces, spherical GMMs…
When applicants get rejected by an algorithmic decision system, recourse explanations provide actionable suggestions for how to change their input features to get a positive evaluation. A crucial yet overlooked phenomenon is that recourse…
Process reward models (PRMs) play a central role in guiding inference-time scaling algorithms for large language models (LLMs). However, we observe that even state-of-the-art PRMs can be poorly calibrated. Specifically, they tend to…
Prediction-Powered Inference (PPI) is a powerful framework for enhancing statistical estimates by combining limited gold-standard data with machine learning (ML) predictions. While prior work has demonstrated PPI's benefits for individual…
Our paper challenges claims from prior research that transformer-based models, when learning in context, implicitly implement standard learning algorithms. We present empirical evidence inconsistent with this view and provide a mathematical…
Sampling from generative models has become a crucial tool for applications like data synthesis and augmentation. Diffusion, Flow Matching and Continuous Normalising Flows have shown effectiveness across various modalities, and rely on…
A central part of geometric statistics is to compute the Fréchet mean. This is a well-known intrinsic mean on a Riemannian manifold that minimizes the sum of squared Riemannian distances from the mean point to all other data points. The…
Minimum-volume nonnegative matrix factorization (min-vol NMF) has been used successfully in many applications, such as hyperspectral imaging, chemical kinetics, spectroscopy, topic modeling, and audio source separation. However, its…
Conformal prediction has emerged as a powerful framework for constructing distribution-free prediction sets with guaranteed coverage under only the exchangeability assumption. However, this assumption is often violated in online…
We present simple, user-friendly bounds for the expected operator norm of a random kernel matrix under general conditions on the kernel function $k(\cdot,\cdot)$. Our approach uses decoupling results for U-statistics and the non-commutative…
We bring a control perspective to the problem of identifying paths of measures for sampling via dynamic measure transport (DMT). We highlight the fact that commonly used paths may be poor choices for DMT and connect existing methods for…
We present a bifidelity Karhunen-Loève expansion (KLE) surrogate model for field-valued quantities of interest (QoIs) under uncertain inputs. The approach combines the spectral efficiency of the KLE with polynomial chaos expansions (PCEs)…
Designing frictional interfaces to exhibit prescribed macroscopic behavior is a challenging inverse problem, made difficult by the non-uniqueness of solutions and the computational cost of contact simulations. Traditional approaches rely on…
Latent variables (LVs) play a crucial role in encoder-decoder models by enabling effective data compression, prediction, and generation. Although their theoretical properties, such as generalization, have been extensively studied in…
We study nonparametric regression by an over-parameterized two-layer neural network trained by gradient descent (GD). We show that, if the neural network is trained by GD with early stopping, then the trained network renders a…
Gaussian process (GP) regression provides a flexible, nonparametric framework for probabilistic modeling, yet remains computationally demanding in large-scale applications. For one-dimensional data, state space (SS) models achieve…