Machine Learning
It has become increasingly common for data to be collected adaptively, for example using contextual bandits. Historical data of this type can be used to evaluate other treatment assignment policies to guide future innovation or experiments.…
We study the in-context learning (ICL) capabilities of pretrained Transformers in the setting of nonlinear regression. Specifically, we focus on a random Transformer with a nonlinear MLP head where the first layer is randomly initialized…
This paper proposes the Next-Depth Lookahead Tree (NDLT), a single-tree model designed to improve performance by assessing split quality both at the node being optimized and at the next depth level.
Empirical Risk Minimization (ERM) is a foundational framework for supervised learning but primarily optimizes average-case performance, often neglecting fairness and robustness considerations. Tilted Empirical Risk Minimization (TERM)…
We investigate the impact of high-order moments on the learning dynamics of an online Independent Component Analysis (ICA) algorithm under a high-dimensional data model composed of a weighted sum of two non-Gaussian random variables. This…
We present the first gap-dependent analysis of regret and communication cost for on-policy federated $Q$-Learning in tabular episodic finite-horizon Markov decision processes (MDPs). Existing FRL methods focus on worst-case scenarios,…
We consider ordinary least squares estimation and variations on least squares estimation such as penalized (regularized) least squares and spectral shrinkage estimates for problems with p > n and associated problems with prediction of new…
In this paper, we consider the problem of Gaussian approximation for the online linear regression task. We derive the corresponding rates for the setting of a constant learning rate and study the explicit dependence of the convergence rate…
Estimating heterogeneous treatment effects is critical in domains such as personalized medicine, resource allocation, and policy evaluation. A central challenge lies in identifying subpopulations that respond differently to interventions,…
Modeling the dynamics of probability distributions from time-dependent data samples is a fundamental problem in many fields, including digital health. The goal is to analyze how the distribution of a biomarker, such as glucose, changes over…
We introduce Soft Kernel Interpolation (SoftKI), a method that combines aspects of Structured Kernel Interpolation (SKI) and variational inducing point methods, to achieve scalable Gaussian Process (GP) regression on high-dimensional…
The sparse-group lasso performs both variable and group selection, simultaneously using the strengths of the lasso and group lasso. It has found widespread use in genetics, a field that regularly involves the analysis of high-dimensional…
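For orientation, the sparse-group lasso penalty is commonly written as a convex combination of the lasso and group lasso penalties. The display below is a sketch of that standard textbook form, not necessarily the exact formulation used in the paper; the symbols $n$, $\lambda$, $\alpha$, $p_g$, and $\beta^{(g)}$ are notation assumed here for illustration.

$$
\hat{\beta} = \arg\min_{\beta} \; \frac{1}{2n} \lVert y - X\beta \rVert_2^2 + \lambda \alpha \lVert \beta \rVert_1 + \lambda (1 - \alpha) \sum_{g=1}^{G} \sqrt{p_g} \, \lVert \beta^{(g)} \rVert_2, \qquad \alpha \in [0, 1],
$$

where $\beta^{(g)}$ collects the coefficients of group $g$ and $p_g$ is the size of that group. Setting $\alpha = 1$ recovers the lasso and $\alpha = 0$ recovers the group lasso.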
We present a direct inverse modeling method named SURGIN, a SURrogate-guided Generative INversion framework tailored for subsurface multiphase flow data assimilation. Unlike existing inversion methods that require adaptation for each new…
The Wasserstein barycenter extends the Euclidean mean to the space of probability measures by minimizing the weighted sum of squared 2-Wasserstein distances. We develop a free-support algorithm for computing Wasserstein barycenters that…
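The definition in the first sentence of this abstract can be written out explicitly. This is a sketch using notation assumed here for illustration: input measures $\mu_1, \dots, \mu_K$, and weights $\lambda_k \ge 0$ with $\sum_k \lambda_k = 1$.

$$
\bar{\mu} \in \arg\min_{\nu} \; \sum_{k=1}^{K} \lambda_k \, W_2^2(\nu, \mu_k),
$$

where the minimum is taken over probability measures with finite second moment. The Euclidean analogue is the weighted mean $\bar{x} = \arg\min_{x} \sum_{k=1}^{K} \lambda_k \lVert x - x_k \rVert_2^2 = \sum_{k=1}^{K} \lambda_k x_k$, which is the sense in which the barycenter extends the mean.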
This paper studies high-dimensional additive regression under the transfer learning framework, where one observes samples from a target population together with auxiliary samples from different but potentially related regression models. We…
Despite significant research on the optimization aspects of federated learning, generalization error, especially in heterogeneous federated learning, remains insufficiently investigated,…
Recent literature has explored various ways to improve soft sensors by utilizing learning algorithms with transferability. A performance gain is generally attained when knowledge is transferred among strongly related soft sensor learning…
Model selection in non-linear models often prioritizes performance metrics over statistical tests, limiting the ability to account for sampling variability. We propose the use of a statistical test to assess the equality of variances in…
Multivariate longitudinal data of mixed type are increasingly collected in many scientific domains. However, algorithms to cluster this kind of data remain scarce, owing to the challenge of simultaneously modeling the within- and between-time…
We propose the Entropic-regularized Robust Optimal Transport (E-ROBOT) framework, a novel method that combines the robustness of ROBOT with the computational and statistical benefits of entropic regularization. We show that, rooted in the…