机器学习
Probabilistic Graphical Models (PGMs) encode conditional dependencies among random variables using a graph -nodes for variables, links for dependencies- and factorize the joint distribution into lower-dimensional components. This makes PGMs…
We frame novelty detection on path space as a hypothesis testing problem with signature-based test statistics. Using transportation-cost inequalities of Gasteratos and Jacquier (2023), we obtain tail bounds for false positive rates that…
We introduce iterative tilting, a gradient-free method for fine-tuning diffusion models toward reward-tilted distributions. The method decomposes a large reward tilt $\exp(\lambda r)$ into $N$ sequential smaller tilts, each admitting a…
We study estimation and statistical inference for reward models used in aligning large language models (LLMs). A key component of LLM alignment is reinforcement learning from human feedback (RLHF), where humans compare pairs of…
We prove an impossibility result for conditional Probably Approximately Correct (PAC)-efficient reasoning in large language models. While recent work has established marginal PAC efficiency guarantees for composite models that switch…
Sequential optimization of black-box functions from noisy evaluations has been widely studied, with Gaussian Process bandit algorithms such as GP-UCB guaranteeing no-regret in stationary settings. However, for time-varying objectives, it is…
Generative modeling is typically framed as learning mapping rules, but from an observer's perspective without access to these rules, the task becomes disentangling the geometric support from the probability distribution. We propose that…
Despite the growing availability of Electronic Health Record (EHR) data, researchers often face substantial barriers in effectively using these data for translational research due to their complexity, heterogeneity, and lack of standardized…
Operator models are regression algorithms between Banach spaces of functions. They have become an increasingly critical tool for spatiotemporal forecasting and physics emulation, especially in high-stakes scenarios where robust, calibrated…
Conformal prediction methods are statistical tools designed to quantify uncertainty and generate predictive sets with guaranteed coverage probabilities. This work introduces an innovative refinement to these methods for classification…
Contrastive learning is among the most popular and powerful approaches for self-supervised representation learning, where the goal is to map semantically similar samples close together while separating dissimilar ones in the latent space.…
To address the scalability limitations of Gaussian process (GP) regression, several approximation techniques have been proposed. One such method is based on tensor networks, which utilizes an exponential number of basis functions without…
Inverse problems arise across scientific and engineering domains, where the goal is to infer hidden parameters or physical fields from indirect and noisy observations. Classical approaches, such as variational regularization and Bayesian…
We study Sparse Multiple Kernel Learning (SMKL), which is the problem of selecting a sparse convex combination of prespecified kernels for support vector binary classification. Unlike prevailing l1 regularized approaches that approximate a…
Front-door adjustment gives a simple closed-form identification formula under the classical front-door criterion, but its applicability is often viewed as narrow. By contrast, the general ID algorithm can identify many more causal effects…
We study a missing-value imputation method, termed kNNSampler, that imputes a given unit's missing response by randomly sampling from the observed responses of the $k$ most similar units to the given unit in terms of the observed…
Selecting artificial intelligence (AI) models, such as large language models (LLMs), from multiple candidates requires accurate performance estimation. This is ideally achieved through empirical evaluations involving abundant real-world…
Change point detection (CPD) and anomaly detection (AD) are essential techniques in various fields to identify abrupt changes or abnormal data instances. However, existing methods are often constrained to univariate data, face scalability…
We introduce so-called functional input neural networks defined on a possibly infinite dimensional weighted space with values also in a possibly infinite dimensional output space. To this end, we use an additive family to map the input…
This chapter opens with a review of classic tools for regression, a subset of machine learning that seeks to find relationships between variables. With the advent of scientific machine learning this field has moved from a purely data-driven…