机器学习
We study theoretical limits of \emph{descending} phase retrieval algorithms. Utilizing \emph{Random duality theory} (RDT) we develop a generic program that allows statistical characterization of various algorithmic performance metrics.…
We propose a novel nonparametric regression method that models complex input-output relationships as the sum of convex and concave components. The method-Identifiable Convex-Concave Nonparametric Least Squares (ICCNLS)-decomposes the target…
This monograph studies the relations between two approaches using positive definite kernels: probabilistic methods using Gaussian processes, and non-probabilistic methods using reproducing kernel Hilbert spaces (RKHS). They are widely…
Multi-well potentials are ubiquitous in science, modeling phenomena such as phase transitions, dynamic instabilities, and multimodal behavior across physics, chemistry, and biology. In contrast to non-smooth minimum-of-mixture…
Bayesian optimization of function networks (BOFN) is a framework for optimizing expensive-to-evaluate objective functions structured as networks, where some nodes' outputs serve as inputs for others. Many real-world applications, such as…
Machine Unlearning (MU) aims at removing the influence of specific data points from a trained model, striving to achieve this at a fraction of the cost of full model retraining. In this paper, we analyze the efficiency of unlearning methods…
This paper explores how theory can guide and enhance practical algorithms, using Low-Rank Adaptation (LoRA, Hu et al. 2022) in large language models as a case study. We rigorously prove that, under gradient descent, LoRA adapters align with…
Sequential Monte Carlo (SMC) samplers are powerful tools for Bayesian inference but suffer from high computational costs due to their reliance on large particle ensembles for accurate estimates. We introduce persistent sampling (PS), an…
State-of-the-art neural network training methods depend on the gradient of the network function. Therefore, they cannot be applied to networks whose activation functions do not have useful derivatives, such as binary and discrete-time…
We propose a novel generative model within the Bayesian non-parametric learning (BNPL) framework to address some notable failure modes in generative adversarial networks (GANs) and variational autoencoders (VAEs)--these being overfitting in…
Kernel Stein discrepancies (KSDs) measure the quality of a distributional approximation and can be computed even when the target density has an intractable normalizing constant. Notable applications include the diagnosis of approximate MCMC…
Subgraph isomorphism, also known as subgraph matching, is typically regarded as an NP-complete problem. This complexity is further compounded in practical applications where edge weights are real-valued and may be affected by measurement…
Synthetic Data Generation has become essential for scalable, privacy-preserving statistical analysis. While standard approaches based on generative models, such as Normalizing Flows, have been widely used, they often suffer from slow…
Machine learning systems deployed in the real world must operate under dynamic and often unpredictable distribution shifts. This challenges the validity of statistical safety assurances on the system's risk established beforehand. Common…
Random feature approximation is arguably one of the most widely used techniques for kernel methods in large-scale learning algorithms. In this work, we analyze the generalization properties of random feature methods, extending previous…
We study the problem of conformal prediction (CP) under geometric data shifts, where data samples are susceptible to transformations such as rotations or flips. While CP endows prediction models with post-hoc uncertainty quantification and…
Score-based methods have recently seen increasing popularity in modeling and generation. Methods have been constructed to perform hypothesis testing and change-point detection with score functions, but these methods are in general not as…
Operator learning offers a powerful paradigm for solving parametric partial differential equations (PDEs), but scaling probabilistic neural operators such as the recently proposed Gaussian Processes Operators (GPOs) to high-dimensional,…
We present an algorithm to sample stochastic differential equations conditioned on rather general constraints, including integral constraints, endpoint constraints, and stochastic integral constraints. The algorithm is a pathspace…
The Median of Means (MoM) is a mean estimator that has gained popularity in the context of heavy-tailed data. In this work, we analyze its performance in the task of simultaneously estimating the mean of each function in a class…