机器学习
We study reinforcement learning (RL) with transition look-ahead, where the agent may observe which states would be visited upon playing any sequence of $\ell$ actions before deciding its course of action. While such predictive information…
Kernel Stein discrepancies (KSDs) have emerged as a powerful tool for quantifying goodness-of-fit over the last decade, featuring numerous successful applications. To the best of our knowledge, all existing KSD estimators with known rate…
Conformal prediction is a model-free machine learning method for constructing prediction regions at a guaranteed coverage probability level. However, a data scientist often faces three challenges in practice: (i) the determination of a…
Surrogate modeling for systems with high-dimensional quantities of interest remains challenging, particularly when training data are costly to acquire. This work develops multifidelity methods for multiple-input multiple-output linear…
To reach human level intelligence, learning algorithms need to incorporate causal reasoning. But identifying causality, and particularly counterfactual reasoning, remains elusive. In this paper, we make progress on counterfactual inference…
Using the bit string generation problem as a case study, we theoretically compare two standard methods for adapting large language models to new tasks. The first, referred to as supervised fine-tuning, involves training a new next token…
Creating large-scale datasets for training high-performance generative models is often prohibitively expensive, especially when associated attributes or annotations must be provided. As a result, merging existing datasets has become a…
We consider a method for conformalizing a stacked ensemble of predictive models, showing that the potentially simple form of the meta-learner at the top of the stack enables a procedure with manageable computational cost that achieves…
We introduce a novel particle-based algorithm for end-to-end training of latent diffusion models. We reformulate the training task as minimizing a free energy functional and obtain a gradient flow that does so. By approximating the latter…
Modern AI algorithms require labeled data. In real world, majority of data are unlabeled. Labeling the data are costly. this is particularly true for some areas requiring special skills, such as reading radiology images by physicians. To…
Precision matrix estimation is essential in various fields; yet it is challenging when samples for the target study are limited. Transfer learning can enhance estimation accuracy by leveraging data from related source studies. We propose…
Vanilla variational inference finds an optimal approximation to the Bayesian posterior distribution, but even the exact Bayesian posterior is often not meaningful under model misspecification. We propose predictive variational inference…
This paper studies a class of multivariate Kantorovich-kernel neural network operators, including the deep Kantorovich-type neural network operators studied by Sharma and Singh. We prove density results, establish quantitative convergence…
Accurate uncertainty quantification is crucial for making reliable decisions in various supervised learning scenarios, particularly when dealing with complex, multimodal data such as images and text. Current approaches often face notable…
The complex Gaussian distribution has been widely used as a fundamental spectral and noise model in signal processing and communication. However, its Gaussian structure often limits its ability to represent the diverse amplitude…
We study rank selection for low-rank tensor regression under random covariates design. Under a Gaussian random-design model and some mild conditions, we derive population expressions for the expected training-testing discrepancy (optimism)…
We provide explicit, finite-sample guarantees for learning causal representations from data with a sublinear number of environments. Causal representation learning seeks to provide a rigourous foundation for the general representation…
We propose SAHMM-VAE, a source-wise adaptive Hidden Markov prior variational autoencoder for unsupervised blind source separation. Instead of treating the latent prior as a single generic regularizer, the proposed framework assigns each…
Supervised graph prediction addresses regression problems where the outputs are structured graphs. Although several approaches exist for graph-valued prediction, principled uncertainty quantification remains limited. We propose a conformal…
Given a probability distribution $\mu$ in $\mathbb{R}^d$ represented by data, we study in this paper the generative modeling of the corresponding conditional probability distributions on the level-sets of a collective variable…