机器学习
Stochastic gradient descent (SGD), one of the most fundamental optimization algorithms in machine learning (ML), can be recast through a continuous-time approximation as a Fokker-Planck equation for Langevin dynamics, a viewpoint that has…
Data fusion and transfer learning are rapidly growing fields that enhance model performance for a target population by leveraging other related data sources or tasks. The challenges lie in the various potential heterogeneities between the…
Improved computational power has enabled different disciplines to predict causal relationships among modeled variables using Bayesian network inference. While many alternative algorithms have been proposed to improve the efficiency and…
Determining the optimal data mixture for large language model training remains a challenging problem with an outsized impact on performance. In practice, language model developers continue to rely on heuristic exploration since no…
We present the Context-Content Uncertainty Principle (CCUP), a unified framework that models cognition as the directed flow of information between high-entropy context and low-entropy content. Inference emerges as a cycle of bidirectional…
We derived the closed-form asymptotic optimism of linear regression models under random designs, and generalizes it to kernel ridge regression. Using scaled asymptotic optimism as a generic predictive model complexity measure, we studied…
We study the linear bandit problem that accounts for partially observable features. Without proper handling, unobserved features can lead to linear regret in the decision horizon $T$, as their influence on rewards is unknown. To tackle this…
Mean-field Langevin dynamics (MFLD) is an optimization method derived by taking the mean-field limit of noisy gradient descent for two-layer neural networks in the mean-field regime. Recently, the propagation of chaos (PoC) for MFLD has…
Neural Jump ODEs model the conditional expectation between observations by neural ODEs and jump at arrival of new observations. They have demonstrated effectiveness for fully data-driven online forecasting in settings with irregular and…
Flow matching has recently emerged as a powerful paradigm for generative modeling and has been extended to probabilistic time series forecasting in latent spaces. However, the impact of the specific choice of probability path model on…
In the Naive Bayes classification model the class conditional densities are estimated as the products of their marginal densities along the cardinal basis directions. We study the problem of obtaining an alternative basis for this…
Analysis of high-dimensional data has led to increased interest in both single index models (SIMs) and the best-subset selection. SIMs provide an interpretable and flexible modeling framework for high-dimensional data, while the best-subset…
The paper proposes a systematic framework for building data-driven stochastic differential equation (SDE) models from sparse, noisy observations. Unlike traditional parametric approaches, which assume a known functional form for the drift,…
Learning and generating various types of data based on conditional diffusion models has been a research hotspot in recent years. Although conditional diffusion models have made considerable progress in improving acceleration algorithms and…
Solving partial differential equations (PDEs) within the framework of probabilistic numerics offers a principled approach to quantifying epistemic uncertainty arising from discretization. By leveraging Gaussian process regression and…
This paper deals with the identification of linear stochastic dynamical systems, where the unknowns include system coefficients and noise variances. Conventional approaches that rely on the maximum likelihood estimation (MLE) require…
Prediction-powered inference (PPI) enables valid statistical inference by combining experimental data with machine learning predictions. When a sufficient number of high-quality predictions is available, PPI results in more accurate…
Recent advances in statistical learning theory have revealed profound connections between mutual information (MI) bounds, PAC-Bayesian theory, and Bayesian nonparametrics. This work introduces a novel mutual information bound for…
Adequately generating and evaluating prediction models based on supervised machine learning (ML) is often challenging, especially for less experienced users in applied research areas. Special attention is required in settings where the…
Simulation-based inference methods that feature correct conditional coverage of confidence sets based on observations that have been compressed to a scalar test statistic require accurate modeling of either the p-value function or the…