Related papers: Nonparametric estimation of a factorizable density…
Multivariate distributions often carry latent structures that are difficult to identify and estimate, and which better reflect the data generating mechanism than extrinsic structures exhibited simply by the raw data. In this paper, we…
This paper investigates the score-based diffusion models for density estimation when the target density admits a factorizable low-dimensional nonparametric structure. To be specific, we show that when the log density admits a $d^*$-way…
Statistical inference in high dimensional settings has recently attracted enormous attention within the literature. However, most published work focuses on the parametric linear regression problem. This paper considers an important…
Implicit sampling is a weighted sampling method that is used in data assimilation, where one sequentially updates estimates of the state of a stochastic model based on a stream of noisy or incomplete data. Here we describe how to use…
Neural network-based methods for (un)conditional density estimation have recently gained substantial attention, as various neural density estimators have outperformed classical approaches in real-data experiments. Despite these empirical…
Probabilistic regression models the entire predictive distribution of a response variable, offering richer insights than classical point estimates and directly allowing for uncertainty quantification. While diffusion-based generative models…
This paper studies the problems of identifiability and estimation in high-dimensional nonparametric latent structure models. We introduce an identifiability theorem that generalizes existing conditions, establishing a unified framework…
Financial scenario simulation is essential for risk management and portfolio optimization, yet it remains challenging especially in high-dimensional and small data settings common in finance. We propose a diffusion factor model that…
Statistical inference from high-dimensional data with low-dimensional structures has recently attracted lots of attention. In machine learning, deep generative modeling approaches implicitly estimate distributions of complex objects by…
Statistical learning in high-dimensional spaces is challenging without a strong underlying data structure. Recent advances with foundational models suggest that text and image data contain such hidden structures, which help mitigate the…
A deep generative model yields an implicit estimator for the unknown distribution or density function of the observation. This paper investigates some statistical properties of the implicit density estimator pursued by VAE-type methods from…
We investigate statistical properties of a likelihood approach to nonparametric estimation of a singular distribution using deep generative models. More specifically, a deep generative model is used to model high-dimensional data that are…
We investigate the approximation efficiency of score functions by deep neural networks in diffusion-based generative modeling. While existing approximation theories utilize the smoothness of score functions, they suffer from the curse of…
We study parameter estimation in Nonlinear Factor Analysis (NFA) where the generative model is parameterized by a deep neural network. Recent work has focused on learning such models using inference (or recognition) networks; we identify a…
This paper introduces a Factor Augmented Sparse Throughput (FAST) model that utilizes both latent factors and sparse idiosyncratic components for nonparametric regression. The FAST model bridges factor models on one end and sparse…
In this study we develop dimension-reduction techniques to accelerate diffusion model inference in the context of synthetic data generation. The idea is to integrate compressed sensing into diffusion models (hence, CSDM): First, compress…
It is now practically the norm for data to be very high dimensional in areas such as genetics, machine vision, image analysis and many others. When analyzing such data, parametric models are often too inflexible while nonparametric…
Diffusion models achieve state-of-the-art performance in various generation tasks. However, their theoretical foundations fall far behind. This paper studies score approximation, estimation, and distribution recovery of diffusion models,…
Geological parameterization entails the representation of a geomodel using a small set of latent variables and a mapping from these variables to grid-block properties such as porosity and permeability. Parameterization is useful for data…
Effective non-parametric density estimation is a key challenge in high-dimensional multivariate data analysis. In this paper,we propose a novel approach that builds upon tensor factorization tools. Any multivariate density can be…