机器学习
Positional encoding (PE) is a core architectural component of Transformers, yet its impact on the Transformer's generalization and robustness remains unclear. In this work, we provide the first generalization analysis for a single-layer…
This study clarifies the relationship between Riesz regression [Chernozhukov et al., 2021] and density ratio estimation (DRE) in causal inference problems, such as average treatment effect estimation. We first show that the Riesz…
Geopolitical and geoeconomic shocks reprice sovereign credit risk through different transmission channels. Using a daily panel of 42 advanced and emerging economies over 2018--2025, we show that geopolitical shocks raise sovereign CDS…
We introduce a novel framework for graph signal processing (GSP) that models signals as graph distribution-valued signals (GDSs), which are probability distributions in the Wasserstein space. This approach overcomes key limitations of…
Prediction-powered inference (PPI) is a recent framework for valid statistical inference with partially labeled data, combining model-based predictions on a large unlabeled set with bias correction from a smaller labeled subset. Building on…
Graphical models serve as effective tools for visualizing conditional dependencies between variables. However, as the number of variables grows, interpretation becomes increasingly difficult, and estimation uncertainty increases due to the…
Gaussian Graphical Models (GGMs) are widely used in high-dimensional data analysis to synthesize the interaction between variables. In many applications, such as genomics or image analysis, graphical models rely on sparsity and clustering…
The distribution of absorbed dose in radionuclide therapy with Lu$^{177}$ can be approximated by convolving an image of the time-integrated activity distribution with a dose voxel kernel representing different tissue types. This fast but…
Supervised machine learning describes the practice of fitting a parameterized model to labeled input-output data. Supervised machine learning methods have demonstrated promise in learning efficient surrogate models that can (partially)…
This paper develops a unified framework for measuring concentration in weighted systems embedded in networks of interactions. While traditional indices such as the Herfindahl-Hirschman Index capture dispersion in weights, they neglect the…
Time-dependent reliability analysis of nonlinear dynamical systems under stochastic excitations is a critical yet computationally demanding task. Conventional approaches, such as Monte Carlo simulation, necessitate repeated evaluations of…
We introduce Generalized Discrete Diffusion from Snapshots (GDDS), a unified framework for discrete diffusion modeling that supports arbitrary noising processes over large discrete state spaces. Our formulation encompasses all existing…
We propose a landmark-constrained algorithm, LA-VDM (Landmark Accelerated Vector Diffusion Maps), to accelerate the Vector Diffusion Maps (VDM) framework built upon the Graph Connection Laplacian (GCL), which captures pairwise connection…
Nonrigid registration is conventionally divided into point set registration, which aligns sparse geometries, and image registration, which aligns continuous intensity fields on regular grids. However, this dichotomy creates a critical…
This paper proposes a new formulation of functional Gaussian Process regression in manifolds, based on an Empirical Bayes approach, in the spatiotemporal random field context. We apply the machinery of tight Gaussian measures in separable…
Based on some recent work of the author on stochastic approximation in non-markovian environments, the situation when the driving random process is non-ergodic in addition to being non-markovian is considered. Using this, we propose an…
We study the problem of learning a low-degree spherical polynomial of degree $k_0 = \Theta(1) \ge 1$ defined on the unit sphere in $\RR^d$ by training an over-parameterized two-layer neural network with augmented feature in this paper. Our…
One of the most common machine learning setups is logistic regression. In many classification models, including neural networks, the final prediction is obtained by applying a logistic link function to a linear score. In binary logistic…
We highlight a striking difference in behavior between two widely used variants of coordinate ascent variational inference: the sequential and parallel algorithms. While such differences were known in the numerical analysis literature in…
Physical AI agents, such as robots and other embodied systems operating under tight and fluctuating resource constraints, remain far less capable than biological agents in open-ended real-world environments. This paper argues that Active…