机器学习
We study the problem of learning a low-degree spherical polynomial of degree $\ell_0 = \Theta(1) \ge 1$ defined on the unit sphere in $\RR^d$ by training an over-parameterized two-layer neural network (NN) with channel attention in this…
We propose a flexible deep neural network (DNN) framework for modeling survival data within a partially linear regression structure. The approach preserves interpretability through a parametric linear component for covariates of primary…
Diffusion and flow matching approaches to generative modeling have shown promise in domains where the state space is continuous, such as image generation or protein folding & design, and discrete, exemplified by diffusion large language…
Longitudinal voice biomarkers provide a non-invasive source of information for monitoring Parkinson's disease progression, but their statistical analysis is difficult because repeated measurements from the same subject are correlated,…
In recent years, the neural tangent kernel (NTK) and neural network Gaussian process kernel (NNGP) have given theoreticians tractable limiting cases of fully connected neural networks. However, the property of these kernels are poorly…
Differentially private (DP) linear regression has received significant attention in the recent theoretical literature, with several approaches proposed to improve error rates. Our work considers the popular high-dimensional regime with…
We consider a class of statistical inverse problems involving the estimation of a regression operator from a Polish space to a separable Hilbert space, where the target lies in a vector-valued reproducing kernel Hilbert space induced by an…
Anomaly localization in images -- identifying regions that deviate from normal patterns -- is vital in applications such as medical diagnosis and industrial inspection. A recent trend is the use of image generation models in anomaly…
Bayesian optimisation is a popular method for efficient optimisation of expensive black-box functions. Traditionally, BO assumes that the search space is known. However, in many problems, this assumption does not hold. To this end, we…
Scaling Bayesian optimisation (BO) to high-dimensional search spaces is a active and open research problems particularly when no assumptions are made on function structure. The main reason is that at each iteration, BO requires to find…
Predicting customers' long-term revenue from sparse and irregular transaction data is central to marketing resource allocation in non-contractual settings, yet existing approaches face a trade-off. Traditional probabilistic customer base…
The Gaussian mixture model is widely used in unsupervised learning, owing to its simplicity and interpretability. However, a fundamental limitation of the classical Gaussian mixture model is that it forces each observation to belong to…
As the demand to integrate Artificial Intelligence into high-stakes environments continues to grow, explaining the reasoning behind neural-network predictions has shifted from a theoretical curiosity to a strict operational requirement. Our…
We introduce two federated learning frameworks for the classical SPDnet model operating on symmetric positive definite (SPD) matrices with Stiefel-constrained parameters. Unlike standard Euclidean averaging, which violates orthogonality,…
The Super Learner (SL) is a widely used ensemble method that combines predictions from a library of learners based on their predictive performance. Interval predictions are of considerable practical interest because they allow uncertainty…
One of the major limits of kernel ridge regression (KRR) is that storing and manipulating the kernel matrix K_n for n samples requires O(n^2) space, which rapidly becomes unfeasible for large n. Nystrom approximations reduce the space…
Rejection sampling is a technique for sampling from difficult distributions. However, its use is limited due to a high rejection rate. Common adaptive rejection sampling methods either work only for very specific distributions or without…
Recovering latent structure from count data has received considerable attention in network inference, particularly when one seeks both cross-group interactions and within-group similarity patterns in bipartite networks, which is widely used…
This paper investigates the off-policy evaluation (OPE) problem from a distributional perspective. Rather than focusing solely on the expectation of the total return, as in most existing OPE methods, we aim to estimate the entire return…
Bayesian filtering and smoothing for high-dimensional nonlinear dynamical systems are fundamental yet challenging problems in many areas of science and engineering. In this work, we propose FLUID, a flow-based unified amortized inference…