Machine Learning - Scifaro

Preconditioned One-Step Generative Modeling for Bayesian Inverse Problems in Function Spaces

We propose a machine-learning algorithm for Bayesian inverse problems in the function-space regime based on one-step generative transport. Building on the Mean Flows, we learn a fully conditional amortized sampler with a neural-operator…

Machine Learning · Statistics 2026-03-17 Zilan Cheng, Li-Lian Wang, Zhongjian Wang

Power-Law Spectrum of the Random Feature Model

Scaling laws for neural networks, in which the loss decays as a power-law in the number of parameters, data, and compute, depend fundamentally on the spectral structure of the data covariance, with power-law eigenvalue decay appearing…

Machine Learning · Statistics 2026-03-17 Elliot Paquette, Ke Liang Xiao, Yizhe Zhu

Convergence of Two Time-Scale Stochastic Approximation: A Martingale Approach

In this paper, we analyze the two time-scale stochastic approximation (TTSSA) algorithm introduced in Borkar (1997) using a martingale approach. This approach leads to simple sufficient conditions for the iterations to be bounded almost…

Machine Learning · Statistics 2026-03-17 Mathukumalli Vidyasagar

An Interpretable and Stable Framework for Sparse Principal Component Analysis

Sparse principal component analysis (SPCA) addresses the poor interpretability and variable redundancy often encountered by principal component analysis (PCA) in high-dimensional data. However, SPCA typically imposes uniform penalties on…

Machine Learning · Statistics 2026-03-17 Ying Hu, Hu Yang

When Should Humans Step In? Optimal Human Dispatching in AI-Assisted Decisions

AI systems increasingly assist human decision making by producing preliminary assessments of complex inputs. However, such AI-generated assessments can often be noisy or systematically biased, raising a central question: how should costly…

Machine Learning · Statistics 2026-03-17 Lezhi Tan, Naomi Sagan, Lihua Lei, Jose Blanchet

Robust Sequential Tracking via Bounded Information Geometry and Non-Parametric Field Actions

Standard sequential inference architectures are compromised by a normalizability crisis when confronted with extreme, structured outliers. By operating on unbounded parameter spaces, state-of-the-art estimators lack the intrinsic geometry…

Machine Learning · Statistics 2026-03-17 Carlos C. Rodriguez

Robust Automatic Differentiation of Square-Root Kalman Filters via Gramian Differentials

Square-root Kalman filters propagate state covariances in Cholesky-factor form for numerical stability, and are a natural target for gradient-based parameter learning in state-space models. Their core operation, triangularization of a…

Machine Learning · Statistics 2026-03-17 Adrien Corenflos

Holographic Invariant Storage: Design-Time Safety Contracts via Vector Symbolic Architectures

We introduce Holographic Invariant Storage (HIS), a protocol that assembles known properties of bipolar Vector Symbolic Architectures into a design-time safety contract for LLM context-drift mitigation. The contract provides three…

Machine Learning · Statistics 2026-03-17 Arsenios Scrivens

Standard Acquisition Is Sufficient for Asynchronous Bayesian Optimization

Asynchronous Bayesian optimization is widely used for gradient-free optimization in domains with independent parallel experiments and varying evaluation times. Existing methods posit that standard acquisitions lead to redundant and repeated…

Machine Learning · Statistics 2026-03-17 Ben Riegler, James Odgers, Vincent Fortuin

A Hybrid Tsallis-Polarization Impurity Measure for Decision Trees: Theoretical Foundations and Empirical Evaluation

We introduce the Integrated Tsallis Combination (ITC), a hybrid impurity measure for decision tree learning that combines normalized Tsallis entropy with an exponential polarization component. While many existing measures sacrifice…

Machine Learning · Statistics 2026-03-17 Edouard Lansiaux, Idriss Jairi, Hayfa Zgaya-Biau

Kernel Tests of Equivalence

We propose novel kernel-based tests for assessing the equivalence between distributions. Traditional goodness-of-fit testing is inappropriate for concluding the absence of distributional differences, because failure to reject the null…

Machine Learning · Statistics 2026-03-17 Xing Liu, Axel Gandy

Structural Causal Bottleneck Models

We introduce structural causal bottleneck models (SCBMs), a novel class of structural causal models. At the core of SCBMs lies the assumption that causal effects between high-dimensional variables only depend on low-dimensional summary…

Machine Learning · Statistics 2026-03-17 Simon Bing, Jonas Wahl, Jakob Runge

On the Statistical Optimality of Optimal Decision Trees

While globally optimal empirical risk minimization (ERM) decision trees have become computationally feasible and empirically successful, rigorous theoretical guarantees for their statistical performance remain limited. In this work, we…

Machine Learning · Statistics 2026-03-17 Zineng Xu, Subhro Ghosh, Yan Shuo Tan

Transfer Learning with Distance Covariance for Random Forest: Error Bounds and an EHR Application

We propose a method for transfer learning in nonparametric regression using a random forest (RF) with distance covariance-based feature weights, assuming the unknown source and target regression functions are sparsely different. Our method…

Machine Learning · Statistics 2026-03-17 Chenze Li, Subhadeep Paul

When Scores Learn Geometry: Rate Separations under the Manifold Hypothesis

Score-based methods, such as diffusion models and Bayesian inverse problems, are often interpreted as learning the data distribution in the low-noise limit ($\sigma \to 0$). In this work, we propose an alternative perspective: their success…

Machine Learning · Statistics 2026-03-17 Xiang Li, Zebang Shen, Ya-Ping Hsieh, Niao He

Disentangled Feature Importance

Feature importance (FI) measures are widely used to assess the contributions of predictors to an outcome, but they may target different notions of relevance. When predictors are correlated, traditional statistical FI methods are often…

Machine Learning · Statistics 2026-03-17 Jin-Hong Du, Kathryn Roeder, Larry Wasserman

Convergence and clustering analysis for Mean Shift with radially symmetric, positive definite kernels

The mean shift (MS) is a non-parametric, density-based, iterative algorithm with prominent usage in clustering and image segmentation. A rigorous proof for the convergence of its mode estimate sequence in full generality remains unknown. In…

Machine Learning · Statistics 2026-03-17 Susovan Pal

Stable Thompson Sampling: Valid Inference via Variance Inflation

We consider the problem of statistical inference when the data is collected via a Thompson Sampling-type algorithm. While Thompson Sampling (TS) is known to be both asymptotically optimal and empirically effective, its adaptive sampling…

Machine Learning · Statistics 2026-03-17 Budhaditya Halder, Shubhayan Pan, Koulik Khamaru

Amortized Bayesian Mixture Models

Finite mixtures are a broad class of models useful in scenarios where observed data is generated by multiple distinct processes but without explicit information about the responsible process for each data point. Estimating Bayesian mixture…

Machine Learning · Statistics 2026-03-17 Šimon Kucharský, Paul Christian Bürkner

A Note on Estimation Error Bound and Grouping Effect of Transfer Elastic Net

The Transfer Elastic Net is an estimation method for linear regression models that combines $\ell_1$ and $\ell_2$ norm penalties to facilitate knowledge transfer. In this study, we derive a non-asymptotic $\ell_2$ norm estimation error…

Machine Learning · Statistics 2026-03-17 Yui Tomo