Machine Learning - Scifaro

Distributional Shrinkage I: Universal Denoiser Beyond Tweedie's Formula

We study the problem of denoising when only the noise level is known, not the noise distribution. Independent noise $Z$ corrupts a signal $X$, yielding the observation $Y = X + \sigma Z$ with known $\sigma \in (0,1)$. We propose…

Machine Learning · Statistics 2026-03-30 Tengyuan Liang

Bayesian Optimization on Networks

This paper studies optimization on networks modeled as metric graphs. Motivated by applications where the objective function is expensive to evaluate or only available as a black box, we develop Bayesian optimization algorithms that…

Machine Learning · Statistics 2026-03-30 Wenwen Li, Daniel Sanz-Alonso, Ruiyi Yang

Is Supervised Learning Really That Different from Unsupervised?

We demonstrate how supervised learning can be decomposed into a two-stage procedure, where (1) all model parameters are selected in an unsupervised manner, and (2) the outputs y are added to the model, without changing the parameter values.…

Machine Learning · Statistics 2026-03-30 Oskar Allerbo, Thomas B. Schön

The Rules-and-Facts Model for Simultaneous Generalization and Memorization in Neural Networks

A key capability of modern neural networks is their capacity to simultaneously learn underlying rules and memorize specific facts or exceptions. Yet, theoretical understanding of this dual capability remains limited. We introduce the…

Machine Learning · Statistics 2026-03-27 Gabriele Farné, Fabrizio Boncoraglio, Lenka Zdeborová

Adaptive Subspace Modeling With Functional Tucker Decomposition

Tensors provide a structured representation for multidimensional data, yet discretization can obscure important information when such data originates from continuous processes. We address this limitation by introducing a functional Tucker…

Machine Learning · Statistics 2026-03-27 Noah Steidle, Joppe De Jonghe, Mariya Ishteva

Residual-as-Teacher: Mitigating Bias Propagation in Student–Teacher Estimation

We study statistical estimation in a student–teacher setting, where predictions from a pre-trained teacher are used to guide a student model. A standard approach is to train the student to directly match the teacher's outputs, which we…

Machine Learning · Statistics 2026-03-27 Kakei Yamamoto, Martin J. Wainwright

A Distribution-to-Distribution Neural Probabilistic Forecasting Framework for Dynamical Systems

Probabilistic forecasting provides a principled framework for uncertainty quantification in dynamical systems by representing predictions as probability distributions rather than deterministic trajectories. However, existing forecasting…

Machine Learning · Statistics 2026-03-27 Tianlin Yang, Hailiang Du, Louis Aslett

Practical Efficient Global Optimization is No-regret

Efficient global optimization (EGO) is one of the most widely used noise-free Bayesian optimization algorithms. It comprises the Gaussian process (GP) surrogate model and expected improvement (EI) acquisition function. In practice, when EGO…

Machine Learning · Statistics 2026-03-27 Jingyi Wang, Haowei Wang, Nai-Yuan Chiang, Juliane Mueller, Tucker Hartland, Cosmin G. Petra

Fair regression under localized demographic parity constraints

Demographic parity (DP) is a widely used group fairness criterion requiring predictive distributions to be invariant across sensitive groups. While natural in classification, full distributional DP is often overly restrictive in regression…

Machine Learning · Statistics 2026-03-27 Arthur Charpentier, Christophe Denis, Romuald Elie, Mohamed Hebiri, François HU

Improving Infinitely Deep Bayesian Neural Networks with Nesterov's Accelerated Gradient Method

As a representative continuous-depth neural network approach, stochastic differential equation (SDE)-based Bayesian neural networks (BNNs) have attracted considerable attention due to their solid theoretical foundations and strong potential…

Machine Learning · Statistics 2026-03-27 Chenxu Yu, Wenqi Fang

Mixture-of-Experts under Finite-Rate Gating: Communication–Generalization Trade-offs

Mixture-of-Experts (MoE) architectures decompose prediction tasks into specialized expert sub-networks selected by a gating mechanism. This letter adopts a communication-theoretic view of MoE gating, modeling the gate as a stochastic…

Machine Learning · Statistics 2026-03-27 Ali Khalesi, Mohammad Reza Deylam Salehi

Robust Bayesian Inference via Variational Approximations of Generalized Rho-Posteriors

We introduce the $\widetilde{\rho}$-posterior, a modified version of the $\rho$-posterior, obtained by replacing the supremum over competitor parameters with a softmax aggregation. This modification allows a PAC-Bayesian analysis of the…

Machine Learning · Statistics 2026-03-27 EL Mahdi Khribch, Pierre Alquier

The Information Dynamics of Generative Diffusion

Generative diffusion models have emerged as a powerful class of models in machine learning, yet a unified theoretical understanding of their operation is still developing. This paper provides an integrated perspective on generative…

Machine Learning · Statistics 2026-03-27 Dejan Stancevic, Luca Ambrogioni

Efficient Best-of-Both-Worlds Algorithms for Contextual Combinatorial Semi-Bandits

We introduce the first best-of-both-worlds algorithm for contextual combinatorial semi-bandits that simultaneously guarantees $\widetilde{\mathcal{O}}(\sqrt{T})$ regret in the adversarial regime and $\widetilde{\mathcal{O}}(\ln T)$ regret…

Machine Learning · Statistics 2026-03-27 Mengmeng Li, Philipp J. Schneider, Jelisaveta Aleksić, Daniel Kuhn

When Models Don't Collapse: On the Consistency of Iterative MLE

The widespread use of generative models has created a feedback loop, in which each generation of models is trained on data partially produced by its predecessors. This process has raised concerns about model collapse: A critical degradation…

Machine Learning · Statistics 2026-03-27 Daniel Barzilai, Ohad Shamir

Kernel Density Machines

We introduce kernel density machines (KDM), an agnostic kernel-based framework for learning the Radon-Nikodym derivative (density) between probability measures under minimal assumptions. KDM applies to general measurable spaces and avoids…

Machine Learning · Statistics 2026-03-27 Andrea Della Vecchia, Damir Filipovic, Paul Schneider

Exact Risk Curves of signSGD in High-Dimensions: Quantifying Preconditioning and Noise-Compression Effects

In recent years, signSGD has garnered interest both as a practical optimizer and as a simple model for understanding adaptive optimizers like Adam. Though there is a general consensus that signSGD acts to precondition optimization and…

Machine Learning · Statistics 2026-03-27 Ke Liang Xiao, Noah Marshall, Atish Agarwala, Elliot Paquette

Trust Region Constrained Bayesian Optimization with Penalized Constraint Handling

Constrained optimization in high-dimensional black-box settings is difficult due to expensive evaluations, the lack of gradient information, and complex feasibility regions. In this work, we propose a Bayesian optimization method that…

Machine Learning · Statistics 2026-03-26 Raju Chowdhury, Tanmay Sen, Prajamitra Bhuyan, Biswabrata Pradhan

Continuous-Time Learning of Probability Distributions: A Case Study in a Digital Trial of Young Children with Type 1 Diabetes

Understanding how biomarker distributions evolve over time is a central challenge in digital health and chronic disease monitoring. In diabetes, changes in the distribution of glucose measurements can reveal patterns of disease progression…

Machine Learning · Statistics 2026-03-26 Antonio Álvarez-López, Marcos Matabuena

Federated fairness-aware classification under differential privacy

Privacy and algorithmic fairness have become two central issues in modern machine learning. Although each has separately emerged as a rapidly growing research area, their joint effect remains comparatively under-explored. In this paper, we…

Machine Learning · Statistics 2026-03-26 Gengyu Xue, Yi Yu