Machine Learning - Scifaro

On Stability and Decomposition of Sample Quantiles under Heavy-Tailed Distributions

We study sample quantiles of distributions indexed by estimated parameters, with a focus on Value-at-Risk related to linear projections of financial returns whose underlying probability law is heavy-tailed. In this setting, the projection…

Machine Learning · Statistics 2026-05-25 Choudur Lakshminarayan

Feature Learning in Linear-Width Two-Layer Networks: Two vs. One Step of Gradient Descent

We study feature learning in two-layer neural networks within the linear-width regime, where the number of hidden neurons, sample size, and input dimension scale proportionally. While recent work has analyzed feature learning via a single…

Machine Learning · Statistics 2026-05-25 Behrad Moniri, Hamed Hassani

Order-Optimal Sequential 1-Bit Mean Estimation in General Tail Regimes

In this paper, we study the problem of mean estimation under 1-bit communication constraints. We propose a novel adaptive mean estimator based solely on randomized threshold queries, where each 1-bit outcome indicates whether a given sample…

Machine Learning · Statistics 2026-05-25 Ivan Lau, Jonathan Scarlett

Linear Regression with Unknown Truncation Beyond Gaussian Features

In truncated linear regression, samples $(x,y)$ are shown only when the outcome $y$ falls inside a certain survival set $S^\star$ and the goal is to estimate the unknown $d$-dimensional regressor $w^\star$. This problem has a long history…

Machine Learning · Statistics 2026-05-25 Alexandros Kouridakis, Anay Mehrotra, Alkis Kalavasis, Constantine Caramanis

Online monotone density estimation and log-optimal calibration

We study the problem of online monotone density estimation, where density estimators must be constructed in a predictable manner from sequentially observed data. We propose two online estimators: an online analogue of the classical…

Machine Learning · Statistics 2026-05-25 Rohan Hore, Ruodu Wang, Aaditya Ramdas

Amortized Simulation-Based Inference in Generalized Bayes via Neural Posterior Estimation

Generalized Bayesian Inference (GBI) tempers a loss with a temperature $\beta > 0$ to mitigate overconfidence and improve robustness under model misspecification, but existing GBI methods typically rely on costly MCMC or SDE-based samplers…

Machine Learning · Statistics 2026-05-25 Shiyi Sun, Geoff K. Nicholls, Jeong Eun Lee

Online Partitioned Local Depth for semi-supervised applications

We introduce an extension of the partitioned local depth (PaLD) algorithm that is adapted to online applications such as semi-supervised prediction. PaLD is best known for unsupervised, parameter-free clustering, but its robustness is based…

Machine Learning · Statistics 2026-05-25 John D. Foley, Justin T. Lee

Decomposition-Based Modular Conformal Prediction for Two-Stage Modeling

Conformal prediction offers finite-sample coverage guarantees under minimal assumptions. However, existing methods treat the entire modeling process as a black box, overlooking opportunities to exploit and understand modular structure. We…

Machine Learning · Statistics 2026-05-25 William Zhang, Saurabh Amin, Georgia Perakis

Vecchia-Inducing-Points Full-Scale Approximations for Gaussian Processes

Gaussian processes are flexible, probabilistic, non-parametric models widely used in machine learning and statistics. However, their scalability to large data sets is limited by computational constraints. To overcome these challenges, we…

Machine Learning · Statistics 2026-05-25 Tim Gyger, Reinhard Furrer, Fabio Sigrist

A Tale of Two Cities: Pessimism and Opportunism in Offline Dynamic Pricing

We study offline dynamic pricing when historical data provide incomplete coverage of the price space such that some candidate prices, including the optimal one, may be entirely unobserved. This setting is common in practice and is…

Machine Learning · Statistics 2026-05-25 Zeyu Bian, Lan Wang, Zhengling Qi

A Martingale Kernel Independence Test

The Hilbert-Schmidt Independence Criterion (HSIC) and its joint-independence extension $d\mathrm{HSIC}$ are degenerate $V$-statistics whose data-dependent weighted-$\chi^2$ null limits force a permutation calibration that multiplies the…

Machine Learning · Statistics 2026-05-22 Felix Laumann, Zhaolu Liu, Mauricio Barahona

Do Not Trust The Auctioneer: Learning to Bid in Feedback-Manipulated Auctions

Shilling is the use of artificial bids to make competition appear stronger and push prices upward. We study repeated first-price auctions in which shilling affects feedback but not allocation: the learner wins or loses against the real…

Machine Learning · Statistics 2026-05-22 Luigi Foscari, Matilde Tullii, Vianney Perchet

Departure from Regularity: Degree Heterogeneity and Eigengap as the Structural Drivers of ASE-LSE Latent Subspace Disagreement

Two of the most widely used methods for analysing graph data, Adjacency Spectral Embedding and Laplacian Spectral Embedding, often produce different results when applied to the same network. Yet the structural reasons behind this…

Machine Learning · Statistics 2026-05-22 Minh Triet Pham, Ian Gallagher

From Betting to Empirical Bernstein LIL

This is a verbatim copy of a technical report I wrote in 2017-2018 to obtain the law of the iterated logarithm using the guarantee on the wealth of an online betting strategy.

Machine Learning · Statistics 2026-05-22 Francesco Orabona

Uniform-in-Time Weak Propagation-of-Chaos in Shallow Neural Networks

We consider one-hidden layer neural networks trained in the feature-learning regime using gradient descent, and relate the output of the finite-width network $f_{\hat{\rho}_t^m}$ to its infinite-width counterpart $f_{\rho_t^{MF}}$, which…

Machine Learning · Statistics 2026-05-22 Margalit Glasgow, Joan Bruna

Support-aware offline policy selection for advertising marketplaces

Logged advertising auctions make offline reserve-price evaluation attractive but risky. Replay tables can identify policies with large apparent yield gains, yet they can also hide weak threshold support, multiple-comparison effects,…

Machine Learning · Statistics 2026-05-22 Prashant Shekhar, Caroline Howard

Scalable On-Policy Reinforcement Learning via Adaptive Batch Scaling

Conventional wisdom holds that large-batch training is fundamentally incompatible with Reinforcement Learning (RL) - beyond a modest threshold, increasing batch sizes typically yields diminishing returns or performance degradation due to…

Machine Learning · Statistics 2026-05-22 Jongchan Park

Local Covariate Selection for Average Causal Effect Estimation without Pretreatment and Causal Sufficiency Assumptions

We study the problem of selecting covariates for unbiased estimation of the total causal effect. Existing approaches typically rely on global causal structure learning over all variables, or on strong assumptions such as causal sufficiency -…

Machine Learning · Statistics 2026-05-22 Zeyu Liu, Zheng Li, Feng Xie, Yan Zeng, Hao Zhang, Kun Zhang

Adaptive RBF-KAN: A Comparative Evaluation of Dynamic Shape Parameters in Kolmogorov-Arnold Networks

Kolmogorov-Arnold Networks (KANs) approximate multivariate functions using learnable univariate edge functions, typically parameterized by B-spline bases. Although effective, spline-based implementations can be computationally expensive. A…

Machine Learning · Statistics 2026-05-22 Roberto Cavoretto, Alessandra De Rossi, Adeeba Haider, Amir Noorizadegan

Information Processing Capacity of Stationary Physical Systems: Theory, Data-efficient Estimation Methods, and Photonic Demonstration

Physical computing systems provide a promising route toward hardware-native machine learning, but their computational capabilities remain difficult to characterize in a principled, task-independent, and data-efficient way. We extend the…

Machine Learning · Statistics 2026-05-22 Rahul Uma Ramachandran, Serge Massar