Machine Learning - Scifaro

Modeling Spatio-temporal Extremes via Conditional Variational Autoencoders

Extreme weather events are widely studied in fields such as agriculture, ecology, and meteorology. The spatio-temporal co-occurrence of extreme events can strengthen or weaken under changing climate conditions. In this paper, we propose a…

Machine Learning · Statistics 2025-12-09 Xiaoyu Ma, Likun Zhang, Christopher K. Wikle

In-Context Learning Is Provably Bayesian Inference: A Generalization Theory for Meta-Learning

This paper develops a finite-sample statistical theory for in-context learning (ICL), analyzed within a meta-learning framework that accommodates mixtures of diverse task types. We introduce a principled risk decomposition that separates…

Machine Learning · Statistics 2025-12-09 Tomoya Wakayama, Taiji Suzuki

Model Monitoring: A General Framework with an Application to Non-life Insurance Pricing

Maintaining the predictive performance of pricing models is challenging when insurance portfolios and data-generating mechanisms evolve over time. Focusing on non-life insurance, we adopt the concept-drift terminology from machine learning…

Machine Learning · Statistics 2025-12-09 Alexej Brauer, Paul Menzel, Mario V. Wüthrich

Deep Hedging Under Non-Convexity: Limitations and a Case for AlphaZero

This paper examines replication portfolio construction in incomplete markets - a key problem in financial engineering with applications in pricing, hedging, balance sheet management, and energy storage planning. We model this as a…

Machine Learning · Statistics 2025-12-09 Matteo Maggiolo, Giuseppe Nuti, Miroslav Štrupl, Oleg Szehr

Alpha-VI DeepONet: A prior-robust variational Bayesian approach for enhancing DeepONets with uncertainty quantification

We introduce a novel deep operator network (DeepONet) framework that incorporates generalised variational inference (GVI) using Rényi's $\alpha$-divergence to learn complex operators while quantifying uncertainty. By incorporating…

Machine Learning · Statistics 2025-12-09 Soban Nasir Lone, Subhayan De, Rajdip Nayek

Consequences of Kernel Regularity for Bandit Optimization

In this work we investigate the relationship between kernel regularity and algorithmic performance in the bandit optimization of RKHS functions. While reproducing kernel Hilbert space (RKHS) methods traditionally rely on global kernel…

Machine Learning · Statistics 2025-12-08 Madison Lee, Tara Javidi

BalLOT: Balanced $k$-means clustering with optimal transport

We consider the fundamental problem of balanced $k$-means clustering. In particular, we introduce an optimal transport approach to alternating minimization called BalLOT, and we show that it delivers a fast and effective solution to this…

Machine Learning · Statistics 2025-12-08 Wenyan Luo, Dustin G. Mixon

Design-marginal calibration of Gaussian process predictive distributions: Bayesian and conformal approaches

We study the calibration of Gaussian process (GP) predictive distributions in the interpolation setting from a design-marginal perspective. Conditioning on the data and averaging over a design measure $\mu$, we formalize $\mu$-coverage for…

Machine Learning · Statistics 2025-12-08 Aurélien Pion, Emmanuel Vazquez

Do We Really Even Need Data? A Modern Look at Drawing Inference with Predicted Data

As artificial intelligence and machine learning tools become more accessible, and scientists face new obstacles to data collection (e.g., rising costs, declining survey response rates), researchers increasingly use predictions from…

Machine Learning · Statistics 2025-12-08 Stephen Salerno, Kentaro Hoffman, Awan Afiaz, Anna Neufeld, Tyler H. McCormick, Jeffrey T. Leek

Symmetric Linear Dynamical Systems are Learnable from Few Observations

We consider the problem of learning the parameters of an $N$-dimensional stochastic linear dynamics under both full and partial observations from a single trajectory of time $T$. We introduce and analyze a new estimator that achieves a small…

Machine Learning · Statistics 2025-12-08 Minh Vu, Andrey Y. Lokhov, Marc Vuffray

How to Tame Your LLM: Semantic Collapse in Continuous Systems

We develop a general theory of semantic dynamics for large language models by formalizing them as Continuous State Machines (CSMs): smooth dynamical systems whose latent manifolds evolve under probabilistic transition operators. The…

Machine Learning · Statistics 2025-12-08 C. M. Wyss

CausalKANs: interpretable treatment effect estimation with Kolmogorov-Arnold networks

Deep neural networks achieve state-of-the-art performance in estimating heterogeneous treatment effects, but their opacity limits trust and adoption in sensitive domains such as medicine, economics, and public policy. Building on…

Machine Learning · Statistics 2025-12-08 Alejandro Almodóvar, Patricia A. Apellániz, Santiago Zazo, Juan Parras

Variational Uncertainty Decomposition for In-Context Learning

As large language models (LLMs) gain popularity in conducting prediction tasks in-context, understanding the sources of uncertainty in in-context learning becomes essential to ensuring reliability. The recent hypothesis of in-context…

Machine Learning · Statistics 2025-12-08 I. Shavindra Jayasekera, Jacob Si, Filippo Valdettaro, Wenlong Chen, A. Aldo Faisal, Yingzhen Li

Balancing Performance and Costs in Best Arm Identification

We consider the problem of identifying the best arm in a multi-armed bandit model. Despite a wealth of literature in the traditional fixed budget and fixed confidence regimes of the best arm identification problem, it still remains a…

Machine Learning · Statistics 2025-12-08 Michael O. Harding, Kirthevasan Kandasamy

Foundations of Diffusion Models in General State Spaces: A Self-Contained Introduction

Although diffusion models now occupy a central place in generative modeling, introductory treatments commonly assume Euclidean data and seldom clarify their connection to discrete-state analogues. This article is a self-contained primer on…

Machine Learning · Statistics 2025-12-05 Vincent Pauline, Tobias Höppe, Kirill Neklyudov, Alexander Tong, Stefan Bauer, Andrea Dittadi

Towards a unified framework for guided diffusion models

Guided or controlled data generation with diffusion models has become a cornerstone of modern… [Footnote: partial preliminary results of this work appeared at the International Conference on Machine Learning 2025.]

Machine Learning · Statistics 2025-12-05 Yuchen Jiao, Yuxin Chen, Gen Li

Learning Causality for Longitudinal Data

This thesis develops methods for causal inference and causal representation learning (CRL) in high-dimensional, time-varying data. The first contribution introduces the Causal Dynamic Variational Autoencoder (CDVAE), a model for estimating…

Machine Learning · Statistics 2025-12-05 Mouad EL Bouchattaoui

Adaptive Kernel Selection for Stein Variational Gradient Descent

A central challenge in Bayesian inference is efficiently approximating posterior distributions. Stein Variational Gradient Descent (SVGD) is a popular variational inference method which transports a set of particles to approximate a target…

Machine Learning · Statistics 2025-12-05 Moritz Melcher, Simon Weissmann, Ashia C. Wilson, Jakob Zech

IndiSeek learns information-guided disentangled representations

Learning disentangled representations is a fundamental task in multi-modal learning. In modern applications such as single-cell multi-omics, both shared and modality-specific features are critical for characterizing cell states and…

Machine Learning · Statistics 2025-12-05 Yu Gui, Cong Ma, Zongming Ma

Near-Optimal Experiment Design in Linear non-Gaussian Cyclic Models

We study the problem of causal structure learning from a combination of observational and interventional data generated by a linear non-Gaussian structural equation model that might contain cycles. Recent results show that using mere…

Machine Learning · Statistics 2025-12-05 Ehsan Sharifian, Saber Salehkaleybar, Negar Kiyavash