Machine Learning - Scifaro

Policy Learning with Competing Agents

Decision makers often aim to learn a treatment assignment policy under a capacity constraint on the number of agents that they can treat. When agents can respond strategically to such policies, competition arises, complicating estimation of…

Machine Learning · Statistics 2025-03-31 Roshni Sahoo, Stefan Wager

Nonlinear Multiple Response Regression and Learning of Latent Spaces

Identifying low-dimensional latent structures within high-dimensional data has long been a central topic in the machine learning community, driven by the need for data compression, storage, transmission, and deeper data understanding.…

Machine Learning · Statistics 2025-03-28 Ye Tian, Sanyou Wu, Long Feng

Probabilistic Functional Neural Networks

High-dimensional functional time series (HDFTS) are often characterized by nonlinear trends and high spatial dimensions. Such data poses unique challenges for modeling and forecasting due to the nonlinearity, nonstationarity, and high…

Machine Learning · Statistics 2025-03-28 Haixu Wang, Jiguo Cao

Bayesian Pseudo Posterior Mechanism for Differentially Private Machine Learning

Differential privacy (DP) is becoming increasingly important for deployed machine learning applications because it provides strong guarantees for protecting the privacy of individuals whose data is used to train models. However, DP…

Machine Learning · Statistics 2025-03-28 Robert Chew, Matthew R. Williams, Elan A. Segarra, Alexander J. Preiss, Amanda Konet, Terrance D. Savitsky

Squared families: Searching beyond regular probability models

We introduce squared families, which are families of probability densities obtained by squaring a linear transformation of a statistic. Squared families are singular; however, their singularity can easily be handled so that they form regular…

Machine Learning · Statistics 2025-03-28 Russell Tsuchida, Jiawei Liu, Cheng Soon Ong, Dino Sejdinovic

Fundamental Safety-Capability Trade-offs in Fine-tuning Large Language Models

Fine-tuning Large Language Models (LLMs) on some task-specific datasets has been a primary use of LLMs. However, it has been empirically observed that this approach to enhancing capability inevitably compromises safety, a phenomenon also…

Machine Learning · Statistics 2025-03-28 Pin-Yu Chen, Han Shen, Payel Das, Tianyi Chen

Robust Feature Learning for Multi-Index Models in High Dimensions

Recently, there have been numerous studies on feature learning with neural networks, specifically on learning single- and multi-index models where the target is a function of a low-dimensional projection of the input. Prior works have shown…

Machine Learning · Statistics 2025-03-28 Alireza Mousavi-Hosseini, Adel Javanmard, Murat A. Erdogdu

Learning Multi-Index Models with Neural Networks via Mean-Field Langevin Dynamics

We study the problem of learning multi-index models in high dimensions using a two-layer neural network trained with the mean-field Langevin algorithm. Under mild distributional assumptions on the data, we characterize the effective…

Machine Learning · Statistics 2025-03-28 Alireza Mousavi-Hosseini, Denny Wu, Murat A. Erdogdu

Multinomial belief networks for healthcare data

Healthcare data from patient or population cohorts are often characterized by sparsity, high missingness and relatively small sample sizes. In addition, being able to quantify uncertainty is often important in a medical context. To address…

Machine Learning · Statistics 2025-03-28 H. C. Donker, D. Neijzen, J. de Jong, G. A. Lunter

Continual learning via probabilistic exchangeable sequence modelling

Continual learning (CL) refers to the ability to continuously learn and accumulate new knowledge while retaining useful information from past experiences. Although numerous CL methods have been proposed in recent years, it is not…

Machine Learning · Statistics 2025-03-27 Hanwen Xing, Christopher Yau

Regression-Based Estimation of Causal Effects in the Presence of Selection Bias and Confounding

We consider the problem of estimating the expected causal effect $E[Y|do(X)]$ for a target variable $Y$ when treatment $X$ is set by intervention, focusing on continuous random variables. In settings without selection bias or confounding,…

Machine Learning · Statistics 2025-03-27 Marlies Hafer, Alexander Marx

An $(\epsilon,\delta)$-accurate level set estimation with a stopping criterion

The level set estimation problem seeks to identify regions within a set of candidate points where an unknown and costly-to-evaluate function's value exceeds a specified threshold, providing an efficient alternative to exhaustive evaluations…

Machine Learning · Statistics 2025-03-27 Hideaki Ishibashi, Kota Matsui, Kentaro Kutsukake, Hideitsu Hino

On the Robustness of Kernel Ridge Regression Using the Cauchy Loss Function

Robust regression aims to develop methods for estimating an unknown regression function in the presence of outliers, heavy-tailed distributions, or contaminated data, which can severely impact performance. Most existing theoretical results…

Machine Learning · Statistics 2025-03-27 Hongwei Wen, Annika Betken, Wouter Koolen

Valid Conformal Prediction for Dynamic GNNs

Dynamic graphs provide a flexible data abstraction for modelling many sorts of real-world systems, such as transport, trade, and social networks. Graph neural networks (GNNs) are powerful tools allowing for different kinds of prediction and…

Machine Learning · Statistics 2025-03-27 Ed Davis, Ian Gallagher, Daniel John Lawson, Patrick Rubin-Delanchy

Which Spatial Partition Trees are Adaptive to Intrinsic Dimension?

Recent theory work has found that a special type of spatial partition tree - called a random projection tree - is adaptive to the intrinsic dimension of the data from which it is built. Here we examine this same question, with a combination…

Machine Learning · Statistics 2025-03-27 Nakul Verma, Samory Kpotufe, Sanjoy Dasgupta

Interpretable Deep Regression Models with Interval-Censored Failure Time Data

Deep neural networks (DNNs) have become powerful tools for modeling complex data structures through sequentially integrating simple functions in each hidden layer. In survival analysis, recent advances of DNNs primarily focus on enhancing…

Machine Learning · Statistics 2025-03-26 Changhui Yuan, Shishun Zhao, Shuwei Li, Xinyuan Song, Zhao Chen

Causal Bayesian Optimization with Unknown Graphs

Causal Bayesian Optimization (CBO) is a methodology designed to optimize an outcome variable by leveraging known causal relationships through targeted interventions. Traditional CBO methods require a fully and accurately specified causal…

Machine Learning · Statistics 2025-03-26 Jean Durand, Yashas Annadani, Stefan Bauer, Sonali Parbhoo

AutoBayes: A Compositional Framework for Generalized Variational Inference

We introduce a new compositional framework for generalized variational inference, clarifying the different parts of a model, how they interact, and how they compose. We explain that both exact Bayesian inference and the loss functions…

Machine Learning · Statistics 2025-03-26 Toby St Clere Smithe, Marco Perin

Locally Private Nonparametric Contextual Multi-armed Bandits

Motivated by privacy concerns in sequential decision-making on sensitive data, we address the challenge of nonparametric contextual multi-armed bandits (MAB) under local differential privacy (LDP). We develop a uniform-confidence-bound-type…

Machine Learning · Statistics 2025-03-26 Yuheng Ma, Feiyu Jiang, Zifeng Zhao, Hanfang Yang, Yi Yu

Probabilistic Shielding for Safe Reinforcement Learning

In real-life scenarios, a Reinforcement Learning (RL) agent aiming to maximise their reward must often also behave in a safe manner, including at training time. Thus, much attention in recent years has been given to Safe RL, where an agent…

Machine Learning · Statistics 2025-03-26 Edwin Hamel-De le Court, Francesco Belardinelli, Alexander W. Goodall