Machine Learning - Scifaro

Adaptively-weighted Nearest Neighbors for Matrix Completion

In this technical note, we introduce and analyze AWNN: an adaptively weighted nearest neighbor method for performing matrix completion. Nearest neighbor (NN) methods are widely used in missing data problems across multiple disciplines such…

Machine Learning · Statistics 2025-05-15 Tathagata Sadhukhan, Manit Paul, Raaz Dwivedi

Deep-SITAR: A SITAR-Based Deep Learning Framework for Growth Curve Modeling via Autoencoders

Several approaches have been developed to capture the complexity and nonlinearity of human growth. One widely used approach is the Super Imposition by Translation and Rotation (SITAR) model, which has become popular in studies of adolescent growth.…

Machine Learning · Statistics 2025-05-15 María Alejandra Hernández, Oscar Rodriguez, Dae-Jin Lee

Fairness-aware Bayes optimal functional classification

Algorithmic fairness has become a central topic in machine learning, and mitigating disparities across different subpopulations has emerged as a rapidly growing research area. In this paper, we systematically study the classification of…

Machine Learning · Statistics 2025-05-15 Xiaoyu Hu, Gengyu Xue, Zhenhua Lin, Yi Yu

Optimal Transport-Based Domain Adaptation for Rotated Linear Regression

Optimal Transport (OT) has proven effective for domain adaptation (DA) by aligning distributions across domains with differing statistical properties. Building on the approach of Courty et al. (2016), who mapped source data to the target…

Machine Learning · Statistics 2025-05-15 Brian Britos, Mathias Bourel

Online Learning of Neural Networks

We study online learning of feedforward neural networks with the sign activation function that implement functions from the unit ball in $\mathbb{R}^d$ to a finite label set $\{1, \ldots, Y\}$. First, we characterize a margin condition that…

Machine Learning · Statistics 2025-05-15 Amit Daniely, Idan Mehalel, Elchanan Mossel

Lower Bounds on the MMSE of Adversarially Inferring Sensitive Features

We propose an adversarial evaluation framework for sensitive feature inference based on minimum mean-squared error (MMSE) estimation with a finite sample size and linear predictive models. Our approach establishes theoretical lower bounds…

Machine Learning · Statistics 2025-05-15 Monica Welfert, Nathan Stromberg, Mario Diaz, Lalitha Sankar

Introduction to Machine Learning

This book introduces the mathematical foundations and techniques that lead to the development and analysis of many of the algorithms that are used in machine learning. It starts with an introductory chapter that describes notation used…

Machine Learning · Statistics 2025-05-15 Laurent Younes

Efficient Prior Calibration From Indirect Data

Bayesian inversion is central to the quantification of uncertainty within problems arising from numerous applications in science and engineering. To formulate the approach, four ingredients are required: a forward model mapping the unknown…

Machine Learning · Statistics 2025-05-15 O. Deniz Akyildiz, Mark Girolami, Andrew M. Stuart, Arnaud Vadeboncoeur

Properties of Discrete Sliced Wasserstein Losses

The Sliced Wasserstein (SW) distance has become a popular alternative to the Wasserstein distance for comparing probability measures. Widespread applications include image processing, domain adaptation and generative modelling, where it is…

Machine Learning · Statistics 2025-05-15 Eloi Tanguy, Rémi Flamary, Julie Delon

PCS-UQ: Uncertainty Quantification via the Predictability-Computability-Stability Framework

As machine learning (ML) models are increasingly deployed in high-stakes domains, trustworthy uncertainty quantification (UQ) is critical for ensuring the safety and reliability of these models. Traditional UQ methods rely on specifying a…

Machine Learning · Statistics 2025-05-14 Abhineet Agarwal, Michael Xiao, Rebecca Barter, Omer Ronen, Boyu Fan, Bin Yu

neuralGAM: An R Package for Fitting Generalized Additive Neural Networks

Nowadays, neural networks are considered one of the most effective methods for various tasks such as anomaly detection, computer-aided disease detection, or natural language processing. However, these networks suffer from the "black-box"…

Machine Learning · Statistics 2025-05-14 Ines Ortega-Fernandez, Marta Sestelo

Learning Treatment Allocations with Risk Control Under Partial Identifiability

Learning beneficial treatment allocations for a patient population is an important problem in precision medicine. Many treatments come with adverse side effects that are not commensurable with their potential benefits. Patients who do not…

Machine Learning · Statistics 2025-05-14 Sofia Ek, Dave Zachariah

Diffusion-based supervised learning of generative models for efficient sampling of multimodal distributions

We propose a hybrid generative model for efficient sampling of high-dimensional, multimodal probability distributions for Bayesian inference. Traditional Monte Carlo methods, such as the Metropolis-Hastings and Langevin Monte Carlo sampling…

Machine Learning · Statistics 2025-05-14 Hoang Tran, Zezhong Zhang, Feng Bao, Dan Lu, Guannan Zhang

The Double Descent Behavior in Two Layer Neural Network for Binary Classification

Recent studies have observed a surprising behavior of model test error called the double descent phenomenon, in which increasing model complexity first decreases the test error, after which the error increases and then decreases again. To observe this,…

Machine Learning · Statistics 2025-05-14 Chathurika S Abeykoon, Aleksandr Beknazaryan, Hailin Sang

A Finite Sample Analysis of Distributional TD Learning with Linear Function Approximation

In this paper, we study the finite-sample statistical rates of distributional temporal difference (TD) learning with linear function approximation. The aim of distributional TD learning is to estimate the return distribution of a discounted…

Machine Learning · Statistics 2025-05-14 Yang Peng, Kaicheng Jin, Liangyu Zhang, Zhihua Zhang

Certified Data Removal Under High-dimensional Settings

Machine unlearning focuses on the computationally efficient removal of specific training data from trained models, ensuring that the influence of forgotten data is effectively eliminated without the need for full retraining. Despite…

Machine Learning · Statistics 2025-05-13 Haolin Zou, Arnab Auddy, Yongchan Kwon, Kamiar Rahnama Rad, Arian Maleki

Adaptive, Robust and Scalable Bayesian Filtering for Online Learning

In this thesis, we introduce Bayesian filtering as a principled framework for tackling diverse sequential machine learning problems, including online (continual) learning, prequential (one-step-ahead) forecasting, and contextual bandits. To…

Machine Learning · Statistics 2025-05-13 Gerardo Duran-Martin

A Sparse Bayesian Learning Algorithm for Estimation of Interaction Kernels in Motsch-Tadmor Model

In this paper, we investigate the data-driven identification of asymmetric interaction kernels in the Motsch-Tadmor model based on observed trajectory data. The model under consideration is governed by a class of semilinear evolution…

Machine Learning · Statistics 2025-05-13 Jinchao Feng, Sui Tang

Learning curves theory for hierarchically compositional data with power-law distributed features

Recent theories suggest that Neural Scaling Laws arise whenever the task is linearly decomposed into power-law distributed units. Alternatively, scaling laws also emerge when data exhibit a hierarchically compositional structure, as is…

Machine Learning · Statistics 2025-05-13 Francesco Cagnetta, Hyunmo Kang, Matthieu Wyart

Reverse-BSDE Monte Carlo

Recently, there has been growing interest in generative models based on diffusions, driven by the empirical robustness of these methods in generating high-dimensional photorealistic images and the possibility of using the vast existing…

Machine Learning · Statistics 2025-05-13 Jairon H. N. Batista, Flávio B. Gonçalves, Yuri F. Saporito, Rodrigo S. Targino