Machine Learning - Scifaro

Positivity sets of hinge functions

In this paper we investigate which subsets of the real plane are realisable as the set of points on which a one-layer ReLU neural network takes a positive value. In the case of cones we give a full characterisation of such sets.…

Machine Learning · Statistics 2025-03-19 Josef Schicho, Ayush Kumar Tewari, Audie Warren

Clustering Items through Bandit Feedback: Finding the Right Feature out of Many

We study the problem of clustering a set of items based on bandit feedback. Each of the $n$ items is characterized by a feature vector, with a possibly large dimension $d$. The items are partitioned into two unknown groups such that items…

Machine Learning · Statistics 2025-03-19 Maximilian Graf, Victor Thuot, Nicolas Verzelen

Nyström Kernel Stein Discrepancy

Kernel methods underpin many of the most successful approaches in data science and statistics, and they allow representing probability measures as elements of a reproducing kernel Hilbert space without loss of information. Recently, the…

Machine Learning · Statistics 2025-03-19 Florian Kalinke, Zoltan Szabo, Bharath K. Sriperumbudur

A Conditional Independence Test in the Presence of Discretization

Testing conditional independence has many applications, such as in Bayesian network learning and causal discovery. Different test methods have been proposed. However, existing methods generally cannot work when only discretized…

Machine Learning · Statistics 2025-03-19 Boyang Sun, Yu Yao, Guang-Yuan Hao, Yumou Qiu, Kun Zhang

Semidefinite programming relaxations and debiasing for MAXCUT-based clustering

In this paper, we consider the problem of partitioning a small data sample of size $n$ drawn from a mixture of 2 sub-gaussian distributions in $\mathbb{R}^p$. We consider semidefinite programming relaxations of an integer quadratic program that is…

Machine Learning · Statistics 2025-03-19 Shuheng Zhou

Kernel Single Proxy Control for Deterministic Confounding

We consider the problem of causal effect estimation with an unobserved confounder, where we observe a single proxy variable that is associated with the confounder. Although it has been shown that the recovery of an average causal effect is…

Machine Learning · Statistics 2025-03-19 Liyuan Xu, Arthur Gretton

Unfair Utilities and First Steps Towards Improving Them

Many fairness criteria constrain the policy or choice of predictors, which can have unwanted consequences, in particular, when optimizing the policy under such constraints. Here, we advocate to instead focus on the utility function the…

Machine Learning · Statistics 2025-03-19 Frederik Hytting Jørgensen, Sebastian Weichwald, Jonas Peters

Do you understand epistemic uncertainty? Think again! Rigorous frequentist epistemic uncertainty estimation in regression

Quantifying model uncertainty is critical for understanding prediction reliability, yet distinguishing between aleatoric and epistemic uncertainty remains challenging. We extend recent work from classification to regression to provide a…

Machine Learning · Statistics 2025-03-18 Enrico Foglia, Benjamin Bobbia, Nikita Durasov, Michael Bauerheim, Pascal Fua, Stephane Moreau, Thierry Jardin

Edgeworth Expansion for Semi-hard Triplet Loss

We develop a higher-order asymptotic analysis for the semi-hard triplet loss using the Edgeworth expansion. It is known that this loss function enforces that embeddings of similar samples are close while those of dissimilar samples are…

Machine Learning · Statistics 2025-03-18 Masanari Kimura

Nonlinear Principal Component Analysis with Random Bernoulli Features for Process Monitoring

The process generates substantial amounts of data with highly complex structures, leading to the development of numerous nonlinear statistical methods. However, most of these methods rely on computations involving large-scale dense kernel…

Machine Learning · Statistics 2025-03-18 Ke Chen, Dandan Jiang

Support Collapse of Deep Gaussian Processes with Polynomial Kernels for a Wide Regime of Hyperparameters

We analyze the prior that a Deep Gaussian Process with polynomial kernels induces. We observe that, even for relatively small depths, averaging effects occur within such a Deep Gaussian Process and that the prior can be analyzed and…

Machine Learning · Statistics 2025-03-18 Daryna Chernobrovkina, Steffen Grünewälder

Bayes and Biased Estimators Without Hyper-parameter Estimation: Comparable Performance to the Empirical-Bayes-Based Regularized Estimator

Regularized system identification has become a significant complement to more classical system identification. It has been numerically shown that kernel-based regularized estimators often perform better than the maximum likelihood estimator…

Machine Learning · Statistics 2025-03-18 Yue Ju, Bo Wahlberg, Håkan Hjalmarsson

Ranking and Selection with Simultaneous Input Data Collection

In this paper, we propose a general and novel formulation of ranking and selection with the existence of streaming input data. The collection of multiple streams of such data may consume different types of resources, and hence can be…

Machine Learning · Statistics 2025-03-18 Yuhao Wang, Enlu Zhou

Optimizing Multi-Scale Representations to Detect Effect Heterogeneity Using Earth Observation and Computer Vision: Applications to Two Anti-Poverty RCTs

Earth Observation (EO) data are increasingly used in policy analysis by enabling granular estimation of conditional average treatment effects (CATE). However, a challenge in EO-based causal inference is determining the scale of the input…

Machine Learning · Statistics 2025-03-18 Fucheng Warren Zhu, Connor T. Jerzak, Adel Daoud

All or None: Identifiable Linear Properties of Next-token Predictors in Language Modeling

We analyze identifiability as a possible explanation for the ubiquity of linear properties across language models, such as the vector difference between the representations of "easy" and "easiest" being parallel to that between "lucky" and…

Machine Learning · Statistics 2025-03-18 Emanuele Marconato, Sébastien Lachapelle, Sebastian Weichwald, Luigi Gresele

Entry-Specific Matrix Estimation under Arbitrary Sampling Patterns through the Lens of Network Flows

Matrix completion tackles the task of predicting missing values in a low-rank matrix based on a sparse set of observed entries. It is often assumed that the observation pattern is generated uniformly at random or has a very specific…

Machine Learning · Statistics 2025-03-18 Yudong Chen, Xumei Xi, Christina Lee Yu

Optimal Kernel Quantile Learning with Random Features

The random feature (RF) approach is a well-established and efficient tool for scalable kernel methods, but existing literature has primarily focused on kernel ridge regression with random features (KRR-RF), which has limitations in handling…

Machine Learning · Statistics 2025-03-18 Caixing Wang, Xingdong Feng

The Collusion of Memory and Nonlinearity in Stochastic Approximation With Constant Stepsize

In this work, we investigate stochastic approximation (SA) with Markovian data and nonlinear updates under constant stepsize $\alpha>0$. Existing work has primarily focused on either i.i.d. data or linear update rules. We take a new…

Machine Learning · Statistics 2025-03-18 Dongyan Huo, Yixuan Zhang, Yudong Chen, Qiaomin Xie

Alpha-divergence loss function for neural density ratio estimation

Density ratio estimation (DRE) is a fundamental machine learning technique for capturing relationships between two probability distributions. State-of-the-art DRE methods estimate the density ratio using neural networks trained with loss…

Machine Learning · Statistics 2025-03-18 Yoshiaki Kitazawa

Nyström $M$-Hilbert-Schmidt Independence Criterion

Kernel techniques are among the most popular and powerful approaches of data science. Among the key features that make kernels ubiquitous are (i) the number of domains they have been designed for, (ii) the Hilbert structure of the function…

Machine Learning · Statistics 2025-03-18 Florian Kalinke, Zoltán Szabó