Machine Learning - Scifaro

Efficient Approximation to Analytic and $L^p$ functions by Height-Augmented ReLU Networks

This work addresses two fundamental limitations in neural network approximation theory. We demonstrate that a three-dimensional network architecture enables a significantly more efficient representation of sawtooth functions, which serves…

Machine Learning · Statistics 2026-03-13 ZeYu Li, FengLei Fan, TieYong Zeng

Co-Diffusion: An Affinity-Aware Two-Stage Latent Diffusion Framework for Generalizable Drug-Target Affinity Prediction

Predicting drug-target affinity is fundamental to virtual screening and lead optimization. However, existing deep models often suffer from representation collapse in stringent cold-start regimes, where the scarcity of labels and domain…

Machine Learning · Statistics 2026-03-13 Yining Qian, Pengjie Wang, Yixiao Li, An-Yang Lu, Cheng Tan, Shuang Li, Lijun Liu

Micro-Diffusion Compression - Binary Tree Tweedie Denoising for Online Probability Estimation

We present Midicoth, a lossless compression system that introduces a micro-diffusion denoising layer for improving probability estimates produced by adaptive statistical models. In compressors such as Prediction by Partial Matching (PPM),…

Machine Learning · Statistics 2026-03-13 Roberto Tacconelli

Semantics-Aware Caching for Concept Learning

Concept learning is a form of supervised machine learning that operates on knowledge bases in description logics. State-of-the-art concept learners often rely on an iterative search through a countably infinite concept space. In each…

Machine Learning · Statistics 2026-03-13 Louis Mozart Kamdem Teyou, Caglar Demir, Axel-Cyrille Ngonga Ngomo

Refereed Learning

We initiate an investigation of learning tasks in a setting where the learner is given access to two competing provers, only one of which is honest. Specifically, we consider the power of such learners in assessing purported properties of…

Machine Learning · Statistics 2026-03-13 Ran Canetti, Ephraim Linder, Connor Wagaman

Distribution estimation via Flow Matching with Lipschitz guarantees

Flow Matching, a promising approach in generative modeling, has recently gained popularity. Relying on ordinary differential equations, it offers a simple and flexible alternative to diffusion models, which are currently the…

Machine Learning · Statistics 2026-03-13 Lea Kunkel

Weighted Random Dot Product Graphs

Modeling of intricate relational patterns has become a cornerstone of contemporary statistical research and related data science fields. Networks, represented as graphs, offer a natural framework for this analysis. This paper extends the…

Machine Learning · Statistics 2026-03-13 Bernardo Marenco, Paola Bermolen, Marcelo Fiori, Federico Larroca, Gonzalo Mateos

Bounds on Representation-Induced Confounding Bias for Treatment Effect Estimation

State-of-the-art methods for conditional average treatment effect (CATE) estimation make widespread use of representation learning. Here, the idea is to reduce the variance of the low-sample CATE estimation by a (potentially constrained)…

Machine Learning · Statistics 2026-03-13 Valentyn Melnychuk, Dennis Frauen, Stefan Feuerriegel

ReTabSyn: Realistic Tabular Data Synthesis via Reinforcement Learning

Deep generative models can help with data scarcity and privacy by producing synthetic training data, but they struggle in low-data, imbalanced tabular settings to fully learn the complex data distribution. We argue that striving for the…

Machine Learning · Statistics 2026-03-12 Xiaofeng Lin, Seungbae Kim, Zhuoya Li, Zachary DeSoto, Charles Fleming, Guang Cheng

Brenier Isotonic Regression

Isotonic regression (IR) is shape-constrained regression to maintain a univariate fitting curve non-decreasing, which has numerous applications including single-index models and probability calibration. When it comes to multi-output…

Machine Learning · Statistics 2026-03-12 Han Bao, Amirreza Eshraghi, Yutong Wang

Adaptive Active Learning for Regression via Reinforcement Learning

Active learning for regression reduces labeling costs by selecting the most informative samples. Improved Greedy Sampling is a prominent method that balances feature-space diversity and output-space uncertainty using a static,…

Machine Learning · Statistics 2026-03-12 Simon D. Nguyen, Troy Russo, Kentaro Hoffman, Tyler H. McCormick

On The Complexity of Best-Arm Identification in Non-Stationary Linear Bandits

We study the fixed-budget best-arm identification (BAI) problem in non-stationary linear bandits. Concretely, given a fixed time budget $T\in \mathbb{N}$, finite arm set $\mathcal{X} \subset \mathbb{R}^d$, and a potentially adversarial…

Machine Learning · Statistics 2026-03-12 Leo Maynard-Zhang, Zhihan Xiong, Kevin Jamieson, Maryam Fazel

MultiwayPAM: Multiway Partitioning Around Medoids for LLM-as-a-Judge Score Analysis

LLM-as-a-Judge is a flexible framework for text evaluation, which allows us to obtain scores for the quality of a given text from various perspectives by changing the prompt template. Two main challenges in using LLM-as-a-Judge are…

Machine Learning · Statistics 2026-03-12 Chihiro Watanabe, Jingyu Sun

A Diffusion Analysis of Policy Gradient for Stochastic Bandits

We study a continuous-time diffusion approximation of policy gradient for $k$-armed stochastic bandits. We prove that with a learning rate $\eta = O(\Delta^2/\log(n))$ the regret is $O(k \log(k) \log(n) / \eta)$ where $n$ is the horizon and…

Machine Learning · Statistics 2026-03-12 Tor Lattimore

Stability and Robustness via Regularization: Bandit Inference via Regularized Stochastic Mirror Descent

Statistical inference with bandit data presents fundamental challenges due to adaptive sampling, which violates the independence assumptions underlying classical asymptotic theory. Recent work has identified stability as a sufficient…

Machine Learning · Statistics 2026-03-12 Budhaditya Halder, Ishan Sengupta, Koustav Chowdhury, Koulik Khamaru

A Bandit-Based Approach to Educational Recommender Systems: Contextual Thompson Sampling for Learner Skill Gain Optimization

In recent years, instructional practices in Operations Research (OR), Management Science (MS), and Analytics have increasingly shifted toward digital environments, where large and diverse groups of learners make it difficult to provide…

Machine Learning · Statistics 2026-03-12 Lukas De Kerpel, Arthur Thuy, Dries F. Benoit

Error Analysis of Bayesian Inverse Problems with Generative Priors

Data-driven methods for the solution of inverse problems have become widely popular in recent years thanks to the rise of machine learning techniques. A popular approach concerns the training of a generative model on additional data to…

Machine Learning · Statistics 2026-03-12 Bamdad Hosseini, Ziqi Huang

Maximum Risk Minimization with Random Forests

We consider a regression setting where observations are collected in different environments modeled by different data distributions. The field of out-of-distribution (OOD) generalization aims to design methods that generalize better to test…

Machine Learning · Statistics 2026-03-12 Francesco Freni, Anya Fries, Linus Kühne, Markus Reichstein, Jonas Peters

Empirical PAC-Bayes Bounds for Markov Chains

The core of generalization theory was developed for independent observations. Some PAC and PAC-Bayes bounds are available for data that exhibit a temporal dependence. However, there are constants in these bounds that depend on properties of…

Machine Learning · Statistics 2026-03-12 Vahe Karagulyan, Pierre Alquier

Offline Dynamic Inventory and Pricing Strategy: Addressing Censored and Dependent Demand

In this paper, we study the offline sequential feature-based pricing and inventory control problem where the current demand depends on the past demand levels and any demand exceeding the available inventory is lost. Our goal is to leverage…

Machine Learning · Statistics 2026-03-12 Korel Gundem, Zhengling Qi