Statistics — Scifaro

Dual-Channel Tensor Neural Networks: Finite-Sample Theory and Conformal Structure Selection

Tensor-valued data arise naturally in neuroimaging, genomics, climate science, and spatiotemporal networks, where multilinear dependencies across modes carry information that is destroyed under vectorization. Existing approaches either…

Machine Learning · Statistics 2026-05-20 Elynn Chen, Jiayu Li, Zheshi Zheng, Jian Pei

Learning Interpretable Point-Based Clinical Risk Scores via Direct Optimization

Many clinical risk scores are deployed as additive rules with nonnegative integer points assigned to relevant binary predictive features. These integer weights not only make the score easier to use in practice but also promote sparsity in…

Methodology · Statistics 2026-05-20 Ying Cui, Albert M Li, Vivek Charu, Yeon-Mi Hwang, Tina Hernandez-Boussard, Lu Tian

ldmppr: Location Dependent Marked Point Processes in R

In this article, we present ldmppr, an R package for estimating, evaluating, simulating from, and visualizing location-dependent marked spatial point processes. To date, it has commonly been assumed that the marks associated with…

Computation · Statistics 2026-05-20 Lane Drew, Andee Kaplan

Sparse Latent Class Analysis: Post-Estimation Refinement via Item-level Pseudo-Likelihood

Latent Class Analysis (LCA) is widely used to identify unobserved subgroups in social and behavioural sciences. A long-standing challenge for LCA is the interpretability of the latent classes, due to the high complexity of the estimated…

Methodology · Statistics 2026-05-20 Yuxuan Xu, Lea Kaufmann, Yunxiao Chen, Maria Kateri, Irini Moustaki

Conformal Prediction via Transported Beta Laws

Split conformal prediction provides finite-sample marginal coverage under exchangeability, but this guarantee averages over the random calibration sample. We study instead the law of the calibration-conditional coverage induced by a…

Machine Learning · Statistics 2026-05-20 Thiago R. Ramos, Helton Graziadei, Luben M. C. Cabezas

Causal Inference with Categorical Unobserved Confounder via Mixture Learning

Unobserved confounding is a fundamental challenge for estimating causal effects. To address unobserved confounding, recent literature has turned to two different approaches -- proxy variables and the use of multiple treatments. The first…

Methodology · Statistics 2026-05-20 Aytijhya Saha, Stephen Bates, Devavrat Shah

Markov Chain Decoders Overcome the Heavy-Tail Limitations of Lipschitz Generative Models

Heavy-tailed distributions are prevalent in performance evaluation, network traffic, and risk modeling. This behavior poses a fundamental challenge for modern deep generative models. Standard Variational Autoencoders (VAEs) employ Gaussian…

Machine Learning · Statistics 2026-05-20 Abdelhakim Ziani, Andras Horvath, Paolo Ballarini

Bayesian Latent Space Models for Graphs Are Misspecified: Toward Robust Inference via Generalized Posteriors

Bayesian latent space models offer a principled approach to network representation, but rely on correct specification of both geometry and link function. Real-world networks often violate these assumptions, exhibiting geometric mismatch and…

Machine Learning · Statistics 2026-05-20 Aldric Labarthe

A Tutorial on Symbolic Structural Identifiability Analysis of ODE Models in Julia

Structural identifiability analysis determines whether the parameters of a mechanistic ordinary differential equation (ODE) model can be uniquely recovered from ideal observations and is therefore a fundamental prerequisite for reliable…

Methodology · Statistics 2026-05-20 Abdallah Alsammani

Heavy Tails and Predictive Ability Testing

We study the asymptotic behaviour of widely used tests for evaluating and comparing predictive accuracy when forecast errors exhibit heavy tails. In particular, when loss differentials have infinite variance, the Diebold-Mariano test…

Methodology · Statistics 2026-05-20 Jonas F. Frederiksen, Muneya Matsui, Rasmus S. Pedersen

Building a GPU-Accelerated Multivariate Statistics Platform

Classical multivariate statistical methods such as covariance estimation and principal component analysis are well understood mathematically, yet their application at extreme data scales remains challenging. When the number of observations…

Computation · Statistics 2026-05-20 Mike Crowhurst

From design of experiments to analysis of variance of multivariate data: a tutorial review on ANOVA simultaneous component analysis

ANOVA Simultaneous Component Analysis (ASCA) is the current state-of-the-art chemometric tool for analyzing and interpreting high-dimensional experimental data from a Design of Experiment (DoE). Being a multivariate extension of the ANOVA,…

Methodology · Statistics 2026-05-20 José Camacho, Jokin Ezenarro, Daniel Schorn-García, Johan A. Westerhuis

Implementation and Workflows for INLA-Based Approximate Bayesian Structural Equation Modelling

Bayesian structural equation modelling (BSEM) offers many advantages such as principled uncertainty quantification, small-sample regularisation, and flexible model specification. However, the Markov chain Monte Carlo (MCMC) methods on which…

Computation · Statistics 2026-05-20 Haziq Jamil, Håvard Rue

Approximate Bayesian Inference for Structural Equation Models using Integrated Nested Laplace Approximations

Markov chain Monte Carlo (MCMC) methods remain the mainstay of Bayesian estimation of structural equation models (SEM), though they often incur a high computational cost. We present a bespoke approximate Bayesian approach to SEM, drawing on…

Methodology · Statistics 2026-05-20 Haziq Jamil, Håvard Rue

Neural Network Models for Contextual Regression

We propose a neural network model for contextual regression in which the regression model depends on contextual features that determine the active submodel and an algorithm to fit the model. The proposed simple contextual neural network…

Machine Learning · Statistics 2026-05-20 Seksan Kiatsupaibul, Pakawan Chansiripas

Bayesian Symbolic Regression for Missing Physics

Model-based approaches for (bio)process systems often suffer from incomplete knowledge of the underlying physical, chemical, or biological laws. Universal differential equations, which embed neural networks within differential equations,…

Machine Learning · Statistics 2026-05-20 Arno Strouwen

TEA-Time: Transporting Effects Across Time

Treatment effects estimated from a randomized controlled trial are local not only to the study population but also to the time at which the trial was conducted. The literature on generalizing experimental findings to new populations is…

Methodology · Statistics 2026-05-20 Harsh Parikh, Gabriel Levin-Konigsberg, Dominique Perrault-Joncas, Alexander Volfovsky

Stochastic Gradient Variational Inference with Price's Gradient Estimator from Bures-Wasserstein to Parameter Space

For approximating a target distribution given only its unnormalized log-density, stochastic gradient-based variational inference (VI) algorithms are a popular approach. For example, Wasserstein VI (WVI) and black-box VI (BBVI) perform…

Machine Learning · Statistics 2026-05-20 Kyurae Kim, Qiang Fu, Yi-An Ma, Jacob R. Gardner, Trevor Campbell

Optimal information deletion and Bayes' theorem

Arnold Zellner published a seminal paper on Bayes' theorem as an optimal information processing rule, a result that led to the variational formulation of Bayes' theorem, and a central idea in generalized variational inference. Almost 40…

Methodology · Statistics 2026-05-20 Hans Montcho, Håvard Rue

Efficient and Minimax Optimal In-context Nonparametric Regression with Transformers

We study in-context learning for nonparametric regression with $\alpha$-Hölder smooth regression functions, for some $\alpha>0$. We prove that, with $n$ in-context examples and $d$-dimensional regression covariates, a pretrained…

Machine Learning · Statistics 2026-05-20 Michelle Ching, Ioana Popescu, Nico Smith, Tianyi Ma, William G. Underwood, Richard J. Samworth