Related papers: Error Scaling Laws for Linear Optimal Estimation f…
We consider the problem of estimating how well a model class is capable of fitting a distribution of labeled data. We show that it is often possible to accurately estimate this "learnability" even when given an amount of data that is too…
Empirically, large-scale deep learning models often satisfy a neural scaling law: the test error of the trained model improves polynomially as the model size and data size grow. However, conventional wisdom suggests the test error consists…
For the sparse vector model, we consider estimation of the target vector, of its L2-norm and of the noise variance. We construct adaptive estimators and establish the optimal rates of adaptive estimation when adaptation is considered with…
This paper studies the problem of estimation from relative measurements in a graph, in which a vector indexed over the nodes has to be reconstructed from pairwise measurements of differences between its components associated to nodes…
In this paper, we consider a statistical problem of learning a linear model from noisy samples. Existing work has focused on approximating the least squares solution by using leverage-based scores as an importance sampling distribution.…
We study sparse linear regression over a network of agents, modeled as an undirected graph (with no centralized node). The estimation problem is formulated as the minimization of the sum of the local LASSO loss functions plus a quadratic…
Recovery of the sparsity pattern (or support) of an unknown sparse vector from a small number of noisy linear measurements is an important problem in compressed sensing. In this paper, the high-dimensional setting is considered. It is shown…
Label distribution learning (LDL) trains a model to predict the relevance of a set of labels (called label distribution (LD)) to an instance. The previous LDL methods all assumed the LDs of the training instances are accurate. However,…
It is of particular interest to reconstruct or estimate bandlimited graph signals, which are smoothly varying signals defined over graphs, from partial noisy measurements. However, choosing an optimal subset of nodes to sample is NP-hard.…
Large genomic and imaging datasets can be used to train models that learn meaningful representations of cellular systems. Across domains, model performance improves predictably with dataset size and compute budget, providing a basis for…
We develop a framework that we call compressive rate estimation. We assume that the composite channel gain matrix (i.e. the matrix of all channel gains between all network nodes) is compressible which means it can be approximated by a…
Deep graph models (e.g., graph neural networks and graph transformers) have become important techniques for leveraging knowledge across various types of graphs. Yet, the neural scaling laws on graphs, i.e., how the performance of deep graph…
We investigate the estimation of the perimeter of a set by a graph cut of a random geometric graph. For $\Omega \subset D = (0,1)^d$, with $d \geq 2$, we are given $n$ random i.i.d. points on $D$ whose membership in $\Omega$ is known. We…
We consider unbiased estimation of a sparse nonrandom vector corrupted by additive white Gaussian noise. We show that while there are infinitely many unbiased estimators for this problem, none of them has uniformly minimum variance.…
We consider the problem of learning a graph modeling the statistical relations of the $d$ variables from a dataset with $n$ samples $X \in \mathbb{R}^{n \times d}$. Standard approaches amount to searching for a precision matrix $\Theta$…
Graph Neural Networks (GNNs) have shown their great ability in modeling graph structured data. However, real-world graphs usually contain structure noises and have limited labeled nodes. The performance of GNNs would drop significantly when…
This note addresses the question of optimally estimating a linear functional of an object acquired through linear observations corrupted by random noise, where optimality pertains to a worst-case setting tied to a symmetric, convex, and…
We consider the problem of recovering the unknown noise variance in the linear regression model. To estimate the nuisance (a vector of regression coefficients) we use a family of spectral regularisers of the maximum likelihood estimator.…
Scaled sparse linear regression jointly estimates the regression coefficients and noise level in a linear model. It chooses an equilibrium with a sparse regression method by iteratively estimating the noise level via the mean residual…
In this paper, we consider the problem of identifying a linear map from measurements which are subject to intermittent and arbitarily large errors. This is a fundamental problem in many estimation-related applications such as fault…