Related papers: Valid Post-Selection and Post-Regularization Infer…
This paper studies a regularized support function estimator for bounds on components of the parameter vector in the case in which the identified set is a polygon. The proposed regularized estimator has three important properties: (i) it has…
In this paper, we consider the $\alpha\| \cdot\|_{\ell_1}-\beta\| \cdot\|_{\ell_2}$ sparsity regularization with parameter $\alpha\geq\beta\geq0$ for nonlinear ill-posed inverse problems. We investigate the well-posedness of the…
Over the past few years, trace regression models have received considerable attention in the context of matrix completion, quantum state tomography, and compressed sensing. Estimation of the underlying matrix from regularization-based…
This paper considers generalized linear models in the presence of many controls. We lay out a general methodology to estimate an effect of interest based on the construction of an instrument that immunizes against model selection mistakes…
We consider the statistical inverse problem to recover $f$ from noisy measurements $Y = Tf + \sigma \xi$ where $\xi$ is Gaussian white noise and $T$ a compact operator between Hilbert spaces. Considering general reconstruction methods of…
This paper studies least-squares regression penalized with partly smooth convex regularizers. This class of functions is very large and versatile, allowing one to promote solutions conforming to some notion of low complexity. Indeed, they force…
For a convex class of functions $F$, a regularization function $\Psi(\cdot)$ and given the random data $(X_i, Y_i)_{i=1}^N$, we study estimation properties of regularization procedures of the form \begin{equation*} \hat f \in {\rm…
Understanding treatment effect heterogeneity is important for decision making in medical and clinical practices, or handling various engineering and marketing challenges. When dealing with high-dimensional covariates or when the effect…
Current methods for regularization in machine learning require quite specific model assumptions (e.g. a kernel shape) that are not derived from prior knowledge about the application, but must be imposed merely to make the method work. We…
We describe a hierarchical Bayesian approach for inference about a parameter $\theta$ lower-bounded by $\alpha$ with uncertain $\alpha$, derive some basic identities for posterior analysis about $(\theta,\alpha)$, and provide illustrations…
We propose a likelihood ratio based inferential framework for high dimensional semiparametric generalized linear models. This framework addresses a variety of challenging problems in high dimensional data analysis, including incomplete…
We consider a minimization problem whose objective function is the sum of a fidelity term, not necessarily convex, and a regularization term defined by a positive regularization parameter $\lambda$ multiple of the $\ell_0$ norm composed…
In this paper, a general class of regularized $M$-estimators of the scatter matrix is proposed that is also suitable for low or insufficient sample support (small $n$ and large $p$) problems. The considered class constitutes a natural…
High-dimensional statistical inference deals with models in which the number of parameters $p$ is comparable to or larger than the sample size $n$. Since it is usually impossible to obtain consistent procedures unless $p/n\rightarrow0$, a…
Detecting influential features in non-linear and/or high-dimensional data is a challenging and increasingly important task in machine learning. Variable selection methods have thus been gaining much attention as well as post-selection…
This paper investigates correct variable selection in finite samples via $\ell_1$ and $\ell_1+\ell_2$ type penalization schemes. The asymptotic consistency of variable selection immediately follows from this analysis. We focus on logistic…
This paper presents a regularization technique incorporating a non-convex and non-smooth term, $\ell_{1}^{2}-\eta\ell_{2}^{2}$, with parameters $0<\eta\leq 1$ designed to address ill-posed linear problems that yield sparse solutions. We…
Data selection methods address a critical challenge in LLM post-training: effectively leveraging scarce, high-fidelity target data alongside abundant but imperfectly aligned general training data. In this work, we move beyond the…
Inference after model selection has been an active research topic in the past few years, with numerous works offering different approaches to addressing the perils of the reuse of data. In particular, major progress has been made recently…
Artificial intelligence (AI) and machine learning (ML) are increasingly used to generate data for downstream analyses, yet naively treating these predictions as true observations can lead to biased results and incorrect inference. Wang et…