Related papers: Logistic regression geometry
Logistic regression is an important statistical tool for assessing the probability of an outcome based upon some predictive variables. Standard methods can only deal with precisely known data; however, many datasets have uncertainties which…
The logistic loss function is often advocated in machine learning and statistics as a smooth and strictly convex surrogate for the 0-1 loss. In this paper we investigate the question of whether these smoothness and convexity properties make…
Case-control sampling is a commonly used retrospective sampling design to alleviate imbalanced structure of binary data. When fitting the logistic regression model with case-control data, although the slope parameter of the model can be…
This paper studies the estimation and inference for the isotonic regression at the boundary point, an object that is particularly interesting and required in the analysis of monotone regression discontinuity designs. We show that the…
We develop a uniform inference theory for high-dimensional slope parameters in threshold regression models, allowing for either cross-sectional or time series data. We first establish oracle inequalities for prediction errors, and L1…
In this paper we extend the work of Owen (2007) by deriving a second order expansion for the slope parameter in logistic regression, when the size of the majority class is unbounded and the minority class is finite. More precisely, we…
Protesting mildly against the notion of an exactly correct parametric model, the view is adopted that the logistic regression equation is merely an approximation to the underlying, true function. The behaviour of likelihood-based estimators…
In causal inference, interference occurs when the treatment of one unit may affect the outcomes of other units. The goal of this work is to serve as a guide to the use of linear outcome modeling for estimating causal effects in settings…
We consider statistical inference in high-dimensional regression problems under affine constraints on the parameter space. The theoretical study of this is motivated by the study of genetic determinants of diseases, such as diabetes, using…
Logistic regression is the most commonly used method for constructing predictive models for binary responses. One significant drawback to this approach, however, is that the asymptotes of the logistic response function are fixed at 0 and 1,…
In overparameterized logistic regression, gradient descent (GD) iterates diverge in norm while converging in direction to the maximum $\ell_2$-margin solution -- a phenomenon known as the implicit bias of GD. This work investigates…
Consider a nonparametric regression model with one-sided errors and regression function in a general Hölder class. We estimate the regression function via minimization of the local integral of a polynomial approximation. We show uniform…
We provide finite-sample distribution approximations that are uniform in the parameter for inference in linear mixed models. Focus is on variances and covariances of random effects in cases where existing theory fails because their…
We study nonparametric distance-based (isotropic) local polynomial methods for estimating the boundary average treatment effect curve, a causal functional that captures treatment effect heterogeneity in boundary discontinuity designs. We…
When the difference between treatments in a clinical trial is estimated by a difference in means, then it is well known that randomization ensures unbiased estimation, even if no account is taken of important baseline covariates. However,…
Statistical inference on the explained variation of an outcome by a set of covariates is of particular interest in practice. When the covariates are of moderate to high-dimension and the effects are not sparse, several approaches have been…
Logistic regression models are a popular and effective method to predict the probability of categorical response data. However, inference for these models can become computationally prohibitive for large datasets. Here we adapt ideas from…
Logistic regression is a well-known statistical model which is commonly used in the situation where the output is a binary random variable. It has a wide range of applications including machine learning, public health, social sciences,…
For the last two decades, high-dimensional data and methods have proliferated throughout the literature. Yet, the classical technique of linear regression has not lost its usefulness in applications. In fact, many high-dimensional…
Latent space models are powerful statistical tools for modeling and understanding network data. While the importance of accounting for uncertainty in network analysis has been well recognized, the current literature predominantly focuses on…