Related papers: Minimum Sample Size for Developing a Multivariable…

200 papers

Background: Clinical prediction models are increasingly used to inform healthcare decisions, but determining the minimum sample size for their development remains a critical and unresolved challenge. Inadequate sample sizes can lead to…

Machine Learning · Computer Science 2026-03-02 Diana Shamsutdinova, Felix Zimmer, Oyebayo Ridwan Olaniran, Sarah Markham, Daniel Stahl, Gordon Forbes, Ewan Carr

Over the last few decades, prediction models have become a fundamental tool in statistics, chemometrics, and related fields. However, to ensure that such models have high value, the inferences that they generate must be reliable. In this…

When developing a clinical prediction model, the sample size of the development dataset is a key consideration. Small sample sizes lead to greater concerns of overfitting, instability, poor performance and lack of fairness. Previous…

Clinical prediction models (CPMs) are used to predict clinically relevant outcomes or events. Typically, prognostic CPMs are derived to predict the risk of a single future outcome. However, with rising emphasis on the prediction of…

Methodology · Statistics 2020-10-29 Glen P. Martin, Matthew Sperrin, Kym I. E. Snell, Iain Buchan, Richard D. Riley

Calibration is a vital aspect of the performance of risk prediction models, but research in the context of ordinal outcomes is scarce. This study compared calibration measures for risk models predicting a discrete ordinal outcome, and…

Bayesian multinomial logistic regression provides a principled, interpretable approach to multiclass classification, but posterior sampling becomes increasingly expensive as the model dimension grows. Prior work has studied scalability in…

Computation · Statistics 2026-02-27 Jared D. Fisher, Kyle R. McEvoy

Objective: Provide guidance on sample size considerations for developing predictive models by empirically establishing the adequate sample size, which balances the competing objectives of improving model performance and reducing model…

Applications · Statistics 2024-07-25 Luis H. John, Jan A. Kors, Jenna M. Reps, Patrick B. Ryan, Peter R. Rijnbeek

In statistics and machine learning, logistic regression is a widely used supervised learning technique primarily employed for binary classification tasks. When the number of observations greatly exceeds the number of predictor variables, we…

Machine Learning · Statistics 2024-04-02 Agniva Chowdhury, Pradeep Ramuhalli

Logistic regression is a classical model for describing the probabilistic dependence of binary responses to multivariate covariates. We consider the predictive performance of the maximum likelihood estimator (MLE) for logistic regression,…

Statistics Theory · Mathematics 2026-02-20 Hugo Chardon, Matthieu Lerasle, Jaouad Mourtada

When evaluating the performance of a model for individualised risk prediction, the sample size needs to be large enough to precisely estimate the performance measures of interest. Current sample size guidance is based on precisely…

Contemporary sample size calculations for external validation of risk prediction models require users to specify fixed values of assumed model performance metrics alongside target precision levels (e.g., 95% CI widths). However, due to the…

Applications · Statistics 2026-02-13 Mohsen Sadatsafavi, Paul Gustafson, Solmaz Setayeshgar, Laure Wynants, Richard D Riley

We consider a variable selection problem for the prediction of binary outcomes. We study the best subset selection procedure by which the covariates are chosen by maximizing Manski (1975, 1985)'s maximum score objective function subject to…

Methodology · Statistics 2018-05-18 Le-Yu Chen, Sokbae Lee

Let $(Y,X_1,...,X_m)$ be a random vector. It is desired to predict $Y$ based on $(X_1,...,X_m)$. Examples of prediction methods are regression, classification using logistic regression or separating hyperplanes, and so on. We consider the…

Statistics Theory · Mathematics 2007-06-13 Eitan Greenshtein

Binomial data with unknown sizes often appear in biological and medical sciences and are usually overdispersed. All previous methods used parametric models and only considered overdispersion due to the variation of sizes. The proposed…

Statistics Theory · Mathematics 2007-06-13 Wei Zhang

In the realm of contemporary data analysis, the use of massive datasets has taken on heightened significance, albeit often entailing considerable demands on computational time and memory. While a multitude of existing works offer optimal…

Methodology · Statistics 2024-06-21 Tal Agassi, Nir Keret, Malka Gorfine

This paper applies the minimum message length principle to inference of linear regression models with Student-t errors. A new criterion for variable selection and parameter estimation in Student-t regression is proposed. By exploiting…

Methodology · Statistics 2018-02-21 Chi Kuen Wong, Enes Makalic, Daniel F. Schmidt

We investigate a robust penalized logistic regression algorithm based on a minimum distance criterion. Influential outliers are often associated with the explosion of parameter vector estimates, but in the context of standard logistic…

Methodology · Statistics 2014-02-21 Eric C. Chi, David W. Scott

Generalized linear models, such as logistic regression, are widely used to model the association between a treatment and a binary outcome as a function of baseline covariates. However, the coefficients of a logistic regression model…

Methodology · Statistics 2022-01-04 Jiaqi Yin, Sonia Markes, Thomas S. Richardson, Linbo Wang

Although the log-likelihood is widely used in model selection, the log-likelihood ratio has had few applications in this area. We develop a log-likelihood ratio based method for selecting regression models by focusing on the set of models…

Methodology · Statistics 2021-09-28 Min Tsao

The logistic linear mixed model is widely used in experimental designs and genetic analysis with binary traits. Motivated by modern applications, we consider the case with many groups of random effects and each group corresponds to a variance…

Computation · Statistics 2017-11-15 Liuyi Hu, Wenbin Lu, Jin Zhou, Hua Zhou