Related papers: Approximating Partial Likelihood Estimators via Op…
The use of massive survival data has become common in survival analysis. In this study, a subsampling algorithm is proposed for the Cox proportional hazards model with time-dependent covariates when the sample is extraordinarily large but…
We investigate optimal subsampling for quantile regression. We derive the asymptotic distribution of a general subsampling estimator and then derive two versions of optimal subsampling probabilities. One version minimizes the trace of the…
The Cox proportional hazards model is widely used in survival analysis to model time-to-event data. However, it faces significant computational challenges in the era of large-scale data, particularly when dealing with time-dependent…
For massive data, the family of subsampling algorithms is popular to downsize the data volume and reduce computational burden. Existing studies focus on approximating the ordinary least squares estimate in linear regression, where…
To ensure privacy protection and alleviate computational burden, we propose a fast subsmaling procedure for the Cox model with massive survival datasets from multi-centered, decentralized sources. The proposed estimator is computed based on…
Nonuniform subsampling methods are effective to reduce computational burden and maintain estimation efficiency for massive data. Existing methods mostly focus on subsampling with replacement due to its high computational efficiency. If the…
To fast approximate maximum likelihood estimators with massive data, this paper studies the Optimal Subsampling Method under the A-optimality Criterion (OSMAC) for generalized linear models. The consistency and asymptotic normality of the…
Big data is ubiquitous in practices, and it has also led to heavy computation burden. To reduce the calculation cost and ensure the effectiveness of parameter estimators, an optimal subset sampling method is proposed to estimate the…
Massive sized survival datasets are becoming increasingly prevalent with the development of the healthcare industry. Such datasets pose computational challenges unprecedented in traditional survival analysis use-cases. A popular way for…
A significant hurdle for analyzing large sample data is the lack of effective statistical computing and inference methods. An emerging powerful approach for analyzing large sample data is subsampling, by which one takes a random subsample…
Computational capability often falls short when confronted with massive data, posing a common challenge in establishing a statistical model or statistical inference method dealing with big data. While subsampling techniques have been…
Inference on the parametric part of a semiparametric model is no trivial task. If one approximates the infinite dimensional part of the semiparametric model by a parametric function, one obtains a parametric model that is in some sense…
The change-plane Cox model is a popular tool for the subgroup analysis of survival data. Despite the rich literature on this model, there has been limited investigation into the asymptotic properties of the estimators of the…
Subsampling is an efficient method to deal with massive data. In this paper, we investigate the optimal subsampling for linear quantile regression when the covariates are functions. The asymptotic distribution of the subsampling estimator…
For massive data stored at multiple machines, we propose a distributed subsampling procedure for the composite quantile regression. By establishing the consistency and asymptotic normality of the composite quantile regression estimator from…
The maximum likelihood estimation is computationally demanding for large datasets, particularly when the likelihood function includes integrals. Subsampling can reduce the computational burden, but it often results in efficiency loss.This…
In this paper, we propose improved estimation method for logistic regression based on subsamples taken according the optimal subsampling probabilities developed in Wang et al. 2018 Both asymptotic results and numerical results show that the…
Subsampling is an effective approach to alleviate the computational burden associated with large-scale datasets. Nevertheless, existing subsampling estimators incur a substantial loss in estimation efficiency compared to estimators based on…
Large sample size brings the computation bottleneck for modern data analysis. Subsampling is one of efficient strategies to handle this problem. In previous studies, researchers make more fo- cus on subsampling with replacement (SSR) than…
Thomas' partial likelihood estimator of regression parameters is widely used in the analysis of nested case-control data with Cox's model. This paper proposes a new estimator of the regression parameters, which is consistent and…