Related papers: Maximum sampled conditional likelihood for informa…

Unweighted estimation based on optimal sample under measurement constraints

To tackle massive data, subsampling is a practical approach to select the more informative data points. However, when responses are expensive to measure, developing efficient subsampling schemes is challenging, and an optimal sampling…

Computation · Statistics 2022-10-11 Jing Wang, HaiYing Wang, Shifeng Xiong

Nearly optimal capture-recapture sampling and empirical likelihood weighting estimation for M-estimation with big data

Subsampling techniques can reduce the computational costs of processing big data. Practical subsampling plans typically involve initial uniform sampling and refined sampling. With a subsample, big data inferences are generally built on the…

Methodology · Statistics 2022-09-13 Yan Fan, Yang Liu, Yukun Liu, Jing Qin

Approximating Partial Likelihood Estimators via Optimal Subsampling

With the growing availability of large-scale biomedical data, it is often time-consuming or infeasible to directly perform traditional statistical analysis with relatively limited computing resources at hand. We propose a fast subsampling…

Methodology · Statistics 2023-05-18 Haixiang Zhang, Lulu Zuo, HaiYing Wang, Liuquan Sun

Optimal Downsampling for Imbalanced Classification with Generalized Linear Models

Downsampling or under-sampling is a technique that is utilized in the context of large and highly imbalanced classification models. We study optimal downsampling for imbalanced classification using generalized linear models (GLMs). We…

Machine Learning · Statistics 2025-05-20 Yan Chen, Jose Blanchet, Krzysztof Dembczynski, Laura Fee Nern, Aaron Flores

Parsimonious and Efficient Likelihood Composition by Gibbs Sampling

The traditional maximum likelihood estimator (MLE) is often of limited use in complex high-dimensional data due to the intractability of the underlying likelihood function. Maximum composite likelihood estimation (McLE) avoids full…

Methodology · Statistics 2015-02-18 Davide Ferrari, Guoqi Qian

Optimal Subsampling Approaches for Large Sample Linear Regression

A significant hurdle for analyzing large sample data is the lack of effective statistical computing and inference methods. An emerging powerful approach for analyzing large sample data is subsampling, by which one takes a random subsample…

Methodology · Statistics 2015-11-24 Rong Zhu, Ping Ma, Michael W. Mahoney, Bin Yu

Optimal subsampling for quantile regression in big data

We investigate optimal subsampling for quantile regression. We derive the asymptotic distribution of a general subsampling estimator and then derive two versions of optimal subsampling probabilities. One version minimizes the trace of the…

Computation · Statistics 2020-01-29 HaiYing Wang, Yanyuan Ma

Optimal subsampling for functional quantile regression

Subsampling is an efficient method to deal with massive data. In this paper, we investigate the optimal subsampling for linear quantile regression when the covariates are functions. The asymptotic distribution of the subsampling estimator…

Numerical Analysis · Mathematics 2022-05-06 Qian Yan, Hanyu Li, Chengmei Niu

Nonuniform Negative Sampling and Log Odds Correction with Rare Events Data

We investigate the issue of parameter estimation with nonuniform negative sampling for imbalanced data. We first prove that, with imbalanced data, the available information about unknown parameters is only tied to the relatively small…

Machine Learning · Statistics 2021-10-26 HaiYing Wang, Aonan Zhang, Chong Wang

Efficiency of the maximum partial likelihood estimator for nested case control sampling

In making inference on the relation between failure and exposure histories in the Cox semiparametric model, the maximum partial likelihood estimator (MPLE) of the finite dimensional odds parameter, and the Breslow estimator of the baseline…

Statistics Theory · Mathematics 2009-06-12 Larry Goldstein, Haimeng Zhang

A Moment-assisted Approach for Improving Subsampling-based MLE with Large-scale data

The maximum likelihood estimation is computationally demanding for large datasets, particularly when the likelihood function includes integrals. Subsampling can reduce the computational burden, but it often results in efficiency loss. This…

Methodology · Statistics 2026-04-27 Miaomiao Su, Qihua Wang, Ruoyu Wang

Semi-supervised learning in unmatched linear regression using an empirical likelihood approach

Knowing the link between observed predictive variables and outcomes is crucial for making inference in any regression model. When this link is missing, partially or completely, classical estimation methods fail in recovering the true…

Statistics Theory · Mathematics 2026-01-28 Fadoua Balabdaoui, Jinyu Chen

A subsampling approach for large data sets when the Generalised Linear Model is potentially misspecified

Subsampling is a computationally efficient and scalable method to draw inference in large data settings based on a subset of the data rather than needing to consider the whole dataset. When employing subsampling techniques, a crucial…

Methodology · Statistics 2025-10-08 Amalan Mahendran, Helen Thompson, James M. McGree

Novel Subsampling Strategies for Heavily Censored Reliability Data

Computational capability often falls short when confronted with massive data, posing a common challenge in establishing a statistical model or statistical inference method dealing with big data. While subsampling techniques have been…

Methodology · Statistics 2024-10-31 Yixiao Ruan, Zan Li, Zhaohui Li, Dennis K. J. Lin, Qingpei Hu, Dan Yu

A Weighted Likelihood Approach Based on Statistical Data Depths

We propose a general approach to construct weighted likelihood estimating equations with the aim of obtaining robust parameter estimates. We modify the standard likelihood equations by incorporating a weight that reflects the statistical…

Statistics Theory · Mathematics 2025-07-24 Claudio Agostinelli, Ayanendranath Basu, Giulia Bertagnolli, Arun Kumar Kuchibhotla

Multilevel maximum likelihood estimation with application to covariance matrices

The asymptotic variance of the maximum likelihood estimate is proved to decrease when the maximization is restricted to a subspace that contains the true parameter value. Maximum likelihood estimation allows a systematic fitting of…

Statistics Theory · Mathematics 2018-01-31 Marie Turčičová, Jan Mandel, Kryštof Eben

Maximum pseudo-likelihood estimation in copula models for small weakly dependent samples

Maximum pseudo-likelihood (MPL) is a semiparametric estimation method often used to obtain the dependence parameters in copula models from data. It has been shown that despite being consistent, and in some cases efficient, MPL estimation…

Methodology · Statistics 2022-09-07 Alexandra Dias

Optimal Distributed Subsampling for Maximum Quasi-Likelihood Estimators with Massive Data

Nonuniform subsampling methods are effective to reduce computational burden and maintain estimation efficiency for massive data. Existing methods mostly focus on subsampling with replacement due to its high computational efficiency. If the…

Methodology · Statistics 2021-07-06 Jun Yu, HaiYing Wang, Mingyao Ai, Huiming Zhang

Asymptotic properties of the MLE in distributional regression under random censoring

Distributional regression aims to find the best candidate in a given parametric family of conditional distributions to model a given dataset. As each candidate in the distribution family can be identified by the corresponding distribution…

Statistics Theory · Mathematics 2026-05-18 Gitte Kremling, Gerhard Dikta

A Constrained Conditional Likelihood Approach for Estimating the Means of Selected Populations

Given p independent normal populations, we consider the problem of estimating the means of those populations that, based on the observed data, give the strongest signals. We explicitly condition on the ranking of the sample means, and…

Methodology · Statistics 2017-02-28 Claudio Fuentes, Vik Gopal