Related papers: Variable Selection with Scalable Bootstrap in Gene…

200 papers

The bootstrap provides a simple and powerful means of assessing the quality of estimators. However, in settings involving large datasets, which are increasingly prevalent, the computation of bootstrap-based quantities can be prohibitively…

Methodology · Statistics 2012-06-29 Ariel Kleiner , Ameet Talwalkar , Purnamrita Sarkar , Michael I. Jordan

The bootstrap provides a simple and powerful means of assessing the quality of estimators. However, in settings involving large datasets, the computation of bootstrap-based quantities can be prohibitively demanding. As an alternative, we…

Machine Learning · Computer Science 2012-07-03 Ariel Kleiner , Ameet Talwalkar , Purnamrita Sarkar , Michael Jordan

As massive data analysis becomes increasingly prevalent, subsampling methods like BLB (Bag of Little Bootstraps) serve as powerful tools for assessing the quality of estimators for massive data. However, the performance of the subsampling…

Methodology · Statistics 2022-01-14 Yingying Ma , Hansheng Wang

In this paper we address the problem of performing statistical inference for large-scale data sets, i.e., Big Data. The volume and dimensionality of the data may be so high that it cannot be processed or stored in a single computing node. We…

Methodology · Statistics 2016-04-20 Shahab Basiri , Esa Ollila , Visa Koivunen

We propose a computationally intensive method, the random lasso method, for variable selection in linear models. The method consists of two major steps. In step 1, the lasso method is applied to many bootstrap samples, each using a set of…

Applications · Statistics 2011-04-19 Sijian Wang , Bin Nan , Saharon Rosset , Ji Zhu

The bootstrap is a popular and powerful method for assessing precision of estimators and inferential methods. However, for massive datasets which are increasingly prevalent, the bootstrap becomes prohibitively costly in computation and its…

Methodology · Statistics 2015-08-06 Srijan Sengupta , Stanislav Volgushev , Xiaofeng Shao

We introduce varbvs, a suite of functions written in R and MATLAB for regression analysis of large-scale data sets using Bayesian variable selection methods. We have developed numerical optimization algorithms based on variational…

Computation · Statistics 2017-09-21 Peter Carbonetto , Xiang Zhou , Matthew Stephens

Statistical multispecies models of multiarea marine ecosystems use a variety of data sources to estimate parameters using composite or weighted likelihood functions with associated weighting issues and questions on how to obtain variance…

Applications · Statistics 2012-02-16 Lorna Taylor , Verena M. Trenkel , Vojtech Kupca , Gunnar Stefansson

In many practices, scientists are particularly interested in detecting which of the predictors are truly associated with a multivariate response. It is more accurate to model multiple responses as one vector rather than separating each…

Methodology · Statistics 2021-11-16 Xiaotian Dai , Guifang Fu , Randall Reese , Shaofei Zhao , Zuofeng Shang

The bootstrap is a widely used procedure for statistical inference because of its simplicity and attractive statistical properties. However, the vanilla version of bootstrap is no longer feasible computationally for many modern massive…

Methodology · Statistics 2023-02-16 Yingying Ma , Chenlei Leng , Hansheng Wang

Bootstrap methods have long been the cornerstone of ensemble learning in machine learning. This paper presents a theoretical analysis of bootstrap techniques applied to the Least Square Support Vector Machine (LSSVM) ensemble in the context…

Variational inference is a general approach for approximating complex density functions, such as those arising in latent variable models, popular in machine learning. It has been applied to approximate the maximum likelihood estimator and…

Methodology · Statistics 2018-04-19 Yen-Chi Chen , Y. Samuel Wang , Elena A. Erosheva

Methods based on partial least squares (PLS) regression, which has recently gained much attention in the analysis of high-dimensional genomic datasets, have been developed since the early 2000s for performing variable selection. Most of…

Methodology · Statistics 2021-08-31 Jérémy Magnanensi , Myriam Maumy-Bertrand , Nicolas Meyer , Frédéric Bertrand

The partially linear binary choice model can be used for estimating structural equations where nonlinearity may appear due to diminishing marginal returns, different life cycle regimes, or hectic physical phenomena. The inference procedure…

Econometrics · Economics 2023-12-01 Wenzheng Gao , Zhenting Sun

Variational Bayes (VB), a method originating from machine learning, enables fast and scalable estimation of complex probabilistic models. Thus far, applications of VB in discrete choice analysis have been limited to mixed logit models with…

Methodology · Statistics 2020-01-17 Rico Krueger , Prateek Bansal , Michel Bierlaire , Ricardo A. Daziano , Taha H. Rashidi

Multiple systems estimation using a Poisson loglinear model is a standard approach to quantifying hidden populations where data sources are based on lists of known cases. Information criteria are often used for selecting between the large…

Methodology · Statistics 2023-11-23 Bernard W. Silverman , Lax Chan , Kyle Vincent

We develop a weighted Bayesian Bootstrap (WBB) for machine learning and statistics. WBB provides uncertainty quantification by sampling from a high dimensional posterior distribution. WBB is computationally fast and scalable using only…

Methodology · Statistics 2021-04-06 Michael Newton , Nicholas G. Polson , Jianeng Xu

Estimating causal effects from large experimental and observational data has become increasingly prevalent in both industry and research. The bootstrap is an intuitive and powerful technique used to construct standard errors and confidence…

Methodology · Statistics 2023-02-07 Matthew Kosko , Lin Wang , Michele Santacatterina

Variable selection plays a fundamental role in high-dimensional data analysis. Various methods have been developed for variable selection in recent years. Well-known examples are forward stepwise regression (FSR) and least angle regression…

Methodology · Statistics 2018-02-01 Siliang Gong , Kai Zhang , Yufeng Liu

Kernel methods are widely used in causal inference for tasks such as treatment effect estimation, policy evaluation, and policy learning. The bootstrap is a standard tool for uncertainty quantification because of its broad applicability. As…

Methodology · Statistics 2026-03-17 Matthew Kosko , Falco J. Bargagli-Stoffi , Lin Wang , Michele Santacatterina