Related papers: Parallel inference for massive distributed spatial…

200 papers

The increased availability of massive data sets provides a unique opportunity to discover subtle patterns in their distributions, but also imposes overwhelming computational challenges. To fully utilize the information contained in big…

Statistics Theory · Mathematics 2018-04-12 Stanislav Volgushev, Shih-Kang Chao, Guang Cheng

Extreme environmental events frequently exhibit spatial and temporal dependence. These data are often modeled using max stable processes (MSPs). MSPs are computationally prohibitive to fit for as few as a dozen observations, with supposed…

Methodology · Statistics 2022-05-02 Emily C. Hector, Brian J. Reich

Recent years have seen a huge development in spatial modelling and prediction methodology, driven by the increased availability of remote-sensing data and the reduced cost of distributed-processing technology. It is well known that…

Computation · Statistics 2020-02-18 Andrew Zammit-Mangion, Jonathan Rougier

We propose a distributed method for simultaneous inference for datasets with sample size much larger than the number of covariates, i.e., N >> p, in the generalized linear models framework. When such datasets are too big to be analyzed…

Methodology · Statistics 2020-07-23 Lu Tang, Ling Zhou, Peter X.-K. Song

The last decade has witnessed an explosion in the development of models, theory and computational algorithms for "big data" analysis. In particular, distributed computing has served as a natural and dominating paradigm for statistical…

Machine Learning · Statistics 2018-11-02 Bayan Saparbayeva, Michael Minyi Zhang, Lizhen Lin

We propose a distributed bootstrap method for simultaneous inference on high-dimensional massive data that are stored and processed with many machines. The method produces an $\ell_\infty$-norm confidence region based on a…

Methodology · Statistics 2022-06-15 Yang Yu, Shih-Kang Chao, Guang Cheng

The rapid emergence of massive datasets in various fields poses a serious challenge to traditional statistical methods. Meanwhile, it provides opportunities for researchers to develop novel algorithms. Inspired by the idea of…

Computation · Statistics 2023-04-14 Yuan Gao, Weidong Liu, Hansheng Wang, Xiaozhou Wang, Yibo Yan, Riquan Zhang

Over the past decades, linear mixed models have attracted considerable attention in various fields of applied statistics. They are popular whenever clustered, hierarchical or longitudinal data are investigated. Nonetheless, statistical…

Methodology · Statistics 2021-09-20 Katarzyna Reluga, María José Lombardía, Stefan Andreas Sperlich

Data structures for efficient sampling from a set of weighted items are an important building block of many applications. However, few parallel solutions are known. We close many of these gaps both for shared-memory and distributed-memory…

Data Structures and Algorithms · Computer Science 2021-07-20 Lorenz Hübschle-Schneider, Peter Sanders

As computer clusters become more common and the size of the problems encountered in the field of AI grows, there is an increasing demand for efficient parallel inference algorithms. We consider the problem of parallel inference on large…

Artificial Intelligence · Computer Science 2012-05-14 Joseph E. Gonzalez, Yucheng Low, Carlos E. Guestrin, David O'Hallaron

Machine learning iterative imputation methods have been well accepted by researchers for imputing missing data, but they can be time-consuming when handling large datasets. To overcome this drawback, parallel computing strategies have been…

Applications · Statistics 2020-04-24 Shangzhi Hong, Yuqi Sun, Hanying Li, Henry S. Lynn

Multivariate spatially-oriented data sets are prevalent in the environmental and physical sciences. Scientists seek to jointly model multiple variables, each indexed by a spatial location, to capture any underlying spatial association for…

Methodology · Statistics 2021-08-19 Lu Zhang, Sudipto Banerjee

Designing scalable estimation algorithms is a core challenge in modern statistics. Here we introduce a framework to address this challenge based on parallel approximants, which yields estimators with provable properties that operate on the…

Methodology · Statistics 2023-08-04 Aritra Chakravorty, William S. Cleveland, Patrick J. Wolfe

With the rapid advances of data acquisition techniques, spatio-temporal data are becoming increasingly abundant in a diverse array of disciplines. Here we develop spatio-temporal regression methodology for analyzing large amounts of…

Methodology · Statistics 2021-12-01 Ting Fung Ma, Fangfang Wang, Jun Zhu, Anthony R. Ives, Katarzyna E. Lewińska

Distributed model fitting refers to the process of fitting a mathematical or statistical model to the data using distributed computing resources, such that computing tasks are divided among multiple interconnected computers or nodes, often…

Computation · Statistics 2024-06-04 Xiaofei Wu, Rongmei Liang, Fabio Roli, Marcello Pelillo, Jing Yuan

Recent developments in engineering techniques for spatial data collection such as geographic information systems have resulted in an increasing need for methods to analyze large spatial data sets. These sorts of data sets can be found in…

Methodology · Statistics 2020-08-14 Toshihiro Hirano

Estimating statistical models within sensor networks requires distributed algorithms, in which both data and computation are distributed across the nodes of the network. We propose a general approach for distributed learning based on…

Machine Learning · Computer Science 2012-07-03 Qiang Liu, Alexander Ihler

Spatial data are central to applications such as environmental monitoring and urban planning, but are often distributed across devices where privacy and communication constraints limit direct sharing. Federated modeling offers a practical…

Methodology · Statistics 2025-10-03 Jianwei Shi, Sameh Abdulah, Ying Sun, Marc G. Genton

For massive data sets, efficient computation commonly relies on distributed algorithms that store and process subsets of the data on different machines, minimizing communication costs. Our focus is on regression and classification problems…

Machine Learning · Statistics 2014-10-27 Xiangyu Wang, Peichao Peng, David Dunson

Due to the significant increase in the size of spatial data, it is essential to use distributed parallel processing systems to efficiently analyze spatial data. In this paper, we first study learned spatial data partitioning, which…

Databases · Computer Science 2023-06-21 Keizo Hori, Yuya Sasaki, Daichi Amagata, Yuki Murosaki, Makoto Onizuka