Related papers: Distributed inference for quantile regression proc…

An Experimental Study of Distributed Quantile Estimation

Quantiles are very important statistics information used to describe the distribution of datasets. Given the quantiles of a dataset, we can easily know the distribution of the dataset, which is a fundamental problem in data analysis.…

Databases · Computer Science 2015-08-25 Zixuan Zhuang

Dual Regression

We propose dual regression as an alternative to the quantile regression process for the global estimation of conditional distribution functions under minimal assumptions. Dual regression provides all the interpretational power of the…

Methodology · Statistics 2018-09-26 Richard Spady, Sami Stouli

Fast Inference for Quantile Regression with Tens of Millions of Observations

Big data analytics has opened new avenues in economic research, but the challenge of analyzing datasets with tens of millions of observations is substantial. Conventional econometric methods based on extreme estimators require large amounts…

Econometrics · Economics 2023-11-02 Sokbae Lee, Yuan Liao, Myung Hwan Seo, Youngki Shin

Inference for Joint Quantile and Expected Shortfall Regression

Quantiles and expected shortfalls are commonly used risk measures in financial risk management. The two measurements are correlated while have distinguished features. In this project, our primary goal is to develop stable and practical…

Methodology · Statistics 2022-08-24 Xiang Peng, Huixia Judy Wang

Multiscale quantile segmentation

We introduce a new methodology for analyzing serial data by quantile regression assuming that the underlying quantile function consists of constant segments. The procedure does not rely on any distributional assumption besides serial…

Methodology · Statistics 2020-09-09 Laura Jula Vanegas, Merle Behr, Axel Munk

Quantile Regression for Large-scale Applications

Quantile regression is a method to estimate the quantiles of the conditional distribution of a response variable, and as such it permits a much more accurate portrayal of the relationship between the response variable and observed…

Data Structures and Algorithms · Computer Science 2014-01-08 Jiyan Yang, Xiangrui Meng, Michael W. Mahoney

Optimal subsampling algorithm for composite quantile regression with distributed data

For massive data stored at multiple machines, we propose a distributed subsampling procedure for the composite quantile regression. By establishing the consistency and asymptotic normality of the composite quantile regression estimator from…

Computation · Statistics 2023-01-09 Xiaohui Yuan, Shiting Zhou, Yue Wang

A review of distributed statistical inference

The rapid emergence of massive datasets in various fields poses a serious challenge to traditional statistical methods. Meanwhile, it provides opportunities for researchers to develop novel algorithms. Inspired by the idea of…

Computation · Statistics 2023-04-14 Yuan Gao, Weidong Liu, Hansheng Wang, Xiaozhou Wang, Yibo Yan, Riquan Zhang

Parallel inference for massive distributed spatial data using low-rank models

Due to rapid data growth, statistical analysis of massive datasets often has to be carried out in a distributed fashion, either because several datasets stored in separate physical locations are all relevant to a given problem, or simply to…

Computation · Statistics 2016-02-08 Matthias Katzfuss, Dorit Hammerling

A General Framework for Robust Testing and Confidence Regions in High-Dimensional Quantile Regression

We propose a robust inferential procedure for assessing uncertainties of parameter estimation in high-dimensional linear models, where the dimension $p$ can grow exponentially fast with the sample size $n$. Our method combines the…

Machine Learning · Statistics 2015-03-19 Tianqi Zhao, Mladen Kolar, Han Liu

Two-Stage Robust and Sparse Distributed Statistical Inference for Large-Scale Data

In this paper, we address the problem of conducting statistical inference in settings involving large-scale data that may be high-dimensional and contaminated by outliers. The high volume and dimensionality of the data require distributed…

Machine Learning · Statistics 2022-11-30 Emadaldin Mozafari-Majd, Visa Koivunen

Distributed sequential method for analyzing massive data

To analyse a very large data set containing lengthy variables, we adopt a sequential estimation idea and propose a parallel divide-and-conquer method. We conduct several conventional sequential estimation procedures separately, and properly…

Methodology · Statistics 2018-12-27 Zhanfeng Wang, Yuan-chin Ivan Chang

On the computability of conditional probability

As inductive inference and machine learning methods in computer science see continued success, researchers are aiming to describe ever more complex probabilistic models and inference algorithms. It is natural to ask whether there is a…

Logic · Mathematics 2019-11-19 Nathanael L. Ackerman, Cameron E. Freer, Daniel M. Roy

Communication-Constrained Distributed Quantile Regression with Optimal Statistical Guarantees

We address the problem of how to achieve optimal inference in distributed quantile regression without stringent scaling conditions. This is challenging due to the non-smooth nature of the quantile regression (QR) loss function, which…

Methodology · Statistics 2022-08-24 Kean Ming Tan, Heather Battey, Wen-Xin Zhou

Pyramid quantile regression

Quantile regression models provide a wide picture of the conditional distributions of the response variable by capturing the effect of the covariates at different quantile levels. In most applications, the parametric form of those…

Methodology · Statistics 2017-11-03 T. Rodrigues, J.-L. Dortet-Bernadet, Y. Fan

Functional Regression with Intensively Measured Longitudinal Outcomes: A New Lens through Data Partitioning

Estimation and inference with modern longitudinal data from wearable devices, which consist of biological signals at high-frequency time points, is burdened by massive computational costs. We propose a distributed estimation and inference…

Methodology · Statistics 2023-09-13 Cole Manschot, Emily C. Hector

Distributed estimation through parallel approximants

Designing scalable estimation algorithms is a core challenge in modern statistics. Here we introduce a framework to address this challenge based on parallel approximants, which yields estimators with provable properties that operate on the…

Methodology · Statistics 2023-08-04 Aritra Chakravorty, William S. Cleveland, Patrick J. Wolfe

Distributed High-Dimensional Quantile Regression: Estimation Efficiency and Support Recovery

In this paper, we focus on distributed estimation and support recovery for high-dimensional linear quantile regression. Quantile regression is a popular alternative tool to the least squares regression for robustness against outliers and…

Machine Learning · Statistics 2024-06-04 Caixing Wang, Ziliang Shen

Selective Inference with Distributed Data

As datasets grow larger, they are often distributed across multiple machines that compute in parallel and communicate with a central machine through short messages. In this paper, we focus on sparse regression and propose a new procedure…

Methodology · Statistics 2023-03-14 Sifan Liu, Snigdha Panigrahi

On the Feasibility of Distributed Kernel Regression for Big Data

In modern scientific research, massive datasets with huge numbers of observations are frequently encountered. To facilitate the computational process, a divide-and-conquer scheme is often used for the analysis of big data. In such a…

Machine Learning · Statistics 2015-05-06 Chen Xu, Yongquan Zhang, Runze Li