Related papers: Distributed inference for quantile regression proc…
Quantiles are very important statistics information used to describe the distribution of datasets. Given the quantiles of a dataset, we can easily know the distribution of the dataset, which is a fundamental problem in data analysis.…
We propose dual regression as an alternative to the quantile regression process for the global estimation of conditional distribution functions under minimal assumptions. Dual regression provides all the interpretational power of the…
Big data analytics has opened new avenues in economic research, but the challenge of analyzing datasets with tens of millions of observations is substantial. Conventional econometric methods based on extreme estimators require large amounts…
Quantiles and expected shortfalls are commonly used risk measures in financial risk management. The two measurements are correlated while have distinguished features. In this project, our primary goal is to develop stable and practical…
We introduce a new methodology for analyzing serial data by quantile regression assuming that the underlying quantile function consists of constant segments. The procedure does not rely on any distributional assumption besides serial…
Quantile regression is a method to estimate the quantiles of the conditional distribution of a response variable, and as such it permits a much more accurate portrayal of the relationship between the response variable and observed…
For massive data stored at multiple machines, we propose a distributed subsampling procedure for the composite quantile regression. By establishing the consistency and asymptotic normality of the composite quantile regression estimator from…
The rapid emergence of massive datasets in various fields poses a serious challenge to traditional statistical methods. Meanwhile, it provides opportunities for researchers to develop novel algorithms. Inspired by the idea of…
Due to rapid data growth, statistical analysis of massive datasets often has to be carried out in a distributed fashion, either because several datasets stored in separate physical locations are all relevant to a given problem, or simply to…
We propose a robust inferential procedure for assessing uncertainties of parameter estimation in high-dimensional linear models, where the dimension $p$ can grow exponentially fast with the sample size $n$. Our method combines the…
In this paper, we address the problem of conducting statistical inference in settings involving large-scale data that may be high-dimensional and contaminated by outliers. The high volume and dimensionality of the data require distributed…
To analyse a very large data set containing lengthy variables, we adopt a sequential estimation idea and propose a parallel divide-and-conquer method. We conduct several conventional sequential estimation procedures separately, and properly…
As inductive inference and machine learning methods in computer science see continued success, researchers are aiming to describe ever more complex probabilistic models and inference algorithms. It is natural to ask whether there is a…
We address the problem of how to achieve optimal inference in distributed quantile regression without stringent scaling conditions. This is challenging due to the non-smooth nature of the quantile regression (QR) loss function, which…
Quantile regression models provide a wide picture of the conditional distributions of the response variable by capturing the effect of the covariates at different quantile levels. In most applications, the parametric form of those…
Estimation and inference with modern longitudinal data from wearable devices, which consist of biological signals at high-frequency time points, is burdened by massive computational costs. We propose a distributed estimation and inference…
Designing scalable estimation algorithms is a core challenge in modern statistics. Here we introduce a framework to address this challenge based on parallel approximants, which yields estimators with provable properties that operate on the…
In this paper, we focus on distributed estimation and support recovery for high-dimensional linear quantile regression. Quantile regression is a popular alternative tool to the least squares regression for robustness against outliers and…
As datasets grow larger, they are often distributed across multiple machines that compute in parallel and communicate with a central machine through short messages. In this paper, we focus on sparse regression and propose a new procedure…
In modern scientific research, massive datasets with huge numbers of observations are frequently encountered. To facilitate the computational process, a divide-and-conquer scheme is often used for the analysis of big data. In such a…