Related papers: A Partially Linear Framework for Massive Heterogen…
We consider an additive partially linear framework for modelling massive heterogeneous data. The major goal is to extract multiple common features simultaneously across all sub-populations while exploring heterogeneity of each…
A massive dataset often consists of a growing number of (potentially) heterogeneous sub-populations. This paper is concerned about testing various forms of heterogeneity arising from massive data. In a general nonparametric framework, a set…
In this paper, a subgroup least squares and a convex clustering are introduced for inferring a partially heterogenous linear regression that has potential application in the areas of precision marketing and precision medicine. The…
We consider a flexible semiparametric quantile regression model for analyzing high dimensional heterogeneous data. This model has several appealing features: (1) By considering different conditional quantiles, we may obtain a more complete…
We propose generalized additive partial linear models for complex data which allow one to capture nonlinear patterns of some covariates, in the presence of linear components. The proposed method improves estimation efficiency and increases…
We present a nonlinear regression framework based on tensor algebra tailored to high dimensional contexts where data is scarce. We exploit algebraic properties of a partial tensor product, namely the m-tensor product, to leverage structured…
This paper analyzes a semiparametric model of network formation in the presence of unobserved agent-specific heterogeneity. The objective is to identify and estimate the preference parameters associated with homophily on observed attributes…
Recently, high-dimensional heterogeneous data have attracted a lot of attention and discussion. Under heterogeneity, semiparametric regression is a popular choice to model data in statistics. In this paper, we take advantages of expectile…
Linear regression is a fundamental and popular statistical method. There are various kinds of linear regression, such as mean regression and quantile regression. In this paper, we propose a new one called distribution regression, which…
In this paper we propose a general series method to estimate a semiparametric partially linear varying coefficient model. We establish the consistency and \sqrtn-normality property of the estimator of the finite-dimensional parameters of…
Prediction polling is an increasingly popular form of crowdsourcing in which multiple participants estimate the probability or magnitude of some future event. These estimates are then aggregated into a single forecast. Historically,…
In many complex applications, data heterogeneity and homogeneity exist simultaneously. Ignoring either one will result in incorrect statistical inference. In addition, coping with complex data that are non-Euclidean becomes more common. To…
We consider efficient estimation of the Euclidean parameters in a generalized partially linear additive models for longitudinal/clustered data when multiple covariates need to be modeled nonparametrically, and propose an estimation…
In conventional statistical and machine learning methods, it is typically assumed that the test data are identically distributed with the training data. However, this assumption does not always hold, especially in applications where the…
This article introduces a novel nonparametric methodology for Generalized Linear Models which combines the strengths of the binary regression and latent variable formulations for categorical data, while overcoming their disadvantages.…
In this paper, we consider a partial deconvolution kernel estimator for nonparametric regression when some covariates are measured with error while others are observed without error. We focus on a general and realistic setting in which the…
A full parametric and linear specification may be insufficient to capture complicated patterns in studies exploring complex features, such as those investigating age-related changes in brain functional abilities. Alternatively, a partially…
Inference on the parametric part of a semiparametric model is no trivial task. If one approximates the infinite dimensional part of the semiparametric model by a parametric function, one obtains a parametric model that is in some sense…
I develop a methodology to partially identify linear combinations of conditional mean outcomes when the researcher only has access to aggregate data. Unlike the existing literature, I only allow for marginal, not joint, distributions of…
In this manuscript a unified framework for conducting inference on complex aggregated data in high dimensional settings is proposed. The data are assumed to be a collection of multiple non-Gaussian realizations with underlying undirected…