Related papers: Blocked Clusterwise Regression
Panel data allows for the modeling of unobserved heterogeneity, significantly raising the number of nuisance parameters and making high dimensionality a practical issue. Meanwhile, temporal and cross-sectional dependence in panel data…
We develop a structural framework for modeling and inferring unobserved heterogeneity in dynamic panel-data models. Unlike methods treating clustering as a descriptive device, we model heterogeneity as arising from a latent clustering…
Identifying a set of homogeneous clusters in a heterogeneous dataset is one of the most important classes of problems in statistical modeling. In the realm of unsupervised partitional clustering, k-means is a very important algorithm for…
Understanding treatment effect heterogeneity is vital for scientific and policy research. However, identifying and evaluating heterogeneous treatment effects pose significant challenges due to the typically unknown subgroup structure.…
A general framework for dealing with both linear regression and clustering problems is described. It includes Gaussian clusterwise linear regression analysis with random covariates and cluster analysis via Gaussian mixture models with…
Approximating time-varying unobserved heterogeneity by discrete types has become increasingly popular in economics. Yet, provably valid post-clustering inference for target parameters in models that do not impose an exact group structure is…
This paper proposes a selective inference procedure for testing equal predictive ability in panel data settings with unknown heterogeneity. The framework allows predictive performance to vary across unobserved clusters and accounts for the…
We study discrete panel data methods where unobserved heterogeneity is revealed in a first step, in environments where population heterogeneity is not discrete. We focus on two-step grouped fixed-effects (GFE) estimators, where individuals…
In the context of multilevel longitudinal data, where sample units are collected in clusters, an important aspect that should be accounted for is the unobserved heterogeneity between sample units and between clusters. For this aim we…
Panel data analysis is an important topic in statistics and econometrics. Traditionally, in panel data analysis, all individuals are assumed to share the same unknown parameters, e.g. the same coefficients of covariates when the linear…
Heterogeneity has been a hot topic in recent educational literature. Several calls have been voiced to adopt methods that capture different patterns or subgroups within students behavior or functioning. Assuming that there is an average…
We propose a framework for nonparametric identification and estimation of discrete choice models with unobserved choice sets. We recover the joint distribution of choice sets and preferences from a panel dataset on choices. We assume that…
High covariate dimensionality is increasingly occurrent in model estimation, and existing techniques to address this issue typically require sparsity or discrete heterogeneity of the \emph{unobservable} parameter vector. However, neither…
This paper addresses the problem of unsupervised clustering which remains one of the most fundamental challenges in machine learning and artificial intelligence. We propose the clustered generator model for clustering which contains both…
This study introduces a general semiparametric clusterwise index distribution model to analyze how latent clusters affect the covariate-response relationships. By employing sufficient dimension reduction to account for the effects of…
In many applications, data can be heterogeneous in the sense of spanning latent groups with different underlying distributions. When predictive models are applied to such data the heterogeneity can affect both predictive performance and…
Clustering has long been a popular unsupervised learning approach to identify groups of similar objects and discover patterns from unlabeled data in many applications. Yet, coming up with meaningful interpretations of the estimated clusters…
In this paper, a subgroup least squares and a convex clustering are introduced for inferring a partially heterogenous linear regression that has potential application in the areas of precision marketing and precision medicine. The…
One of the most important empirical findings in microeconometrics is the pervasiveness of heterogeneity in economic behaviour (cf. Heckman 2001). This paper shows that cumulative distribution functions and quantiles of the nonparametric…
Unsupervised machine learning, and in particular data clustering, is a powerful approach for the analysis of datasets and identification of characteristic features occurring throughout a dataset. It is gaining popularity across scientific…