Related papers: Network Cluster-Robust Inference
We propose a structure-preserving model-reduction methodology for large-scale dynamic networks with tightly-connected components. First, the coherent groups are identified by a spectral clustering algorithm on the graph Laplacian matrix…
It is common when using cross-section or panel data to assign each observation to a cluster and allow for arbitrary patterns of heteroskedasticity and correlation within clusters. For regression models, there are many ways to make…
This paper presents and analyzes an approach to cluster-based inference for dependent data. The primary setting considered here is with spatially indexed data in which the dependence structure of observed random variables is characterized…
We propose a novel model-reduction methodology for large-scale dynamic networks with tightly-connected components. First, the coherent groups are identified by a spectral clustering algorithm on the graph Laplacian matrix that models the…
Measuring graph clustering quality remains an open problem. To address it, we introduce quality measures based on comparisons of intra- and inter-cluster densities, an accompanying statistical test of the significance of their differences…
This paper considers inference when there is a single treated cluster and a fixed number of control clusters, a setting that is common in empirical work, especially in difference-in-differences designs. We use the t-statistic and develop…
Local clustering aims to identify specific substructures within a large graph without any additional structural information of the graph. These substructures are typically small compared to the overall graph, enabling the problem to be…
Clusters or communities can provide a coarse-grained description of complex systems at multiple scales, but their detection remains challenging in practice. Community detection methods often define communities as dense subgraphs, or…
Due to the massive size of modern network data, local algorithms that run in sublinear time for analyzing the cluster structure of the graph are receiving growing interest. Two typical examples are local graph clustering algorithms that…
Graphs are commonly used to represent and visualize causal relations. For a small number of variables, this approach provides a succinct and clear view of the scenario at hand. As the number of variables under study increases, the graphical…
Community detection, which focuses on clustering nodes or detecting communities in (mostly) a single network, is a problem of considerable practical interest and has received a great deal of attention in the research community. While being…
Conventional cluster-robust inference can be invalid when data contain clusters of unignorably large size. We formalize this issue by deriving a necessary and sufficient condition for its validity, and show that this condition is frequently…
Modularity is designed to measure the strength of division of a network into clusters (known also as communities). Networks with high modularity have dense connections between the vertices within clusters but sparse connections between…
An approach to improve neural network interpretability is via clusterability, i.e., splitting a model into disjoint clusters that can be studied independently. We define a measure for clusterability and show that pre-trained models form…
Low-fidelity data is typically inexpensive to generate but inaccurate. On the other hand, high-fidelity data is accurate but expensive to obtain. Multi-fidelity methods use a small set of high-fidelity data to enhance the accuracy of a…
Analysis of higher-order organizations, usually small connected subgraphs called motifs, is a fundamental task on complex networks. This paper studies a new problem of testing higher-order clusterability: given query access to an undirected…
Latent variable models for network data extract a summary of the relational structure underlying an observed network. The simplest possible models subdivide nodes of the network into clusters; the probability of a link between any two nodes…
Current modularity-based community detection algorithms attempt to find cluster memberships that maximize modularity within a fixed graph topology. Diverging from this conventional approach, our work introduces a novel strategy that employs…
We present a structural clustering algorithm for large-scale datasets of small labeled graphs, utilizing a frequent subgraph sampling strategy. A set of representatives provides an intuitive description of each cluster, supports the…
Spectral clustering is one of the most popular clustering methods for finding clusters in a graph, which has found many applications in data mining. However, the input graph in those applications may have many missing edges due to error in…