Data Augmentation for Graph Classification

Jiajun Zhou; Jie Shen; Qi Xuan

doi:10.1145/3340531.3412086

Social and Information Networks 2020-09-22 v1

Authors: Jiajun Zhou, Jie Shen, Qi Xuan

Abstract

Graph classification, which aims to identify the category labels of graphs, plays a significant role in drug classification, toxicity detection, protein analysis, etc. However, the limited scale of benchmark datasets makes graph classification models prone to over-fitting and poor generalization. To address this, we introduce data augmentation on graphs and present two heuristic algorithms, random mapping and motif-similarity mapping, which generate more weakly labeled data for small-scale benchmark datasets via heuristic modification of graph structures. Furthermore, we propose a generic model evolution framework, M-Evolve, which combines graph augmentation, data filtration and model retraining to optimize pre-trained graph classifiers. Experiments conducted on six benchmark datasets demonstrate that M-Evolve helps existing graph classification models alleviate over-fitting when training on small-scale benchmark datasets and yields an average improvement of 3-12% accuracy on graph classification tasks.
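
As a rough illustration of what "heuristic modification of graph structures" could look like, the sketch below rewires a small fraction of edges at random: it removes k existing edges and adds k previously absent ones, so the edge count is preserved. This is only an assumed reading of random mapping, not the authors' implementation. The function name `random_mapping`, the `budget_ratio` parameter, and the rule that the augmented graph inherits the original graph's label as a weak label are all hypothetical choices made for this example.

```python
import random
from itertools import combinations


def random_mapping(nodes, edges, budget_ratio=0.2, seed=None):
    """Return a randomly rewired copy of an undirected, loop-free edge set.

    Hypothetical sketch: removes k existing edges and adds k absent edges,
    where k is roughly budget_ratio * |E|. Not the paper's reference code.
    """
    rng = random.Random(seed)
    existing = {frozenset(e) for e in edges}
    # Candidate additions: every node pair that is not currently an edge.
    absent = {frozenset(p) for p in combinations(nodes, 2)} - existing
    # Clamp k so we never sample more edges than are available.
    k = min(round(budget_ratio * len(existing)), len(existing), len(absent))
    # Sort before sampling so a fixed seed gives a reproducible result.
    removed = set(rng.sample(sorted(existing, key=sorted), k))
    added = set(rng.sample(sorted(absent, key=sorted), k))
    return (existing - removed) | added


# Usage: augment a 6-node path graph; the new graph would reuse the
# original graph's class label as its (weak) label.
original = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]
augmented = random_mapping(range(6), original, budget_ratio=0.4, seed=0)
```

Because the augmented graph may no longer belong to the original class, a label assigned this way is only weak, which is presumably why the framework pairs augmentation with a data filtration step before retraining.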

Keywords

data augmentation, graph generation, graph representation learning

Cite

@article{arxiv.2009.09863,
  title  = {Data Augmentation for Graph Classification},
  author = {Jiajun Zhou and Jie Shen and Qi Xuan},
  journal= {arXiv preprint arXiv:2009.09863},
  year   = {2020}
}

Comments

Accepted by CIKM 2020. arXiv admin note: substantial text overlap with arXiv:2007.05700
