Beyond Homophily in Graph Neural Networks: Current Limitations and Effective Designs

We investigate the representation power of graph neural networks in the semi-supervised node classification task under heterophily or low homophily, i.e., in networks where connected nodes may have different class labels and dissimilar features. Many popular GNNs fail to generalize to this setting, and are even outperformed by models that ignore the graph structure (e.g., multilayer perceptrons). Motivated by this limitation, we identify a set of key designs -- ego- and neighbor-embedding separation, higher-order neighborhoods, and combination of intermediate representations -- that boost learning from the graph structure under heterophily. We combine them into a graph neural network, H2GCN, which we use as the base method to empirically evaluate the effectiveness of the identified designs. Going beyond the traditional benchmarks with strong homophily, our empirical analysis shows that the identified designs increase the accuracy of GNNs by up to 40% and 27% over models without them on synthetic and real networks with heterophily, respectively, and yield competitive performance under homophily.
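The three designs named in the abstract can be illustrated with a small NumPy sketch. This is not the authors' H2GCN implementation: the function names (`edge_homophily`, `row_normalize`, `h2gcn_style_embed`), the row-mean normalization, and the toy path graph are all assumptions made for illustration, and the learned feature embedding and classifier are omitted. It assumes the common definition of edge homophily as the fraction of edges whose endpoints share a label.

```python
import numpy as np


def edge_homophily(adj, labels):
    """Fraction of undirected edges whose endpoints share a class label."""
    src, dst = np.nonzero(np.triu(adj, k=1))  # count each edge once
    return float(np.mean(labels[src] == labels[dst]))


def row_normalize(mat):
    """Divide each row by its sum; all-zero rows stay zero."""
    deg = mat.sum(axis=1, keepdims=True)
    return np.divide(mat, deg, out=np.zeros_like(mat, dtype=float), where=deg > 0)


def h2gcn_style_embed(adj, feats, rounds=2):
    """Parameter-free aggregation illustrating the three designs (assumed form)."""
    a1 = (adj > 0).astype(float)
    np.fill_diagonal(a1, 0.0)            # no self-loops: the ego stays separate
    a2 = ((a1 @ a1) > 0).astype(float)   # nodes reachable by a 2-step walk
    np.fill_diagonal(a2, 0.0)
    a2 *= 1.0 - a1                       # keep only nodes not already 1-hop neighbors
    a1n, a2n = row_normalize(a1), row_normalize(a2)

    reps = [feats]                       # ego representation, never mixed with neighbors
    h = feats
    for _ in range(rounds):
        # Higher-order neighborhoods: 1-hop and 2-hop means kept side by side.
        h = np.concatenate([a1n @ h, a2n @ h], axis=1)
        reps.append(h)
    # Combination of intermediate representations: concatenate every round.
    return np.concatenate(reps, axis=1)


# Toy heterophilous graph: path 0-1-2-3 with alternating labels.
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]])
labels = np.array([0, 1, 0, 1])
feats = np.eye(2)[labels]                # one-hot labels stand in for node features

h = edge_homophily(adj, labels)
emb = h2gcn_style_embed(adj, feats, rounds=1)
```

In this toy graph every edge joins nodes of different classes, so the 1-hop mean for node 0 points at the opposite class while the 2-hop mean points back at its own class. Keeping the ego, 1-hop, and 2-hop blocks in separate columns lets a downstream classifier weight them differently, which is the intuition behind separating them rather than averaging everything together.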

Published at NeurIPS 2020.
| Task | Dataset | Model | Metric | Value | Global Rank |
|---|---|---|---|---|---|
| Node Classification | Actor | H2GCN-1 | Accuracy | 34.31 ± 1.31 | #41 |
| Node Classification | Actor | H2GCN-2 | Accuracy | 34.49 ± 1.63 | #40 |
| Node Classification | Chameleon | H2GCN-1 | Accuracy | 52.96 ± 2.09 | #51 |
| Node Classification | Chameleon | H2GCN-2 | Accuracy | 58.38 ± 1.76 | #50 |
| Node Classification on Non-Homophilic (Heterophilic) Graphs | Chameleon (48%/32%/20% fixed splits) | H2GCN 1:1 | Accuracy | 60.11 ± 2.15 | #24 |
| Node Classification | Chameleon (60%/20%/20% random splits) | H2GCN 1:1 | Accuracy | 52.30 ± 0.48 | #32 |
| Node Classification on Non-Homophilic (Heterophilic) Graphs | Chameleon (60%/20%/20% random splits) | H2GCN 1:1 | Accuracy | 52.30 ± 0.48 | #28 |
| Node Classification | Citeseer (48%/32%/20% fixed splits) | H2GCN 1:1 | Accuracy | 77.11 ± 1.57 | #13 |
| Node Classification | Cora (48%/32%/20% fixed splits) | H2GCN 1:1 | Accuracy | 87.87 ± 1.20 | #15 |
| Node Classification | Cora (60%/20%/20% random splits) | H2GCN 1:1 | Accuracy | 87.52 ± 0.61 | #21 |
| Node Classification | Cornell | H2GCN-1 | Accuracy | 78.11 ± 6.68 | #38 |
| Node Classification | Cornell | H2GCN-2 | Accuracy | 79.46 ± 4.80 | #37 |
| Node Classification on Non-Homophilic (Heterophilic) Graphs | Cornell (48%/32%/20% fixed splits) | H2GCN 1:1 | Accuracy | 82.70 ± 5.28 | #15 |
| Node Classification on Non-Homophilic (Heterophilic) Graphs | Cornell (60%/20%/20% random splits) | H2GCN 1:1 | Accuracy | 86.23 ± 4.71 | #22 |
| Node Classification | Cornell (60%/20%/20% random splits) | H2GCN 1:1 | Accuracy | 86.23 ± 4.71 | #22 |
| Node Classification on Non-Homophilic (Heterophilic) Graphs | Deezer-Europe | H2GCN 1:1 | Accuracy | 67.22 ± 0.90 | #5 |
| Node Classification on Non-Homophilic (Heterophilic) Graphs | Film (48%/32%/20% fixed splits) | H2GCN 1:1 | Accuracy | 35.70 ± 1.00 | #19 |
| Node Classification on Non-Homophilic (Heterophilic) Graphs | Penn94 | H2GCN 1:1 | Accuracy | 81.31 ± 0.60 | #16 |
| Node Classification | Penn94 | H2GCN | Accuracy | 81.31 ± 0.60 | #16 |
| Node Classification | PubMed (48%/32%/20% fixed splits) | H2GCN 1:1 | Accuracy | 89.49 ± 0.38 | #9 |
| Node Classification | PubMed (60%/20%/20% random splits) | H2GCN 1:1 | Accuracy | 87.78 ± 0.28 | #26 |
| Node Classification | Squirrel | H2GCN-1 | Accuracy | 28.98 ± 1.97 | #52 |
| Node Classification | Squirrel | H2GCN-2 | Accuracy | 32.33 ± 1.94 | #50 |
| Node Classification on Non-Homophilic (Heterophilic) Graphs | Squirrel (48%/32%/20% fixed splits) | H2GCN 1:1 | Accuracy | 36.48 ± 1.86 | #26 |
| Node Classification | Squirrel (60%/20%/20% random splits) | H2GCN 1:1 | Accuracy | 30.39 ± 1.22 | #35 |
| Node Classification | Texas | H2GCN-1 | Accuracy | 83.24 ± 7.07 | #35 |
| Node Classification | Texas | H2GCN-2 | Accuracy | 80.00 ± 6.77 | #43 |
| Node Classification | Texas (60%/20%/20% random splits) | H2GCN 1:1 | Accuracy | 85.90 ± 3.53 | #22 |
| Node Classification on Non-Homophilic (Heterophilic) Graphs | Texas (60%/20%/20% random splits) | H2GCN 1:1 | Accuracy | 85.90 ± 3.53 | #20 |
| Node Classification | Wisconsin | H2GCN-1 | Accuracy | 84.31 ± 3.70 | #37 |
| Node Classification | Wisconsin | H2GCN-2 | Accuracy | 83.14 ± 4.26 | #39 |
| Node Classification on Non-Homophilic (Heterophilic) Graphs | Wisconsin (48%/32%/20% fixed splits) | H2GCN 1:1 | Accuracy | 87.65 ± 4.98 | #10 |
| Node Classification | Wisconsin (60%/20%/20% random splits) | H2GCN 1:1 | Accuracy | 87.5 ± 1.77 | #20 |
| Node Classification on Non-Homophilic (Heterophilic) Graphs | Wisconsin (60%/20%/20% random splits) | H2GCN 1:1 | Accuracy | 87.5 ± 1.77 | #20 |
