Multi-hop Attention Graph Neural Network

doi:10.48550/arXiv.2009.14332

Multi-hop Attention Graph Neural Network

Self-attention mechanism in graph neural networks (GNNs) led to state-of-the-art performance on many graph representation learning tasks. Currently, at every layer, attention is computed between connected pairs of nodes and depends solely on the representation of the two nodes. However, such attention mechanism does not account for nodes that are not directly connected but provide important network context. Here we propose Multi-hop Attention Graph Neural Network (MAGNA), a principled way to incorporate multi-hop context information into every layer of attention computation. MAGNA diffuses the attention scores across the network, which increases the receptive field for every layer of the GNN. Unlike previous approaches, MAGNA uses a diffusion prior on attention values, to efficiently account for all paths between the pair of disconnected nodes. We demonstrate in theory and experiments that MAGNA captures large-scale structural information in every layer, and has a low-pass effect that eliminates noisy high-frequency information from graph data. Experimental results on node classification as well as the knowledge graph completion benchmarks show that MAGNA achieves state-of-the-art results: MAGNA achieves up to 5.7 percent relative error reduction over the previous state-of-the-art on Cora, Citeseer, and Pubmed. MAGNA also obtains the best performance on a large-scale Open Graph Benchmark dataset. On knowledge graph completion MAGNA advances state-of-the-art on WN18RR and FB15k-237 across four different performance metrics.

Publication:

arXiv e-prints

Pub Date:

September 2020

DOI:

10.48550/arXiv.2009.14332

arXiv:

arXiv:2009.14332

Bibcode:

2020arXiv200914332W

Keywords:

Computer Science - Machine Learning;
Statistics - Machine Learning

E-Print:

Accepted by IJCAI 2021

NASA/ADS

Multi-hop Attention Graph Neural Network

Abstract