Attention-wise masked graph contrastive learning for predicting molecular property
Abstract
Accurate and efficient prediction of the molecular properties of drugs is one of the fundamental problems in drug research and development. Recent advancements in representation learning have been shown to greatly improve the performance of molecular property prediction. However, due to limited labeled data, supervised learning-based molecular representation algorithms can only search limited chemical space, which results in poor generalizability. In this work, we proposed a self-supervised representation learning framework for large-scale unlabeled molecules. We developed a novel molecular graph augmentation strategy, referred to as attention-wise graph mask, to generate challenging positive sample for contrastive learning. We adopted the graph attention network (GAT) as the molecular graph encoder, and leveraged the learned attention scores as masking guidance to generate molecular augmentation graphs. By minimization of the contrastive loss between original graph and masked graph, our model can capture important molecular structure and higher-order semantic information. Extensive experiments showed that our attention-wise graph mask contrastive learning exhibit state-of-the-art performance in a couple of downstream molecular property prediction tasks.
- Publication:
-
arXiv e-prints
- Pub Date:
- May 2022
- DOI:
- 10.48550/arXiv.2206.08262
- arXiv:
- arXiv:2206.08262
- Bibcode:
- 2022arXiv220608262L
- Keywords:
-
- Quantitative Biology - Biomolecules;
- Computer Science - Machine Learning