Early screening of potential breakthrough technologies with enhanced interpretability: A patent-specific hierarchical attention network model
Abstract
Despite the usefulness of machine learning approaches for the early screening of potential breakthrough technologies, their practicality is often hindered by opaque models. To address this, we propose an interpretable machine learning approach to predicting future citation counts from patent texts using a patent-specific hierarchical attention network (PatentHAN) model. Central to this approach are (1) a patent-specific pre-trained language model, capturing the meanings of technical words in patent claims, (2) a hierarchical network structure, enabling detailed analysis at the claim level, and (3) a claim-wise self-attention mechanism, revealing pivotal claims during the screening process. A case study of 35,376 pharmaceutical patents demonstrates the effectiveness of our approach in early screening of potential breakthrough technologies while ensuring interpretability. Furthermore, we conduct additional analyses using different language models and claim types to examine the robustness of the approach. It is expected that the proposed approach will enhance expert-machine collaboration in identifying breakthrough technologies, providing new insight derived from text mining into technological value.
- Publication:
-
arXiv e-prints
- Pub Date:
- July 2024
- DOI:
- 10.48550/arXiv.2407.16939
- arXiv:
- arXiv:2407.16939
- Bibcode:
- 2024arXiv240716939C
- Keywords:
-
- Computer Science - Computation and Language;
- Computer Science - Artificial Intelligence