TwHIN: Embedding the Twitter Heterogeneous Information Network for Personalized Recommendation
Abstract
Social networks, such as Twitter, form a heterogeneous information network (HIN) where nodes represent domain entities (e.g., user, content, advertiser, etc.) and edges represent one of many entity interactions (e.g, a user re-sharing content or "following" another). Interactions from multiple relation types can encode valuable information about social network entities not fully captured by a single relation; for instance, a user's preference for accounts to follow may depend on both user-content engagement interactions and the other users they follow. In this work, we investigate knowledge-graph embeddings for entities in the Twitter HIN (TwHIN); we show that these pretrained representations yield significant offline and online improvement for a diverse range of downstream recommendation and classification tasks: personalized ads rankings, account follow-recommendation, offensive content detection, and search ranking. We discuss design choices and practical challenges of deploying industry-scale HIN embeddings, including compressing them to reduce end-to-end model latency and handling parameter drift across versions.
- Publication:
-
arXiv e-prints
- Pub Date:
- February 2022
- DOI:
- 10.48550/arXiv.2202.05387
- arXiv:
- arXiv:2202.05387
- Bibcode:
- 2022arXiv220205387E
- Keywords:
-
- Computer Science - Social and Information Networks
- E-Print:
- Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (2022)