LogLAB: Attention-Based Labeling of Log Data Anomalies via Weak Supervision
Abstract
With increasing scale and complexity of cloud operations, automated detection of anomalies in monitoring data such as logs will be an essential part of managing future IT infrastructures. However, many methods based on artificial intelligence, such as supervised deep learning models, require large amounts of labeled training data to perform well. In practice, this data is rarely available because labeling log data is expensive, time-consuming, and requires a deep understanding of the underlying system. We present LogLAB, a novel modeling approach for automated labeling of log messages without requiring manual work by experts. Our method relies on estimated failure time windows provided by monitoring systems to produce precise labeled datasets in retrospect. It is based on the attention mechanism and uses a custom objective function for weak supervision deep learning techniques that accounts for imbalanced data. Our evaluation shows that LogLAB consistently outperforms nine benchmark approaches across three different datasets and maintains an F1-score of more than 0.98 even at large failure time windows.
- Publication:
-
arXiv e-prints
- Pub Date:
- November 2021
- DOI:
- 10.48550/arXiv.2111.01657
- arXiv:
- arXiv:2111.01657
- Bibcode:
- 2021arXiv211101657W
- Keywords:
-
- Computer Science - Machine Learning
- E-Print:
- Paper accepted on ICSOC 2021 and published on springer