Active Learning Framework to Automate Network Traffic Classification
Abstract
Recent network traffic classification methods benefit from machine learning (ML) technology. However, the use of ML brings many challenges, such as the lack of high-quality annotated datasets, data drift and other effects that cause datasets and ML models to age, and high volumes of network traffic. This paper argues that it is necessary to augment traditional workflows of ML training and deployment and to adapt the Active Learning concept to network traffic analysis. The paper presents a novel Active Learning Framework (ALF) to address this topic. ALF provides prepared software components that can be used to deploy an active learning loop and maintain an ALF instance that continuously evolves a dataset and ML model automatically. The resulting solution is deployable for IP flow-based analysis of high-speed (100 Gb/s) networks, and also supports research experiments on different strategies and methods for annotation, evaluation, dataset optimization, etc. Finally, the paper lists some research challenges that emerge from the first experiments with ALF in practice.
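To make the idea of an active learning loop concrete, the following is a minimal sketch, not ALF's actual code or API. It assumes a single hypothetical flow feature (mean packet size in bytes), a nearest-centroid classifier as a stand-in model, margin-based uncertainty sampling as the query strategy, and a hypothetical `oracle` function playing the role of the annotator. The abstract does not specify any of these choices; all names and thresholds are illustrative.

```python
import random

def train_centroids(labeled):
    """Stand-in model: per-class mean of the feature vectors."""
    sums, counts = {}, {}
    for x, y in labeled:
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [s / counts[y] for s in acc] for y, acc in sums.items()}

def distances(model, x):
    """Sorted (Euclidean distance, label) pairs, nearest class first."""
    return sorted(
        (sum((a - b) ** 2 for a, b in zip(x, c)) ** 0.5, y)
        for y, c in model.items()
    )

def predict(model, x):
    return distances(model, x)[0][1]

def margin(model, x):
    """Gap between the two nearest classes; a small gap means an uncertain sample."""
    d = distances(model, x)
    return d[1][0] - d[0][0] if len(d) > 1 else 0.0

def active_learning_loop(labeled, pool, oracle, rounds, batch_size):
    """Repeatedly annotate the most uncertain samples and retrain."""
    labeled, pool = list(labeled), list(pool)
    model = train_centroids(labeled)
    for _ in range(rounds):
        if not pool:
            break
        pool.sort(key=lambda x: margin(model, x))          # most uncertain first
        queries, pool = pool[:batch_size], pool[batch_size:]
        labeled.extend((x, oracle(x)) for x in queries)    # annotation step
        model = train_centroids(labeled)                   # dataset and model evolve
    return model, labeled

# Hypothetical annotator: the 700-byte threshold is an assumption for the demo.
def oracle(x):
    return "bulk" if x[0] > 700 else "web"

random.seed(0)
seed_set = [((100.0,), "web"), ((1400.0,), "bulk")]
pool = [(random.uniform(40, 1500),) for _ in range(200)]   # synthetic unlabeled flows
model, labeled = active_learning_loop(seed_set, pool, oracle, rounds=5, batch_size=4)
print(len(labeled), "labeled samples after the loop")
```

In a real deployment the pool would be fed continuously from flow exporters and the annotator could be any labeling source; here both are synthetic so that the sketch stays self-contained.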
- Publication: arXiv e-prints
- Pub Date: October 2022
- DOI: 10.48550/arXiv.2211.08399
- arXiv: arXiv:2211.08399
- Bibcode: 2022arXiv221108399P
- Keywords: Computer Science - Networking and Internet Architecture; Computer Science - Artificial Intelligence; Computer Science - Machine Learning