On Identifying Anomalies in Tor Usage with Applications in Detecting Internet Censorship
Abstract
We develop a means to detect ongoing per-country anomalies in the daily usage metrics of the Tor anonymous communication network, and demonstrate the applicability of this technique to identifying likely periods of internet censorship and related events. The presented approach identifies contiguous anomalous periods, rather than daily spikes or drops, and allows anomalies to be ranked according to deviation from expected behaviour. The developed method is implemented as a running tool, with outputs published daily by mailing list. This list highlights per-country anomalous Tor usage, and produces a daily ranking of countries according to the level of detected anomalous behaviour. This list has been active since August 2016, and is in use by a number of individuals, academics, and NGOs as an early warning system for potential censorship events. We focus on Tor, however the presented approach is more generally applicable to usage data of other services, both individually and in combination. We demonstrate that combining multiple data sources allows more specific identification of likely Tor blocking events. We demonstrate the our approach in comparison to existing anomaly detection tools, and against both known historical internet censorship events and synthetic datasets. Finally, we detail a number of significant recent anomalous events and behaviours identified by our tool.
- Publication:
-
arXiv e-prints
- Pub Date:
- July 2015
- DOI:
- 10.48550/arXiv.1507.05819
- arXiv:
- arXiv:1507.05819
- Bibcode:
- 2015arXiv150705819W
- Keywords:
-
- Computer Science - Computers and Society;
- Computer Science - Machine Learning;
- Computer Science - Networking and Internet Architecture
- E-Print:
- To appear in ACM WebSci 2018