Automated generation of web server fingerprints
Abstract
In this paper, we demonstrate that it is possible to automatically generate fingerprints for various web server types using multifactor Bayesian inference on randomly selected servers on the Internet, without building an a priori catalog of server features or behaviors. This makes it possible to conclusively study web server distribution without relying on reported (and variable) version strings. We gather data by sending a collection of specialized requests to 110,000 live web servers. Using only the server response codes, we then train an algorithm to successfully predict server types independently of the server version string. In the process, we note several distinguishing features of current web infrastructure.
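The abstract does not spell out the inference procedure, so the following is only a minimal sketch of how response codes alone could drive a Bayesian classifier. It assumes a naive Bayes model in which each probe request is treated as an independent factor; that independence assumption, the Laplace smoothing, the names `train` and `predict`, and the tiny training set are all hypothetical illustrations, not the authors' method or data.

```python
import math
from collections import Counter, defaultdict


def train(samples, alpha=1.0):
    """Count response codes per (server label, probe index).

    samples: list of (label, codes), where codes is a tuple holding one
    HTTP response code per probe request. alpha is the Laplace smoothing
    constant (an assumption of this sketch).
    """
    label_counts = Counter()
    # code_counts[label][probe_index][code] -> number of occurrences
    code_counts = defaultdict(lambda: defaultdict(Counter))
    # seen_codes[probe_index] -> set of codes observed for that probe
    seen_codes = defaultdict(set)
    for label, codes in samples:
        label_counts[label] += 1
        for i, code in enumerate(codes):
            code_counts[label][i][code] += 1
            seen_codes[i].add(code)
    return {"labels": label_counts, "codes": code_counts,
            "seen": seen_codes, "alpha": alpha}


def predict(model, codes):
    """Return the label with the highest log-posterior for the given codes."""
    total = sum(model["labels"].values())
    alpha = model["alpha"]
    best_label, best_score = None, -math.inf
    for label, n in model["labels"].items():
        score = math.log(n / total)  # log prior
        for i, code in enumerate(codes):
            # One extra slot so codes never seen in training keep nonzero mass.
            vocab = len(model["seen"][i]) + 1
            count = model["codes"][label][i][code]
            score += math.log((count + alpha) / (n + alpha * vocab))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical training data: labels and response codes are made up.
samples = [
    ("server_a", (200, 400, 501)),
    ("server_a", (200, 400, 501)),
    ("server_b", (200, 400, 405)),
    ("server_b", (200, 400, 405)),
]
model = train(samples)
guess = predict(model, (200, 400, 405))
```

In this toy data the two server types differ only on the third probe, so that single factor decides the prediction. In practice the training labels would presumably come from the reported version strings, with the trained model then applied independently of them.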
- Publication: arXiv e-prints
- Pub Date: May 2013
- DOI: 10.48550/arXiv.1305.0245
- arXiv: arXiv:1305.0245
- Bibcode: 2013arXiv1305.0245B
- Keywords: Computer Science - Cryptography and Security; Computer Science - Networking and Internet Architecture