Random Walks on Directed Networks: Inference and Respondent-driven Sampling
Abstract
Respondent-driven sampling (RDS) is a method often used to estimate population properties (e.g., sexual risk behavior) in hard-to-reach populations. It combines an effective modified snowball sampling methodology with an estimation procedure that yields unbiased population estimates under the assumption that the sampling process behaves like a random walk on the social network of the population. Current RDS estimation methodology assumes that the social network is undirected, i.e., that all edges are reciprocal. However, empirical social networks generally also contain non-reciprocated edges. To account for this, we develop a new estimation method for RDS in the presence of directed edges, based on random walks on directed networks. We distinguish between directed and undirected edges and consider the possibility that the random walk returns to its current position in two steps through an undirected edge. We derive estimators of the selection probabilities of individuals as a function of the number of outgoing edges of sampled individuals. We evaluate the performance of the proposed estimators on artificial and empirical networks and show that they generally perform better than existing methods, particularly when the fraction of directed edges in the network is large.
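The motivation stated in the abstract can be illustrated with a small numerical sketch. On a connected undirected network, the stationary distribution of a simple random walk is proportional to node degree, which is the standard fact behind degree-based RDS weights. Once some edges are one-way, that proportionality generally no longer holds. The code below is not the estimator derived in the paper; it is only a toy demonstration. The four-node networks, the helper name `stationary_distribution`, and the use of power iteration are all assumptions made for illustration.

```python
import numpy as np


def stationary_distribution(adj, iters=10_000, tol=1e-12):
    """Stationary distribution of a simple random walk by power iteration.

    adj[i, j] = 1 means the walk can step from node i to node j.
    Assumes every node has at least one outgoing edge and that the
    walk is irreducible and aperiodic (true for the toy networks below).
    """
    adj = np.asarray(adj, dtype=float)
    out_deg = adj.sum(axis=1)
    transition = adj / out_deg[:, None]  # row-stochastic transition matrix
    pi = np.full(len(adj), 1.0 / len(adj))  # start from the uniform distribution
    for _ in range(iters):
        nxt = pi @ transition
        if np.abs(nxt - pi).sum() < tol:
            return nxt
        pi = nxt
    return pi


# Hypothetical undirected network: triangle 0-1-2 plus node 3 attached to node 0.
undirected = np.array([
    [0, 1, 1, 1],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
])

# Same network, but the edge between 1 and 2 is now one-way (1 -> 2 only).
directed = undirected.copy()
directed[2, 1] = 0

pi_undirected = stationary_distribution(undirected)
pi_directed = stationary_distribution(directed)

deg_undirected = undirected.sum(axis=1)
out_deg_directed = directed.sum(axis=1)

print("undirected: stationary      ", pi_undirected)
print("undirected: degree / sum    ", deg_undirected / deg_undirected.sum())
print("directed:   stationary      ", pi_directed)
print("directed:   out-degree / sum", out_deg_directed / out_deg_directed.sum())
```

In the undirected case the two printed vectors agree, while in the directed case the stationary probabilities differ from the normalized out-degrees, so weighting by reported degree alone would misstate how often each node is visited in this toy setting.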
- Publication: arXiv e-prints
- Pub Date: August 2013
- DOI: 10.48550/arXiv.1308.3600
- arXiv: arXiv:1308.3600
- Bibcode: 2013arXiv1308.3600M
- Keywords: Statistics - Methodology; Computer Science - Social and Information Networks; Physics - Physics and Society
- E-Print: 31 pages, 5 figures, 4 tables