A non-parametric extension of control variates is presented. These leverage gradient information on the sampling density to achieve substantial variance reduction. It is not required that the sampling density be normalised. The novel contribution of this work is based on two important insights; (i) a trade-off between random sampling and deterministic approximation and (ii) a new gradient-based function space derived from Stein's identity. Unlike classical control variates, our estimators achieve super-root-$n$ convergence, often requiring orders of magnitude fewer simulations to achieve a fixed level of precision. Theoretical and empirical results are presented, the latter focusing on integration problems arising in hierarchical models and models based on non-linear ordinary differential equations.