We provide benchmark comparisons of two finite element (FE) mantle convection codes, CITCOM and CONMAN, against analytic solutions for Stokes' flow for strongly varying viscosity in the horizontal and vertical directions. The two codes use a similar FE formulation but different methods for solving the resulting equations. They both obtain accurate velocity, pressure and surface topography for viscosity structures which vary rapidly over a short distance, or discontinuously. The benchmarks help determine how many elements are needed to resolve a region of, for example, a convection simulation with high viscosity gradients. The overall accuracy does not depend on the global viscosity variation but on the gradients within individual elements. As a rule of thumb, accuracy can fall below 1% when there is a viscosity variation greater than a factor of two or three in an element. For iterative solution methods, necessary in three-dimensional modelling, these guidelines are required to determine the number of iterations to perform. We discuss a penalty based technique which improves the convergence of iterative solvers for general problems in which high viscosity gradients occurs spontaneously.