Climate model ensembles are used to estimate uncertainty in future projections, typically by interpreting the ensemble distribution for a particular variable probabilistically. There are, however, different ways to produce climate model ensembles that yield different results, and therefore different probabilities for a future change in a variable. Perhaps equally important, there are different approaches to interpreting the ensemble distribution that lead to different conclusions. Here we use a reduced-resolution climate system model to compare three common ways to generate ensembles: initial conditions perturbation, physical parameter perturbation, and structural changes. Although these three approaches conceptually represent very different categories of uncertainty within a modelling system, they can be very difficult to separate when simulations are compared with observations of surface air temperature. Using the twentieth-century CMIP5 ensemble for comparison, we show that initial conditions ensembles, which in theory represent internal variability, significantly underestimate observed variance. Structural ensembles, perhaps less surprisingly, exhibit over-dispersion in simulated variance. We argue that future climate model ensembles may need to include parameter or structural perturbation members in addition to perturbed initial conditions members to ensure that they sample uncertainty due to internal variability more completely. We note that where ensembles are over- or under-dispersive, as the CMIP5 ensemble is, estimates of uncertainty need to be treated with care.