A fundamental concept in two-arm non-parametric survival analysis is the comparison of observed versus expected numbers of events on one of the treatment arms (the choice of which arm is arbitrary), where the expectation is taken assuming that the true survival curves in the two arms are identical. This concept is at the heart of the counting-process theory that provides a rigorous basis for methods such as the log-rank test. It is natural, therefore, to maintain this perspective when extending the log-rank test to deal with non-proportional hazards, for example by considering a weighted sum of the "observed - expected" terms, where larger weights are given to time periods where the hazard ratio is expected to favour the experimental treatment. In doing so, however, one may stumble across some rather subtle issues, related to the difficulty in ascribing a causal interpretation to hazard ratios, that may lead to strange conclusions. An alternative approach is to view non-parametric survival comparisons as permutation tests. With this perspective, one can easily improve on the efficiency of the log-rank test, whilst thoroughly controlling the false positive rate. In particular, for the field of immuno-oncology, where researchers often anticipate a delayed treatment effect, sample sizes could be substantially reduced without loss of power.