We present a scalable approach for simplifying the stability analysis of cluster synchronization patterns on directed networks. When a network has directional couplings, decomposition of the coupling matrix into independent blocks (which in turn decouples the variational equation) is no longer adequate to reveal the full relations among perturbation modes. Instead, it is often necessary to introduce directional dependencies among the blocks and establish hierarchies among perturbation modes. For this purpose, we develop an algorithm that finds the simultaneous block upper triangularization of sets of asymmetric matrices, which generalizes the Jordan canonical decomposition from a single matrix to an arbitrary number of matrices. The block upper triangularization orders subspaces of the variational equation in a directional manner, allowing the stability of perturbation modes to be analyzed in sequence. We show that our algorithm gives the greatest possible simplification under mild assumptions, both in terms of the sizes of the blocks and in terms of the number of nonzero upper triangular entries linking the blocks.