The problem of wideband massive MIMO channel estimation is considered. Targeting low-complexity algorithms as well as small training overhead, a compressive sensing (CS) approach is pursued. Unfortunately, because the sensing (measurement) matrix in this setup has a Kronecker structure, standard CS algorithms and analysis methodology do not apply. By recognizing that the channel possesses a special structure, termed hierarchical sparsity, we propose an efficient algorithm that explicitly takes this property into account. In addition, by extending the standard CS analysis methodology to hierarchically sparse vectors, we provide a rigorous analysis of the algorithm's performance in terms of the estimation error and the number of pilot subcarriers required to achieve it. Small training overhead, in turn, means a larger number of supported users in a cell and potentially improved pilot decontamination. We believe that this is the first paper to draw a rigorous connection between the hierarchical sparsity framework and Kronecker measurements. Numerical results verify the advantage of the proposed approach over standard CS algorithms in this setting.
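To make the notion of hierarchical sparsity concrete, the following is a minimal sketch of a hierarchical thresholding step. It is an illustration under stated assumptions, not the algorithm proposed in the paper. It assumes the common definition in which a vector is split into equal-length blocks, at most `s` blocks are nonzero, and each nonzero block has at most `sigma` nonzero entries. The function name `hierarchical_threshold` and all of its parameters are hypothetical.

```python
import numpy as np


def hierarchical_threshold(x, n_blocks, block_len, s, sigma):
    """Project x onto (s, sigma)-hierarchically sparse vectors (assumed definition).

    x is viewed as n_blocks consecutive blocks of length block_len.
    Step 1: inside every block keep only the sigma largest-magnitude entries.
    Step 2: keep only the s blocks whose pruned versions carry the most energy.
    """
    blocks = np.asarray(x).reshape(n_blocks, block_len)

    # Step 1: per-block indices of the sigma largest-magnitude entries.
    top_idx = np.argsort(-np.abs(blocks), axis=1)[:, :sigma]
    rows = np.arange(n_blocks)[:, None]
    pruned = np.zeros_like(blocks)
    pruned[rows, top_idx] = blocks[rows, top_idx]

    # Step 2: rank blocks by the energy that survives the pruning.
    energy = np.sum(np.abs(pruned) ** 2, axis=1)
    keep = np.argsort(-energy)[:s]

    out = np.zeros_like(blocks)
    out[keep] = pruned[keep]
    return out.reshape(-1)


# Toy example: 3 blocks of length 3, keep 2 blocks with 1 entry each.
x = np.array([3.0, 0.1, -2.0, 0.5, 0.4, 0.3, 0.0, 5.0, 1.0])
x_hs = hierarchical_threshold(x, n_blocks=3, block_len=3, s=2, sigma=1)
```

The two-stage selection is what distinguishes this operator from plain hard thresholding, which would keep the largest entries overall without regard to the block they belong to.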