In recent years, deep learning has become the mainstream data-driven approach to solving many important real-world problems. Successful network architectures routinely use shortcut connections, which feed the outputs of earlier layers as additional inputs to later layers, and these connections have produced excellent results. Despite the effectiveness of shortcuts, important questions remain about their underlying mechanism and associated functionalities. For example, why are shortcuts powerful? Why do shortcuts generalize well? To address these questions, we investigate the representation and generalization ability of a sparse shortcut topology. Specifically, we first demonstrate that this topology can empower a one-neuron-wide deep network to approximate any univariate continuous function. Then, we present a novel width-bounded universal approximator, in contrast to depth-bounded universal approximators, and extend the approximation result to a family of networks that are equally competent in terms of approximation ability. Furthermore, we use generalization bound theory to show that the investigated shortcut topology enjoys excellent generalizability. Finally, we corroborate our theoretical analyses with experiments on well-known benchmarks.