Although the sphere decoder (SD) is a powerful detector for multiple-input multiple-output (MIMO) systems, it becomes computationally prohibitive in massive MIMO systems, where a large number of antennas are employed. To overcome this challenge, we propose the fast deep learning (DL)-aided SD (FDL-SD) and the fast DL-aided $K$-best SD (FDL-KSD) algorithms. In both, DL is used mainly to generate a highly reliable initial candidate, which accelerates the search in SD and $K$-best SD (KSD) in conjunction with candidate/layer ordering and early rejection. Compared with existing DL-aided SD schemes, the proposed schemes are more advantageous in both the offline training and online application phases. In particular, unlike existing DL-aided SD schemes, they do not require running the conventional SD in the training phase. For a $24 \times 24$ MIMO system with QPSK, the proposed FDL-SD achieves a complexity reduction of more than $90\%$ without any performance loss compared with conventional SD schemes. For a $32 \times 32$ MIMO system with QPSK, the proposed FDL-KSD requires only $K = 32$ to attain the performance of the conventional KSD with $K = 256$, where $K$ is the number of surviving paths in KSD. This implies a dramatic improvement in the performance--complexity tradeoff of the proposed FDL-KSD scheme.
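
To illustrate why a reliable initial candidate can accelerate the SD search, the following minimal sketch runs a depth-first sphere decoder on a small real-valued system with symbols in $\{-1, +1\}$, once with an infinite initial radius and once with the radius set by an initial candidate. It is not the proposed FDL-SD: the function names `sphere_decode` and `demo` are hypothetical, the initial candidate comes from a zero-forcing estimate that merely stands in for the DL output, and the candidate/layer ordering and early rejection steps are omitted.

```python
import numpy as np


def sphere_decode(H, y, symbols=(-1.0, 1.0), x_init=None):
    """Depth-first sphere decoder for a square real-valued system.

    Returns (x_hat, visited), where visited counts the tree nodes examined.
    If x_init is given, its distance sets the initial search radius.
    """
    n = H.shape[1]
    Q, R = np.linalg.qr(H)          # H = Q R with R upper triangular
    z = Q.T @ y                     # ||y - H x||^2 == ||z - R x||^2 for square H
    if x_init is None:
        best, radius = None, np.inf
    else:
        best = np.array(x_init, dtype=float)
        radius = float(np.sum((z - R @ best) ** 2))
    x = np.zeros(n)
    visited = 0

    def search(layer, partial):
        nonlocal best, radius, visited
        for s in symbols:           # fixed child order (no candidate ordering)
            x[layer] = s
            resid = z[layer] - R[layer, layer:] @ x[layer:]
            dist = partial + resid ** 2
            visited += 1
            if dist >= radius:      # outside the current sphere: prune
                continue
            if layer == 0:          # leaf inside the sphere: shrink the radius
                best, radius = x.copy(), dist
            else:
                search(layer - 1, dist)

    search(n - 1, 0.0)
    return best, visited


def demo(n=8, snr_db=12.0, seed=0):
    """Compare node counts with and without an initial candidate."""
    rng = np.random.default_rng(seed)
    H = rng.standard_normal((n, n))
    x_true = rng.choice([-1.0, 1.0], size=n)
    noise_std = np.sqrt(n / 10 ** (snr_db / 10))
    y = H @ x_true + noise_std * rng.standard_normal(n)
    # Assumption: a sliced zero-forcing estimate stands in for the DL output.
    x_zf = np.sign(np.linalg.solve(H, y))
    x_zf[x_zf == 0] = 1.0
    x_plain, nodes_plain = sphere_decode(H, y)
    x_warm, nodes_warm = sphere_decode(H, y, x_init=x_zf)
    return H, y, x_plain, nodes_plain, x_warm, nodes_warm


if __name__ == "__main__":
    _, _, x_plain, nodes_plain, x_warm, nodes_warm = demo()
    print("same solution:", np.array_equal(x_plain, x_warm))
    print("visited nodes without / with initial candidate:", nodes_plain, nodes_warm)
```

In this sketch both runs return the same maximum-likelihood solution, and the run with an initial candidate never visits more nodes, because its radius is never larger at any point of the traversal. How much is saved depends on the quality of the candidate, which is the role the abstract assigns to DL.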