We study the $k$-server problem in the resource augmentation setting i.e., when the performance of the online algorithm with $k$ servers is compared to the offline optimal solution with $h \leq k$ servers. The problem is very poorly understood beyond uniform metrics. For this special case, the classic $k$-server algorithms are roughly $(1+1/\epsilon)$-competitive when $k=(1+\epsilon) h$, for any $\epsilon >0$. Surprisingly however, no $o(h)$-competitive algorithm is known even for HSTs of depth 2 and even when $k/h$ is arbitrarily large. We obtain several new results for the problem. First we show that the known $k$-server algorithms do not work even on very simple metrics. In particular, the Double Coverage algorithm has competitive ratio $\Omega(h)$ irrespective of the value of $k$, even for depth-2 HSTs. Similarly the Work Function Algorithm, that is believed to be optimal for all metric spaces when $k=h$, has competitive ratio $\Omega(h)$ on depth-3 HSTs even if $k=2h$. Our main result is a new algorithm that is $O(1)$-competitive for constant depth trees, whenever $k =(1+\epsilon )h$ for any $\epsilon > 0$. Finally, we give a general lower bound that any deterministic online algorithm has competitive ratio at least 2.4 even for depth-2 HSTs and when $k/h$ is arbitrarily large. This gives a surprising qualitative separation between uniform metrics and depth-2 HSTs for the $(h,k)$-server problem, and gives the strongest known lower bound for the problem on general metrics.