We consider the problem of learning a dictionary matrix from a number of observed signals, which are assumed to be generated via a linear model with a common underlying dictionary. In particular, we derive lower bounds on the minimum achievable worst case mean squared error (MSE), regardless of computational complexity of the dictionary learning (DL) schemes. By casting DL as a classical (or frequentist) estimation problem, the lower bounds on the worst case MSE are derived by following an established information-theoretic approach to minimax estimation. The main conceptual contribution of this paper is the adaption of the information-theoretic approach to minimax estimation for the DL problem in order to derive lower bounds on the worst case MSE of any DL scheme. We derive three different lower bounds applying to different generative models for the observed signals. The first bound applies to a wide range of models, it only requires the existence of a covariance matrix of the (unknown) underlying coefficient vector. By specializing this bound to the case of sparse coefficient distributions, and assuming the true dictionary satisfies the restricted isometry property, we obtain a lower bound on the worst case MSE of DL schemes in terms of a signal to noise ratio (SNR). The third bound applies to a more restrictive subclass of coefficient distributions by requiring the non-zero coefficients to be Gaussian. While, compared with the previous two bounds, the applicability of this final bound is the most limited it is the tightest of the three bounds in the low SNR regime.