We propose a topological learning algorithm for estimating the conditional dependency structure of large sets of random variables from sparse and noisy data. The algorithm, named Maximally Filtered Clique Forest (MFCF), produces a clique forest and an associated Markov Random Field (MRF) by generalising Prim's minimum spanning tree algorithm. To the best of our knowledge, the MFCF presents three elements of novelty with respect to existing structure learning approaches. The first is the repeated application of a local topological move, the clique expansion, that preserves the decomposability of the underlying graph. Through this move, decomposability is maintained and scores are calculated incrementally at the level of variables rather than edges, which provides better computational performance and allows an intuitive application of multivariate statistical tests. The second is the capability to accommodate a variety of score functions: while this paper focuses on multivariate normal distributions, the algorithm can be directly generalised to different types of statistics. The third is the variable range of allowed clique sizes, an adjustable topological constraint that acts as a topological penaliser and provides a way to tackle sparsity at the level of the $l_0$ semi-norm; this allows a clean decoupling of structure learning and parameter estimation. The MFCF produces a representation of the clique forest, together with a perfect ordering of the cliques and a perfect elimination ordering of the vertices. As an example, we propose an application to covariance selection models and show that the MFCF outperforms the Graphical Lasso for a number of classes of matrices.
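To make the clique expansion move concrete, the following is a minimal Python sketch of a greedy, Prim-like construction. It is an illustration only, not the authors' implementation: the function name `greedy_clique_tree`, the gain (sum of pairwise similarities between the new vertex and the separator), and the use of squared correlations as similarities are all assumptions. The sketch also applies no gain threshold, so it always returns a single clique tree rather than a forest. Each step attaches one new vertex to a subset of an existing clique, which keeps the graph decomposable; with `max_size=2` the procedure reduces to Prim's algorithm on the similarity matrix.

```python
import numpy as np


def greedy_clique_tree(W, max_size=3):
    """Greedy clique expansion on a symmetric similarity matrix W.

    Hypothetical helper for illustration. Returns (cliques, separators, order),
    where `order` is the sequence in which vertices were added.
    """
    if max_size < 2:
        raise ValueError("max_size must be at least 2")
    W = np.asarray(W, dtype=float)
    n = W.shape[0]

    # Seed with the most similar pair of variables.
    rows, cols = np.triu_indices(n, k=1)
    k = int(np.argmax(W[rows, cols]))
    first = {int(rows[k]), int(cols[k])}
    cliques = [set(first)]
    separators = []
    order = sorted(first)
    remaining = set(range(n)) - first

    while remaining:
        best = None
        for v in sorted(remaining):
            for idx, clique in enumerate(cliques):
                # Candidate separator: the clique members most similar to v,
                # capped so that the new clique has at most max_size vertices.
                sep = sorted(clique, key=lambda u: W[v, u], reverse=True)
                sep = sep[: max_size - 1]
                gain = sum(W[v, u] for u in sep)
                if best is None or gain > best[0]:
                    best = (gain, v, idx, frozenset(sep))
        _, v, idx, sep = best
        if len(sep) == len(cliques[idx]):
            # The whole clique is the separator: grow it in place.
            cliques[idx].add(v)
        else:
            # Otherwise add a new clique that shares `sep` with an old one.
            cliques.append(set(sep) | {v})
            separators.append(sep)
        order.append(v)
        remaining.remove(v)

    return cliques, separators, order


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((50, 6))          # synthetic data, 6 variables
    W = np.corrcoef(X, rowvar=False) ** 2     # assumed similarity measure
    cliques, separators, order = greedy_clique_tree(W, max_size=3)
    print(cliques, separators, order)
```

Because every vertex is connected, at the moment it is added, only to vertices that already form a complete subgraph, the reverse of `order` is a perfect elimination ordering of the resulting graph.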