Recommender systems learn from users' historical feedback, which is often non-uniformly distributed across items. As a consequence, these systems may progressively suggest popular items more often than niche items, even when the latter would be of interest to users. This can hamper several core qualities of the recommended lists (e.g., novelty, coverage, diversity), and in turn the future success of the underlying platform. In this paper, we formalize two novel metrics that quantify how equally a recommender system treats items along the popularity tail. The first encourages an equal probability of being recommended across items, while the second encourages equal true positive rates across items. We characterize the recommendations of representative algorithms by means of the proposed metrics, and we show that both the probability of an item being recommended and the item true positive rate are biased with respect to item popularity. To promote a more equal treatment of items along the popularity tail, we propose an in-processing approach that minimizes the correlation between user-item relevance and item popularity. Extensive experiments show that, with small losses in accuracy, our popularity-mitigation approach leads to important gains in beyond-accuracy recommendation quality.
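
To make the two notions of equal treatment concrete, the following is a minimal sketch and not the formal definitions given in the paper. It assumes that per-item equality can be summarized as one minus the Gini index of the per-item values, and that the in-processing term can be approximated by the absolute Pearson correlation between relevance scores and item popularity added to an accuracy loss. All function names, the `weight` parameter, and the toy data are hypothetical.

```python
from collections import Counter


def gini(values):
    """Gini index of non-negative values (0 = perfectly equal)."""
    xs = sorted(values)
    n, total = len(xs), sum(xs)
    if n == 0 or total == 0:
        return 0.0
    weighted = sum((rank + 1) * x for rank, x in enumerate(xs))
    return (2 * weighted) / (n * total) - (n + 1) / n


def recommendation_probability(top_k, items):
    """Per item: fraction of users whose recommended list contains it."""
    counts = Counter(i for recs in top_k.values() for i in set(recs))
    return {i: counts[i] / len(top_k) for i in items}


def true_positive_rate(top_k, relevant, items):
    """Per item: among users for whom it is relevant, the share who received it."""
    hits, support = Counter(), Counter()
    for user, rel in relevant.items():
        recs = set(top_k.get(user, ()))
        for i in rel:
            support[i] += 1
            hits[i] += 1 if i in recs else 0
    # Items that are relevant to no user are skipped: their rate is undefined.
    return {i: hits[i] / support[i] for i in items if support[i] > 0}


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return 0.0 if sx == 0 or sy == 0 else cov / (sx * sy)


def regularized_loss(accuracy_loss, scores, popularity, weight=0.1):
    """Assumed form: accuracy loss plus a penalty on score-popularity correlation."""
    return accuracy_loss + weight * abs(pearson(scores, popularity))


# Hypothetical toy data: top-2 lists and held-out relevant items per user.
items = ["a", "b", "c", "d"]
top_k = {"u1": ["a", "b"], "u2": ["a", "b"], "u3": ["a", "c"], "u4": ["a", "b"]}
relevant = {"u1": ["a", "d"], "u2": ["b", "c"], "u3": ["c"], "u4": ["a", "d"]}

rec_prob = recommendation_probability(top_k, items)
tpr = true_positive_rate(top_k, relevant, items)

# Higher is more equal; 1.0 would mean every item is treated identically.
equal_recommendation = 1 - gini(rec_prob.values())
equal_true_positive = 1 - gini(tpr.values())
print(rec_prob, equal_recommendation)
print(tpr, equal_true_positive)
```

In this toy case item "a" appears in every list while item "d" is never recommended, even though two users find it relevant, so both equality scores fall well below 1. In a real training loop the correlation penalty would have to be computed on differentiable model outputs; the pure-Python version here only illustrates the quantity being penalized.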