Equivalence of the Empirical Risk Minimization to Regularization on the Family of f-Divergences

doi:10.48550/arXiv.2402.00501

The solution to empirical risk minimization with $f$-divergence regularization (ERM-$f$DR) is presented under mild conditions on $f$. Under such conditions, the optimal measure is shown to be unique. Examples of the solution for particular choices of the function $f$ are presented. Previously known solutions to common regularization choices are obtained by leveraging the flexibility of the family of $f$-divergences. These include the unique solutions to empirical risk minimization with relative entropy regularization (Type-I and Type-II). The analysis of the solution unveils the following properties of $f$-divergences when used in the ERM-$f$DR problem: (i) $f$-divergence regularization forces the support of the solution to coincide with the support of the reference measure, which introduces a strong inductive bias that dominates the evidence provided by the training data; and (ii) any $f$-divergence regularization is equivalent to a different $f$-divergence regularization with an appropriate transformation of the empirical risk function.
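As an illustration of the support property, the following minimal sketch covers only the relative entropy (Type-I) case on a finite set of models, where the regularized solution takes the familiar Gibbs form $P^\star(\theta) \propto Q(\theta)\exp(-\mathsf{L}(\theta)/\lambda)$. The function name `gibbs_solution`, the toy risks, the reference measure, and the value of the regularization factor are assumptions made for this example; the general $f$-divergence solution of the paper is not reproduced here.

```python
import math

def gibbs_solution(risks, reference, lam):
    """Hypothetical helper: minimize sum_i p_i * risks[i] + lam * KL(p || reference)
    over probability vectors p on a finite set (relative entropy case only)."""
    # Shift by the smallest risk on the support of the reference for numerical stability.
    shift = min(r for r, q in zip(risks, reference) if q > 0)
    # Unnormalized weights q_i * exp(-L_i / lam); zero wherever the reference is zero.
    weights = [q * math.exp(-(r - shift) / lam) if q > 0 else 0.0
               for r, q in zip(risks, reference)]
    total = sum(weights)
    return [w / total for w in weights]

# Toy example (assumed values): model 0 has the smallest empirical risk,
# but the reference measure assigns it zero mass.
risks = [0.0, 0.5, 1.0]
reference = [0.0, 0.5, 0.5]
p = gibbs_solution(risks, reference, lam=0.1)
print(p)
```

In this toy case the model with the lowest empirical risk receives zero probability because it lies outside the support of the reference measure, while the remaining mass concentrates on the lower-risk model within that support.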

Publication: arXiv e-prints

Pub Date: February 2024

DOI: 10.48550/arXiv.2402.00501

arXiv: arXiv:2402.00501

Bibcode: 2024arXiv240200501D

Keywords: Statistics - Machine Learning; Computer Science - Information Theory; Computer Science - Machine Learning

E-Print: Submitted to the IEEE Symposium in Information Theory 2024. arXiv admin note: text overlap with arXiv:2306.07123
