Learning Modular Structures That Generalize Out-of-Distribution
Abstract
Out-of-distribution (O.O.D.) generalization remains a key challenge for real-world machine learning systems. We describe a method for O.O.D. generalization that, through training, encourages models to preserve only those features in the network that are reliably reused across multiple training domains. Our method combines two complementary neuron-level regularizers with a probabilistic differentiable binary mask over the network to extract a modular sub-network that achieves better O.O.D. performance than the original network. Preliminary evaluation on two benchmark datasets corroborates the promise of our method.
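The abstract does not spell out how the probabilistic differentiable binary mask is parameterized; the sketch below is one common way to realize such a mask, a per-neuron Gumbel-Sigmoid (relaxed Bernoulli) gate trained jointly with the network and thresholded afterwards to extract a sub-network. The class name `NeuronMask`, the temperature value, and the 0.5 threshold are illustrative assumptions, not details taken from the paper.

```python
# A minimal sketch (not the authors' code): a differentiable binary mask over a
# layer's neurons using a Gumbel-Sigmoid relaxation. The mask logits are trained
# with the rest of the network; at evaluation time the mask is hard-thresholded.
import torch
import torch.nn as nn

class NeuronMask(nn.Module):
    def __init__(self, num_neurons: int, temperature: float = 0.5):
        super().__init__()
        # One learnable logit per neuron; sigmoid(logit) is its keep-probability.
        self.logits = nn.Parameter(torch.zeros(num_neurons))
        self.temperature = temperature

    def forward(self, activations: torch.Tensor) -> torch.Tensor:
        if self.training:
            # Relaxed Bernoulli sample: differentiable w.r.t. the mask logits.
            u = torch.rand_like(self.logits).clamp(1e-6, 1 - 1e-6)
            noise = torch.log(u) - torch.log(1 - u)
            mask = torch.sigmoid((self.logits + noise) / self.temperature)
        else:
            # Keep only neurons whose keep-probability exceeds 0.5,
            # yielding a hard binary sub-network mask.
            mask = (torch.sigmoid(self.logits) > 0.5).float()
        return activations * mask
```

In a full training loop one would typically add the paper's neuron-level regularizers (and, commonly, a sparsity penalty on the keep-probabilities) to the task loss, then read off the modular sub-network from the thresholded mask.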
- Publication:
- arXiv e-prints
- Pub Date:
- August 2022
- DOI:
- 10.48550/arXiv.2208.03753
- arXiv:
- arXiv:2208.03753
- Bibcode:
- 2022arXiv220803753A
- Keywords:
- Computer Science - Machine Learning;
- Computer Science - Artificial Intelligence
- E-Print:
- Accepted at AAAI 2022 Student Abstract and Poster Program