Methods and Materials: We investigated the transferability of neural network-based de-identification systems with and without domain generalization. We used two domain generalization approaches: Joint-Domain Learning (JDL), a novel approach developed in this paper, and Common-Specific Decomposition (CSD), a state-of-the-art approach from the literature. First, we measured transferability from a single external source. Second, we used two external sources and evaluated whether domain generalization can improve the transferability of de-identification models across domains that represent different note types from the same institution. Third, using two external sources together with in-domain training data, we studied whether external source data are useful even when sufficient in-domain training data are available. Finally, we investigated the transferability of the de-identification models across institutions.

Results and Conclusions: We found that transferability from a single external source gave inconsistent results. Using additional external sources consistently yielded an F1-score of approximately 80%, but domain generalization did not always improve transferability. We also found that external sources were useful even when in-domain training data were available, either by reducing the amount of in-domain training data needed or by improving performance. Transferability across institutions differed by note type and annotation label. External sources from a different institution were also useful for further improving performance.