Physics-semantic Data Augmentation via Disentangled Representation Learning
Abstract
Data-driven techniques, particularly deep learning, have demonstrated compelling results on scientific problems. Sufficient, well-processed data is a de facto prerequisite for these approaches to succeed. This requirement is exacerbated by the high cost of data acquisition through complicated physical experiments, instruments, and simulations, motivating two desiderata: representation learning and data augmentation. In this work, we employ disentangled representation learning with generative models (a Variational Autoencoder and a GAN) to achieve both. On one hand, a disentangled representation learns independent generative factors of the data, which can be interpreted as physics semantics; on the other hand, the generative model achieves data augmentation by sampling, while remaining adaptive to the learned disentangled features. We demonstrate results on the Kimberlina CO2 Leakage Dataset, showing disentangled representations, reconstructed images, and sampled images for augmentation. We also provide baselines of prediction performance with augmented data across various physics semantics.
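The abstract's augmentation idea can be illustrated by a minimal sketch: encode a sample into a disentangled latent space, perturb a single latent factor (a candidate "physics semantic"), and decode the result to obtain new variants. Everything below is hypothetical and illustrative: the linear encoder/decoder weights stand in for trained VAE networks, and the dimensions, function names, and offsets are not from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: a 64-value flattened sample and 4 latent factors.
D, Z = 64, 4

# Toy linear maps standing in for trained encoder/decoder networks.
W_enc_mu = rng.normal(scale=0.1, size=(Z, D))
W_enc_logvar = rng.normal(scale=0.1, size=(Z, D))
W_dec = rng.normal(scale=0.1, size=(D, Z))

def encode(x):
    # Encoder outputs the mean and log-variance of the latent posterior.
    return W_enc_mu @ x, W_enc_logvar @ x

def reparameterize(mu, logvar):
    # VAE reparameterization trick: z = mu + sigma * eps, eps ~ N(0, I).
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * logvar) * eps

def decode(z):
    return W_dec @ z

def augment(x, factor_idx, offsets):
    """Traverse one disentangled latent factor to generate variants of x."""
    mu, logvar = encode(x)
    variants = []
    for delta in offsets:
        z = reparameterize(mu, logvar)
        z[factor_idx] += delta  # vary one semantic factor in isolation
        variants.append(decode(z))
    return np.stack(variants)

x = rng.standard_normal(D)
aug = augment(x, factor_idx=2, offsets=[-2.0, 0.0, 2.0])
print(aug.shape)  # (3, 64)
```

In a trained model the traversal along `factor_idx` would produce samples that differ only in the corresponding interpretable factor, which is what makes the augmented data adaptive to the learned disentangled features.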
- Publication: AGU Fall Meeting Abstracts
- Pub Date: December 2021
- Bibcode: 2021AGUFMIN25A0448D