Analysis of multiple data sequences with different distributions: defining common principal component axes by ergodic sequence generation and multiple reweighting composition
Abstract
Principal component analysis (PCA) defines a reduced space, spanned by PC axes, that captures the variations of a given multidimensional data sequence. In practice, we need multiple data sequences that accurately obey individual probability distributions, and for a fair comparison of the sequences we need PC axes that are common to all of them yet properly capture these multiple distributions. To meet these requirements, we present individual ergodic samplings for these sequences and provide a special reweighting for recovering the target distributions.
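The idea of common PC axes for several reweighted sequences can be illustrated with a generic pooled, weighted PCA. The following is a minimal sketch and not the authors' method: the function name `common_pc_axes`, the per-sequence weight normalization, and the synthetic data and weights are all assumptions made for illustration, since the abstract does not specify the sampling or reweighting scheme.

```python
import numpy as np

def common_pc_axes(sequences, weights):
    """Pooled, weighted PCA over several data sequences (illustrative only).

    sequences: list of (n_k, d) arrays, one per data sequence
    weights:   list of (n_k,) non-negative reweighting factors (hypothetical)
    """
    X = np.vstack(sequences)
    # Assumption: normalize weights within each sequence so that every
    # sequence contributes equally to the pooled statistics.
    w = np.concatenate([np.asarray(wk, dtype=float) / np.sum(wk) for wk in weights])
    w /= w.sum()
    mean = w @ X                       # weighted mean over all sequences
    Xc = X - mean
    cov = (Xc * w[:, None]).T @ Xc     # weighted covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]  # sort by descending variance
    return mean, eigvals[order], eigvecs[:, order]

# Synthetic example data (not from the paper).
rng = np.random.default_rng(0)
seq_a = rng.normal(size=(500, 3)) * np.array([3.0, 1.0, 0.3])
seq_b = rng.normal(size=(400, 3)) * np.array([2.0, 1.5, 0.3]) + np.array([1.0, 0.0, 0.0])
w_a = np.ones(500)                     # uniform weights for sequence A
w_b = np.exp(-0.1 * seq_b[:, 0])       # made-up reweighting factor for sequence B

mean, variances, axes = common_pc_axes([seq_a, seq_b], [w_a, w_b])

# Project both sequences onto the same two leading axes for comparison.
proj_a = (seq_a - mean) @ axes[:, :2]
proj_b = (seq_b - mean) @ axes[:, :2]
```

Because both projections use the same `mean` and `axes`, the two sequences can be compared in one reduced space; the weights only affect how the axes are estimated.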
- Publication:
- IOP SciNotes
- Pub Date:
- September 2021
- DOI:
- arXiv:
- arXiv:2104.08141
- Bibcode:
- 2021IOPSN...2c5201F
- Keywords:
- principal component analysis;
- multiple data sequences;
- molecular dynamics;
- ergodic sampling;
- reweighting;
- Boltzmann-Gibbs distribution;
- statistical analysis;
- Statistics - Methodology;
- Physics - Biological Physics