Sequential Learning of Principal Curves: Summarizing Data Streams on the Fly
Abstract
When confronted with massive data streams, summarizing data with dimension reduction methods such as PCA runs into theoretical and algorithmic pitfalls. A principal curve acts as a nonlinear generalization of PCA, and the present paper proposes a novel algorithm to automatically and sequentially learn principal curves from data streams. We show that our procedure is supported by regret bounds with optimal sublinear remainder terms. A greedy local search implementation (called slpc, for sequential learning principal curves) that incorporates both sleeping experts and multi-armed bandit ingredients is presented, along with its regret computation and its performance on synthetic and real-life data.
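To make the notion of regret in the abstract concrete, here is a minimal sketch, not the slpc algorithm itself. It assumes that curves are represented as polylines (arrays of vertices), that the loss of a curve on a point is the squared Euclidean distance from the point to the curve, and that the comparison class is a small finite list of candidate curves. The function names `sq_dist_to_polyline` and `cumulative_regret` are hypothetical and introduced only for illustration.

```python
import numpy as np

def sq_dist_to_polyline(x, vertices):
    """Squared Euclidean distance from point x to the polyline through `vertices`."""
    x = np.asarray(x, dtype=float)
    v = np.asarray(vertices, dtype=float)
    a, b = v[:-1], v[1:]                       # start and end point of each segment
    d = b - a
    denom = np.einsum("ij,ij->i", d, d)        # squared length of each segment
    denom = np.where(denom > 0, denom, 1.0)    # guard against zero-length segments
    # Position of the orthogonal projection along each segment, clipped to [0, 1].
    t = np.clip(np.einsum("ij,ij->i", x - a, d) / denom, 0.0, 1.0)
    proj = a + t[:, None] * d
    return float(np.min(np.sum((proj - x) ** 2, axis=1)))

def cumulative_regret(stream, predicted, candidates):
    """Learner's cumulative loss minus that of the best fixed candidate curve.

    Assumption: `predicted[t]` is the curve chosen before observing `stream[t]`,
    and `candidates` is a finite comparison class (a simplification).
    """
    learner = sum(sq_dist_to_polyline(x, f) for x, f in zip(stream, predicted))
    best = min(sum(sq_dist_to_polyline(x, f) for x in stream) for f in candidates)
    return learner - best

# Two candidate curves (horizontal segments) and a stream that sits on the second.
f0 = [(-1.0, 0.0), (1.0, 0.0)]
f1 = [(-1.0, 1.0), (1.0, 1.0)]
stream = [(0.0, 1.0), (0.0, 1.0)]
regret = cumulative_regret(stream, predicted=[f0, f1], candidates=[f0, f1])
```

In this toy run the learner pays a loss only in the first round, before switching to the better curve. A regret bound with a sublinear remainder term says that this kind of gap, divided by the number of rounds, vanishes as the stream grows.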
- Publication: Entropy
- Pub Date: November 2021
- DOI: 10.3390/e23111534
- arXiv: arXiv:1805.07418
- Bibcode: 2021Entrp..23.1534L
- Keywords: sequential learning; principal curves; data streams; regret bounds; greedy algorithm; sleeping experts; Statistics - Machine Learning; Computer Science - Machine Learning; Mathematics - Statistics Theory
- E-Print: Entropy 2021