SPEED: Scalable Preprocessing of EEG Data for Self-Supervised Learning

doi:10.48550/arXiv.2408.08065

Electroencephalography (EEG) research typically focuses on tasks with narrowly defined objectives, but recent studies are expanding into the use of unlabeled data within larger models, aiming for a broader range of applications. This shift addresses a critical challenge in EEG research. For example, Kostas et al. (2021) show that self-supervised learning (SSL) outperforms traditional supervised methods. Given the high noise levels in EEG data, we argue that further improvements are possible with additional preprocessing. Current preprocessing methods often fail to efficiently manage the large data volumes required for SSL because they lack optimization, rely on subjective manual corrections and validation processes, or follow inflexible protocols that limit SSL. We propose a Python-based EEG preprocessing pipeline optimized for self-supervised learning and designed to process large-scale data efficiently. This optimization not only stabilizes self-supervised training but also improves performance on downstream tasks compared to training with raw data.

Publication: arXiv e-prints

Pub Date: August 2024

DOI: 10.48550/arXiv.2408.08065

arXiv: arXiv:2408.08065

Bibcode: 2024arXiv240808065G

Keywords: Electrical Engineering and Systems Science - Signal Processing; Computer Science - Artificial Intelligence

E-Print: To appear in proceedings of 2024 IEEE International Workshop on Machine Learning for Signal Processing
