Connecting Satellites to Genes; a new machine learning approach to observe phytoplankton diversity in the global ocean
Abstract
Phytoplankton constitute a key component of marine ecosystem biodiversity. Observing this biodiversity requires detailed methods of morphological, chemical and molecular analysis. Satellite remote sensing also makes it possible to resolve the spatio-temporal variability of the main phytoplankton groups at high resolution on a global scale. Between 2009 and 2013, the Tara Oceans circumglobal expedition traveled the ocean to measure a global ecosystem for the first time, embracing its complexity, from viruses to animals and from genes to the structures of organisms. The Tara Oceans database contains several types of information on phytoplankton biodiversity, specifically molecular and genomic data, and on the abiotic factors that govern the distribution of phytoplankton in the global ocean. Combining Tara Oceans data with satellite data, together with new data processing and artificial intelligence methods, promises to increase our knowledge of the marine ecosystem. Thus, our research aims to "describe the spatio-temporal variability on a global scale of the phytoplankton community and its biodiversity using genomics and satellite observations". The genomic and imaging data collected during Tara Oceans were used to develop a new methodology to identify phytoplankton groups from satellite data (GlobColour data), using self-organizing maps (SOM), a neural classifier and data interpolation algorithm. This approach will serve to deepen our knowledge of phytoplankton groups, their spatial and temporal boundaries and their diversity, and ultimately to link genomic information with the observation of the ocean from satellites. The classification methodology is based on the association between a satellite signal and the abundance of phytoplankton observed by genomic methods from Tara Oceans, and it includes the phytoplankton pigment composition, which reflects the complexity of the phytoplankton community.
After cross-validating this methodology using only satellite data to estimate the abundance of the phytoplankton groups, we obtain a performance that varies between 30% and 80% of good estimates, depending on the phytoplankton group under study.
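The association between a satellite signal and group abundance through a self-organizing map can be illustrated with a minimal sketch. This is not the authors' implementation: the map size, learning schedule, function names (`train_som`, `label_units`, `predict`) and input arrays are all assumptions chosen for illustration, and the abstract does not specify how the actual SOM was configured. The sketch trains a small SOM on satellite-like feature vectors, attaches to each map unit the mean abundance of the co-located samples that fall on it, and then estimates abundance for a new satellite vector from its best-matching unit.

```python
import numpy as np


def train_som(data, rows=4, cols=4, n_iter=500, lr0=0.5, sigma0=2.0, seed=0):
    """Train a small rectangular SOM; returns weights of shape (rows*cols, n_features)."""
    rng = np.random.default_rng(seed)
    # Initialise unit weights from randomly chosen training samples.
    weights = data[rng.integers(0, len(data), rows * cols)].astype(float)
    # Grid coordinates of each unit, used by the neighbourhood function.
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)                # learning rate decays linearly
        sigma = sigma0 * (1.0 - frac) + 0.5    # neighbourhood radius shrinks
        x = data[rng.integers(0, len(data))]   # one random training sample
        bmu = np.argmin(((weights - x) ** 2).sum(axis=1))
        d2 = ((grid - grid[bmu]) ** 2).sum(axis=1)
        h = np.exp(-d2 / (2.0 * sigma ** 2))   # Gaussian neighbourhood weights
        weights += lr * h[:, None] * (x - weights)
    return weights


def best_matching_units(weights, data):
    """Index of the closest SOM unit for every row of data."""
    d = ((data[:, None, :] - weights[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)


def label_units(weights, sat, abundance):
    """Mean abundance of the samples mapped to each unit (NaN for empty units)."""
    bmu = best_matching_units(weights, sat)
    out = np.full((weights.shape[0], abundance.shape[1]), np.nan)
    for u in range(weights.shape[0]):
        mask = bmu == u
        if mask.any():
            out[u] = abundance[mask].mean(axis=0)
    return out


def predict(weights, unit_abundance, sat):
    """Estimate abundance for satellite vectors from their best-matching unit."""
    return unit_abundance[best_matching_units(weights, sat)]
```

In this sketch, `sat` would hold one row of satellite-derived features per sample and `abundance` one row of group abundances for the same samples; both are hypothetical stand-ins for the GlobColour and Tara Oceans data. A cross-validation along the lines described above would repeat the training and labelling steps on a subset of the samples and compare `predict` against the held-out abundances.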
- Publication: AGU Fall Meeting Abstracts
- Pub Date: December 2020
- Bibcode: 2020AGUFMOS0230002E
- Keywords:
  - 0555 Neural networks, fuzzy logic, machine learning; COMPUTATIONAL GEOPHYSICS
  - 4262 Ocean observing systems; OCEANOGRAPHY: GENERAL
  - 4299 General or miscellaneous; OCEANOGRAPHY: GENERAL
  - 4532 General circulation; OCEANOGRAPHY: PHYSICAL