Evaluation of Machine Learning Algorithms for Classification of Primary Biological Aerosol using a new UV-LIF spectrometer
Abstract
Characterisation of bio-aerosols has important implications for the environment and public health sectors. Recent developments in Ultra-Violet Light Induced Fluorescence (UV-LIF) detectors, such as the Wideband Integrated bio-aerosol Spectrometer (WIBS) and the newly introduced Multiparameter bio-aerosol Spectrometer (MBS), have allowed the real-time collection of fluorescence, size and morphology measurements for the purpose of discriminating between bacteria, fungal spores and pollen. This new generation of instruments has enabled ever-larger data sets to be compiled with the aim of studying more complex environments, yet the algorithms used for species classification remain largely unvalidated. It is therefore imperative to validate the performance of the different algorithms that can be used for classification, which is the focus of this study. For unsupervised learning we test Hierarchical Agglomerative Clustering with various linkages. For supervised learning, ten methods were tested: decision trees; the ensemble methods Random Forests, Gradient Boosting and AdaBoost; two implementations of support vector machines, libsvm and liblinear; the Gaussian methods Gaussian naïve Bayes, quadratic discriminant analysis and linear discriminant analysis; and finally the k-nearest neighbours algorithm. The methods were applied to two different data sets measured using the new MBS. We find that clustering, in general, performs slightly worse than the supervised learning methods, correctly classifying, at best, only 72.7 and 91.1 percent for the two data sets respectively. For supervised learning, the gradient boosting algorithm was found to be the most effective, on average correctly classifying 88.1 and 97.8 percent of the testing data for the two data sets respectively. We discuss the wider relevance of these results with regard to challenging existing classification in real-world environments.
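The kind of comparison described in the abstract can be illustrated with a minimal sketch. It assumes scikit-learn (whose `SVC` and `LinearSVC` wrap libsvm and liblinear respectively) and uses synthetic blobs as a stand-in for labelled particle measurements. The feature count, class count, hyperparameters and the majority-label scoring of clusters are all assumptions made for illustration, not the setup used in the study, and only a subset of the ten supervised methods is shown.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering
from sklearn.datasets import make_blobs
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC, LinearSVC

# Synthetic stand-in for labelled measurements (NOT real MBS data):
# 3 hypothetical particle classes described by 8 hypothetical features.
X, y = make_blobs(n_samples=600, centers=3, n_features=8,
                  cluster_std=3.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

# A subset of the supervised methods named in the abstract,
# with default or arbitrary hyperparameters (assumption).
classifiers = {
    "gradient boosting": GradientBoostingClassifier(random_state=0),
    "random forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "SVM (libsvm)": SVC(),
    "SVM (liblinear)": LinearSVC(max_iter=5000),
    "k-nearest neighbours": KNeighborsClassifier(),
}

# Fit on the training split, then record the fraction of the
# held-out testing data that is classified correctly.
supervised_scores = {}
for name, clf in classifiers.items():
    clf.fit(X_train, y_train)
    supervised_scores[name] = clf.score(X_test, y_test)


def cluster_accuracy(labels_true, labels_pred):
    """Fraction correct when each cluster takes its majority true label.

    This scoring rule is an assumption; the abstract does not say how
    clusters were matched to classes.
    """
    correct = 0
    for c in np.unique(labels_pred):
        members = labels_true[labels_pred == c]
        correct += np.bincount(members).max()
    return correct / len(labels_true)


# Hierarchical agglomerative clustering with several linkages,
# scored against the known labels after clustering.
clustering_scores = {}
for linkage in ("ward", "complete", "average", "single"):
    model = AgglomerativeClustering(n_clusters=3, linkage=linkage)
    clustering_scores[linkage] = cluster_accuracy(y, model.fit_predict(X))

for name, score in {**supervised_scores, **clustering_scores}.items():
    print(f"{name}: {score:.3f}")
```

Scores from this sketch reflect only the synthetic data and say nothing about the percentages reported above. The point is the structure of the evaluation: supervised methods are scored on held-out testing data, while clustering is run without labels and compared with them afterwards.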
- Publication: AGU Fall Meeting Abstracts
- Pub Date: December 2016
- Bibcode: 2016AGUFM.A33E0281R
- Keywords:
  - 0317 Chemical kinetic and photochemical properties; ATMOSPHERIC COMPOSITION AND STRUCTURE
  - 0340 Middle atmosphere: composition and chemistry; ATMOSPHERIC COMPOSITION AND STRUCTURE
  - 0365 Troposphere: composition and chemistry; ATMOSPHERIC COMPOSITION AND STRUCTURE
  - 0394 Instruments and techniques; ATMOSPHERIC COMPOSITION AND STRUCTURE