Source Attribution of VOCs in the Canadian Oil Sands using Hierarchical Clustering
Abstract
Recent advances in analysis methodology for volatile organic compounds (VOCs) provide individual spectra at one-second intervals, in turn allowing airborne observations of unprecedented detail for enclosure flights around petrochemical facilities. However, determining the extent to which a facility, or a process within a facility, can be uniquely identified is potentially difficult because of the large amounts of data that airborne studies produce.
We use hierarchical clustering, a dissimilarity analysis method, to determine the extent to which a large number of VOC spectra can be attributed to specific petrochemical facilities and activities. Spectra were collected using a Proton Transfer Mass Spectrometer and an Iodide-Chemical Ionization Mass Spectrometer during 30 flights in the Oil Sands 2018 study, which included both winter (early April) and summer (May through July) periods sampling petrochemical emissions from the Canadian Oil Sands. A highly parallelized version of a hierarchical clustering code, running on a Cray supercomputer, was used to analyze the approximately 486,000 spectra per instrument. The metric of dissimilarity was (1 - R) × EuD, where R is the Pearson correlation coefficient and EuD is the Euclidean distance, so that the metric combines the similarity of the spectra's shapes with the magnitude of their components. The resulting clusters, across flights and within specific flights and at different thresholds of the metric, were mapped onto flight trajectories to differentiate source-specific plumes. Results were compared to wind fields and plumes simulated by an air-quality model (GEM-MACH) to aid in potential source attribution. Spectra contained within observed plumes that matched model plumes in time and location identified some sources. In other cases, where no model plume matched the observations, the spatial location of the spectra, along with modelled and observed wind fields, was used to infer possible sources. The work highlights the use of a massively parallel clustering code in analyzing very large geoscientific information datasets. Other possible applications of this methodology will be discussed.
- Publication:
- AGU Fall Meeting Abstracts
- Pub Date:
- December 2019
- Bibcode:
- 2019AGUFM.A43M2918M
- Keywords:
- 0305 Aerosols and particles;
- ATMOSPHERIC COMPOSITION AND STRUCTURE;
- 0345 Pollution: urban and regional;
- ATMOSPHERIC COMPOSITION AND STRUCTURE;
- 3315 Data assimilation;
- ATMOSPHERIC PROCESSES;
- 3360 Remote sensing;
- ATMOSPHERIC PROCESSES