7–11 Oct 2024
University of Nova Gorica, Lanthieri mansion, Vipava, Slovenia
Europe/Ljubljana timezone

Upgrading Particulate matter Source apportIonment through Data sciencE

7 Oct 2024, 16:30
15m
University of Nova Gorica, Lanthieri mansion, Vipava, Slovenia


Oral presentation Contributing talks

Speaker

MARTA VIA GONZALEZ (UNG-CAR)

Description

Particulate matter (PM) has severe impacts on human health and climate, and these impacts vary with particle size and composition (Yang et al., 2018; Daellenbach et al., 2020). Source apportionment (SA) is the process of identifying ambient air pollution sources and quantifying their contributions to pollution levels, and it is usually conducted with receptor models (RMs). These models typically decompose the measurements into products of fingerprints (or profiles) and time series, which represent, respectively, the chemical composition and the time evolution of each PM source.
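
As a hedged illustration of the decomposition described above, the bilinear form commonly used for receptor models can be written as follows. The symbols are assumptions introduced here, since the abstract does not define any notation: x_{ij} is the measured concentration of species j at time i, g_{ik} is the contribution of source k at time i, f_{kj} is the share of species j in the profile of source k, e_{ij} is the residual, and p is the number of sources. Contributions and profiles are typically constrained to be non-negative.

```latex
x_{ij} = \sum_{k=1}^{p} g_{ik}\, f_{kj} + e_{ij},
\qquad g_{ik} \ge 0, \quad f_{kj} \ge 0
```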

The most widely used RM is the Positive Matrix Factorisation (PMF) algorithm (Paatero and Tapper, 1994), although new methodologies are being developed. For instance, the novel Bayesian auto-correlated matrix factorisation method (BAMF; Rusanen et al., 2024) includes an auto-correlation term that emulates the time evolution of real-world pollutant sources, and it produces more accurate results than PMF. However, both PMF and BAMF struggle to provide well-separated profiles, which in turn leads to mixed time series contributions. The UPSIDE project (Upgrading Particulate matter Source apportIonment through Data sciencE) aims to reduce profile separation difficulties in RMs through data science techniques.
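
To make the factorisation idea concrete, the following is a minimal sketch in Python with NumPy, not the PMF or BAMF implementation. It minimises the uncertainty-weighted squared residual that PMF also targets, but it does so with simple multiplicative updates from weighted non-negative matrix factorisation, which is a swapped-in technique chosen for brevity. The function names, the toy profiles, and the constant uncertainty of 0.05 are all illustrative assumptions.

```python
import numpy as np

def weighted_q(X, G, F, sigma):
    """Sum of squared residuals, each scaled by its measurement uncertainty."""
    return float(np.sum(((X - G @ F) / sigma) ** 2))

def weighted_nmf(X, sigma, n_sources, n_iter=500, seed=0):
    """Non-negative factorisation X ~ G @ F via weighted multiplicative updates.

    Illustrative stand-in only: real PMF solvers use different optimisers.
    Returns the time series G, the profiles F, and the starting value of Q.
    """
    rng = np.random.default_rng(seed)
    n_times, n_species = X.shape
    G = rng.uniform(0.1, 1.0, size=(n_times, n_sources))   # source time series
    F = rng.uniform(0.1, 1.0, size=(n_sources, n_species))  # source profiles
    W = 1.0 / sigma**2  # weights from the measurement uncertainties
    eps = 1e-12         # guards against division by zero
    q_start = weighted_q(X, G, F, sigma)
    for _ in range(n_iter):
        G *= ((W * X) @ F.T) / ((W * (G @ F)) @ F.T + eps)
        F *= (G.T @ (W * X)) / (G.T @ (W * (G @ F)) + eps)
    return G, F, q_start

# Toy data (hypothetical): two sources, four species, 200 time steps.
rng = np.random.default_rng(42)
true_F = np.array([[0.7, 0.2, 0.1, 0.0],
                   [0.0, 0.1, 0.3, 0.6]])
true_G = rng.gamma(shape=2.0, scale=1.0, size=(200, 2))
sigma = np.full((200, 4), 0.05)
noise = rng.normal(0.0, 0.05, size=(200, 4))
X = np.clip(true_G @ true_F + noise, 0.0, None)  # keep the data non-negative

G, F, q_start = weighted_nmf(X, sigma, n_sources=2)
q_end = weighted_q(X, G, F, sigma)
```

Comparing `q_end` with `q_start` shows how far the fit has improved. The recovered rows of `F` are only defined up to scaling and ordering, which is one face of the profile separation problem mentioned above.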

To improve the profiles, a sparsity-inducing approach named horseshoe regularisation (Piironen and Vehtari, 2017) will be applied to BAMF. The horseshoe prior encourages some parameters to be close to zero while allowing others to take large values. This reduces the effective dimensionality of the problem by scaling down the non-significant species in each profile. In this way, profiles are expected to be less noisy and, thus, to portray the nature of the atmospheric pollution sources more faithfully.
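
The behaviour of the horseshoe prior can be sketched by sampling from it. The example below is a minimal illustration in Python with NumPy under stated assumptions: it draws generic real-valued coefficients rather than BAMF profile elements, and the global scale `tau`, the slab scale `c`, and all variable names are hypothetical choices. It follows the regularised horseshoe form, in which each half-Cauchy local scale is shrunk so that the effective prior scale never exceeds `c`.

```python
import numpy as np

def regularised_local_scale(lam, tau, c):
    """Regularised horseshoe local scale.

    lam_reg^2 = c^2 * lam^2 / (c^2 + tau^2 * lam^2), so that
    tau * lam_reg stays below the slab scale c however large lam is.
    """
    return np.sqrt(c**2 * lam**2 / (c**2 + tau**2 * lam**2))

rng = np.random.default_rng(0)
n_coef = 10_000   # number of coefficients to draw (illustrative)
tau = 0.1         # global scale: small values pull most coefficients to zero
c = 2.0           # slab scale: caps the size of the few large coefficients

# Half-Cauchy local scales: heavy tails let a few coefficients escape shrinkage.
lam = np.abs(rng.standard_cauchy(n_coef))
lam_reg = regularised_local_scale(lam, tau, c)

# Coefficients under the plain and the regularised horseshoe.
z = rng.standard_normal(n_coef)
beta_plain = z * tau * lam
beta_reg = z * tau * lam_reg

# Share of draws that are effectively switched off (threshold is arbitrary).
near_zero_share = float(np.mean(np.abs(beta_reg) < 0.1))
```

Most draws end up close to zero while a minority remain large, which is the pattern one would want for a profile in which only a few species are significant.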

To test the capabilities of BAMF, BAMF with horseshoe (BAMFh) will first be applied to different types of synthetic data. Synthetic aerosol datasets for evaluating source apportionment are scarce and are often too simplistic to mimic the actual patterns of atmospheric sources. The second aim of this project is therefore to generate synthetic datasets by merging real-world, chamber, and modelled time series of atmospheric sources through the Rotation-Based Iterative Gaussianisation (RBIG; Laparra et al., 2011) machine learning technique. After its training phase, RBIG will provide replicable time series of the input sources even if they stem from different databases.
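
The core loop of RBIG alternates two steps: each variable is mapped to a standard normal marginal, and the data are then rotated. The sketch below shows only this forward (training) direction in Python with NumPy, under several assumptions: a rank-based empirical CDF is used for the marginal step, PCA is used for the rotation, and the function names and toy data are hypothetical. Generating new synthetic series would additionally require storing and inverting every layer, which is not shown here.

```python
import numpy as np
from statistics import NormalDist

# Inverse CDF of the standard normal, applied element-wise.
_inv_cdf = np.vectorize(NormalDist().inv_cdf)

def marginal_gaussianise(X):
    """Map each column to a standard normal marginal via its ranks."""
    n = X.shape[0]
    ranks = X.argsort(axis=0).argsort(axis=0)  # 0..n-1 within each column
    u = (ranks + 0.5) / n                      # empirical CDF, strictly in (0, 1)
    return _inv_cdf(u)

def rbig_forward(X, n_layers=10):
    """Apply n_layers of marginal Gaussianisation followed by a PCA rotation."""
    Z = np.asarray(X, dtype=float)
    for _ in range(n_layers):
        Z = marginal_gaussianise(Z)
        cov = np.cov(Z, rowvar=False)
        _, eigvecs = np.linalg.eigh(cov)  # orthonormal principal axes
        Z = Z @ eigvecs                   # rotate onto the principal axes
    return Z

# Toy data (hypothetical): two strongly dependent, skewed variables.
rng = np.random.default_rng(1)
a = rng.gamma(2.0, 1.0, size=1000)
X = np.column_stack([a, a * rng.uniform(0.5, 1.5, size=1000)])

Z = rbig_forward(X, n_layers=10)
corr_after = float(np.corrcoef(Z, rowvar=False)[0, 1])
```

Because each layer ends with a PCA rotation, the output columns are uncorrelated by construction. How close the joint distribution is to a Gaussian depends on the data and on the number of layers, so convergence should be checked rather than assumed.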

Subsequently, BAMF with the regularised horseshoe will be applied to the generated synthetic datasets, and its parameters will be thoroughly tuned for optimal performance. The outcomes of BAMFh, BAMF, and PMF will then be compared to assess the performance of these RMs. The last phase of the project consists of applying BAMFs to real-world data. The improved determination of air pollution sources is intended to serve as input for source-dependent health and climate studies.

References

Daellenbach, K. R. et al. (2020), Nature, 587(7834), 414-419.
Laparra, V. et al. (2011), IEEE Trans. Neural Netw., 22(4), 537-549.
Paatero, P. and Tapper, U. (1994), Environmetrics, 5(2), 111-126.
Piironen, J. and Vehtari, A. (2017), Electron. J. Statist., 11(2), 5018-5051.
Rusanen, A. et al. (2024), Atmos. Meas. Tech. Discuss., 1-2828.
Yang, M. et al. (2018), Environ. Int., 120, 516-524.

Acknowledgements

This work was supported by the SMASH project (No. 101081355), funded by the European Union’s Horizon Europe research and innovation programme under a Marie Skłodowska-Curie grant, and by the ARIS programmes I0-0033 and P1-0385.

Primary author

Co-authors

Dr Griša Močnik (UNG-CAR)
Dr Kaspar Daellenbach (Paul Scherrer Institute)
