
Spatial statistical learning methods for estimating ambient air pollution

Principal Investigator: 

Utrecht University, Netherlands

Hoek and colleagues will prepare maps of modeled annual average air pollution across the Netherlands, validate the maps using new measurements from over 100 sites, and evaluate the performance of several exposure models. The investigators will conduct cross-comparisons to evaluate how different exposure assessment methods compare in their ability to predict long-term pollutant concentrations, with a particular focus on spatial variability of pollutants.

Funded under

Comparison of Long-Term Air Pollution Exposure Assessment Based on Mobile Monitoring, Low-Cost Sensors, Dispersion Modelling and Routine Monitoring-Based Models

Gerard Hoek1, Femke Bouma1, Kees Meliefste1, Ulrike Gehring1, Roel Vermeulen1, Kees de Hoogh2, Sjoerd van Ratingen3, Wouter Hendrickx3, Erik Tielemans3, Nicole Janssen3, Joost Wesseling3 
1Institute for Risk Assessment Sciences, Utrecht University, the Netherlands. 
2Swiss Tropical and Public Health Institute, Switzerland 
3National Institute for Public Health and the Environment, the Netherlands

Background: Assessment of long-term exposure to traffic-related outdoor air pollution remains a major challenge for epidemiological studies. One challenge is characterizing the spatial variation in ambient concentrations of key traffic-related air pollutants, including ultrafine particles (UFP), black carbon (BC) and NO2, because these pollutants vary on a fine spatial scale. Recent epidemiological studies have used different approaches, including land use regression (LUR) models based upon mobile monitoring (UFP, BC), models based upon low-cost sensors (PM2.5, NO2) or routine monitoring data, and increasingly sophisticated hybrid models incorporating routine surface monitoring, satellite, chemical transport and land use data. Very little information is available about the relative performance of these approaches in assessing long-term exposure to traffic-related air pollution. Differences in performance may affect the conclusions of epidemiological studies that apply different exposure assessment approaches.

The Specific Aims of the project are:

1.    Develop long-term ambient air pollution exposure estimates for selected epidemiological studies based upon low-cost sensors, mobile and stationary monitoring and deterministic dispersion modelling;
2.    Compare different exposure assessment methods in terms of their ability to predict spatial variation of long-term average concentrations using external validation data;
3.    Compare different exposure assessment methods in terms of air pollution effect estimates in selected epidemiological studies.
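As an illustration of Aim 2, the sketch below shows one way predictions could be scored against external validation measurements. The abstract does not name the metrics the project uses; the mean-squared-error-based R², the RMSE and the mean bias shown here are assumptions, as are the function name and the example values.

```python
import math


def validation_metrics(measured, predicted):
    """Return (r_squared, rmse, bias) for paired site-level annual averages.

    Hypothetical helper for illustration; not taken from the project.
    """
    if len(measured) != len(predicted) or len(measured) < 2:
        raise ValueError("need two equal-length sequences with at least 2 sites")
    n = len(measured)
    mean_obs = sum(measured) / n
    # Sum of squared prediction errors and total sum of squares of the measurements.
    sse = sum((o - p) ** 2 for o, p in zip(measured, predicted))
    sst = sum((o - mean_obs) ** 2 for o in measured)
    if sst == 0:
        raise ValueError("measured values have no spatial variation")
    r_squared = 1.0 - sse / sst   # penalises both scatter and systematic offset
    rmse = math.sqrt(sse / n)     # same units as the concentrations
    bias = sum(p - o for o, p in zip(measured, predicted)) / n  # mean(pred - obs)
    return r_squared, rmse, bias


# Made-up concentrations at four validation sites (illustrative only).
measured = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]
print(validation_metrics(measured, predicted))
```

An R² computed as 1 - SSE/SST drops when a model is systematically offset, whereas a squared correlation does not, so the two can rank exposure models differently.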

Methods: The project will generate and evaluate annual average air pollution maps using eight different exposure assessment methods, which differ in modeling approach (empirical LUR, deterministic dispersion models and hybrid models) and in the monitoring data used (low-cost sensors, mobile monitoring). For all empirical models, we will test three model development algorithms covering major families of modeling approaches: supervised linear regression, random forest and elastic net.

The predictions and performance of the models will be compared at 20,000 addresses across the Netherlands and tested on newly collected and existing external validation data. Epidemiological analyses in three cohort studies will be conducted to compare the health risk estimates obtained with the different exposure assessment methods. The studies include a national administrative cohort, a classical cohort study and a mature birth cohort, in which we will assess mortality, cardiovascular disease incidence, and lung function/asthma, respectively.
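A comparison of predictions at a shared set of addresses could, for example, start from pairwise correlations between methods. The sketch below is a minimal illustration under that assumption; the abstract does not specify the agreement statistic, and the method names and values are hypothetical.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a sequence with no variation has no defined correlation")
    return sxy / math.sqrt(sxx * syy)


def pairwise_agreement(predictions):
    """Map each pair of method names to the correlation of their predictions.

    `predictions` maps a method name to its predictions at the same addresses,
    listed in the same order for every method.
    """
    names = sorted(predictions)
    return {(a, b): pearson(predictions[a], predictions[b])
            for a, b in combinations(names, 2)}


# Hypothetical method names and made-up predictions at four addresses.
example = {
    "lur_mobile": [1.0, 2.0, 3.0, 4.0],
    "dispersion": [2.0, 4.0, 6.0, 8.0],
    "hybrid": [4.0, 3.0, 2.0, 1.0],
}
for pair, r in pairwise_agreement(example).items():
    print(pair, round(r, 3))
```

A high correlation means two methods rank addresses similarly. It says nothing about which method is closer to measured concentrations, which is the role of the external validation data.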

Results: The focus in the first year has been on the new data collection required for the low-cost sensor network and the external validation. Field measurements for both have been delayed by the COVID-19 lockdown and are planned to start in June 2021. Epidemiological analyses of UFP are currently ongoing in the mature birth cohort and the national administrative cohort, investigating allergic sensitization and mortality, respectively.

Conclusions: Because data collection has not yet started, no conclusions can be drawn at this time.