How to predict on-road air pollution based on street view images and machine learning: a quantitative analysis of the optimal strategy

Read original: arXiv:2409.12412 - Published 9/20/2024 by Hui Zhong, Di Chen, Pengqin Wang, Wenrui Wang, Shaojie Shen, Yonghong Liu, Meixin Zhu


Overview

On-road air pollution varies significantly over short distances due to emission sources, dilution, and chemical processes.
Integrating mobile monitoring data with street view images (SVIs) can help predict local air pollution.
However, the choice of algorithm, sampling strategy, and image quality each introduce errors, and there are no reliable references that quantify their effects.
This study aimed to develop a reliable strategy by employing 314 taxis to monitor pollutants and sample corresponding SVIs.

Plain English Explanation

Air pollution levels can change drastically even over short distances, depending on factors like where pollutants are released, how they mix with the surrounding air, and chemical reactions that occur. Combining real-time air pollution measurements taken by vehicles with images of the surrounding streets could help predict air pollution levels in specific locations. However, the methods used to analyze the data, the way the data is collected, and the quality of the images can all introduce errors if not done properly.

This research tried to find a reliable way to use this approach by having 314 taxis measure pollutants like nitrogen oxides and particulate matter, and pairing those measurements with street view images of the same locations. The researchers then tested different machine learning algorithms and ways of collecting and analyzing the image data to see what worked best for estimating the air pollution levels.

Technical Explanation

The researchers employed a fleet of 314 taxis to dynamically monitor levels of nitrogen oxides (NO, NO2) and particulate matter (PM2.5, PM10). For the same locations, they sampled about 382,000 street view images (SVIs) at four angles (0°, 90°, 180°, 270°) and within buffers of five radii (100m, 200m, 300m, 400m, 500m).
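The sampling design can be pictured with a small sketch. This is not the authors' pipeline: the function names, the dictionary layout of an SVI point, and the use of great-circle distance to decide buffer membership are all assumptions made for illustration. Only the angles and radii come from the summary.

```python
import math

ANGLES_DEG = (0, 90, 180, 270)               # viewing angles listed in the summary
BUFFER_RADII_M = (100, 200, 300, 400, 500)   # buffer radii listed in the summary

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def svis_in_buffer(site, svi_points, radius_m):
    """Return the SVI points within radius_m of a monitoring site.

    site is a (lat, lon) tuple; each SVI point is a dict with "lat" and "lon"
    keys (a hypothetical layout, not taken from the paper).
    """
    return [p for p in svi_points
            if haversine_m(site[0], site[1], p["lat"], p["lon"]) <= radius_m]


def sampling_configs():
    """Enumerate every (angle, radius) combination: 4 angles x 5 radii."""
    return [(angle, radius) for angle in ANGLES_DEG for radius in BUFFER_RADII_M]
```

Calling `svis_in_buffer(site, points, 100)` with the smallest radius corresponds to the buffer the study found optimal; larger radii simply admit more distant images.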

They tested three machine learning algorithms (random forest, XGBoost, neural network) as well as a linear land-use regression (LUR) model to estimate the pollutant levels based on the SVI features. The researchers identified four common image quality issues (overexposure, blur, underexposure, and incorrect feature identification) and discussed how they affected the predictions.
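As a rough illustration of how exposure problems might be screened before feature extraction, the sketch below flags an image by its mean brightness. The summary does not say how the authors detected quality issues, so the approach, the function name, and both thresholds are assumptions.

```python
from statistics import mean

# Hypothetical thresholds on mean 8-bit luminance (0 = black, 255 = white).
UNDEREXPOSED_BELOW = 40
OVEREXPOSED_ABOVE = 215


def exposure_flag(gray_pixels):
    """Classify a grayscale image (iterable of 0-255 values) by mean brightness."""
    level = mean(gray_pixels)
    if level < UNDEREXPOSED_BELOW:
        return "underexposed"
    if level > OVEREXPOSED_ABOVE:
        return "overexposed"
    return "ok"
```

A mean-brightness test says nothing about blur, which would need a sharpness measure instead.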

Overall, the machine learning methods outperformed the linear LUR model, ranking random forest > XGBoost > neural network > LUR. Averaging features from multiple-angle SVIs was more effective than using a single angle, since a single view can miss features. The optimal strategy was to sample SVIs within a 100m radius buffer and average their features, which kept absolute errors at each aggregation location mostly below 2.5 μg/m^3 or ppb. Overexposure, blur, and underexposure caused the image analysis to overestimate road features and underestimate human-activity features, degrading the pollutant estimates.
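A minimal sketch of the averaging strategy and the error measure follows. It assumes each viewing angle yields a dictionary of feature values (for example, the share of pixels labelled as road); the feature names, data layout, and function names are hypothetical and not taken from the paper.

```python
def average_features(per_angle_features):
    """Average each feature over the available viewing angles.

    per_angle_features maps angle (degrees) -> {feature name: value}.
    Every angle is assumed to report the same feature names.
    """
    angles = list(per_angle_features)
    names = per_angle_features[angles[0]].keys()
    return {name: sum(per_angle_features[a][name] for a in angles) / len(angles)
            for name in names}


def absolute_errors(observed, predicted):
    """Element-wise absolute error between observed and predicted values."""
    return [abs(o - p) for o, p in zip(observed, predicted)]


# Illustrative values only, not data from the study.
views = {
    0:   {"road": 0.5,  "sky": 0.5},
    90:  {"road": 0.25, "sky": 0.5},
    180: {"road": 0.25, "sky": 0.25},
    270: {"road": 0.0,  "sky": 0.25},
}
averaged = average_features(views)
```

In this toy example a single-angle sample would report a road share anywhere from 0.0 to 0.5, while the averaged value reflects all four views.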

Critical Analysis

The paper provides a comprehensive evaluation of using street view images and machine learning to estimate local air pollution levels, addressing important practical considerations around sampling strategies and image quality.

One limitation is the reliance on taxi fleets for data collection, which may not be feasible or representative in all locations. The researchers acknowledge the need for further validation with independent reference data.

Additionally, the impact of seasonal or meteorological variations on the relationships between image features and air pollution was not addressed. Incorporating these factors could improve the robustness of the models.

While the results demonstrate the potential of this approach, the researchers note that further work is needed to develop a fully operational system that can be deployed at scale. Challenges remain in automating the image processing and ensuring consistent data quality across diverse urban environments.

Conclusion

This study showcases a promising method for predicting on-road air pollution levels by integrating mobile monitoring data with street view imagery. The findings highlight the importance of careful sampling strategies and image quality control to achieve reliable air pollution estimates.

The proposed optimal approach of using 100m radius SVIs and averaging features across multiple angles provides a practical framework that could be further developed and scaled up for urban environments. This research contributes to the growing field of leveraging computer vision and machine learning for environmental sensing.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!


Related Papers


How to predict on-road air pollution based on street view images and machine learning: a quantitative analysis of the optimal strategy

Hui Zhong, Di Chen, Pengqin Wang, Wenrui Wang, Shaojie Shen, Yonghong Liu, Meixin Zhu

On-road air pollution exhibits substantial variability over short distances due to emission sources, dilution, and physicochemical processes. Integrating mobile monitoring data with street view images (SVIs) holds promise for predicting local air pollution. However, algorithms, sampling strategies, and image quality introduce extra errors due to a lack of reliable references that quantify their effects. To bridge this gap, we employed 314 taxis to monitor NO, NO2, PM2.5 and PM10 dynamically and sampled corresponding SVIs, aiming to develop a reliable strategy. We extracted SVI features from ~382,000 streetscape images, which were collected at various angles (0°, 90°, 180°, 270°) and ranges (buffers with radii of 100m, 200m, 300m, 400m, 500m). Also, three machine learning algorithms alongside the linear land-use regression (LUR) model were experimented with to explore the influences of different algorithms. Four typical image quality issues were identified and discussed. Generally, machine learning methods outperform linear LUR for estimating the four pollutants, with the ranking: random forest > XGBoost > neural network > LUR. Compared to single-angle sampling, the averaging strategy is an effective method to avoid bias of insufficient feature capture. Therefore, the optimal sampling strategy is to obtain SVIs at a 100m radius buffer and extract features using the averaging strategy. This approach achieved estimation results for each aggregation location with absolute errors almost less than 2.5 μg/m^3 or ppb. Overexposure, blur, and underexposure led to image misjudgments and incorrect identifications, causing an overestimation of road features and underestimation of human-activity features, contributing to inaccurate NO, NO2, PM2.5 and PM10 estimation.

9/20/2024

Predicting Lung Disease Severity via Image-Based AQI Analysis using Deep Learning Techniques

Anvita Mahajan, Sayali Mate, Chinmayee Kulkarni, Suraj Sawant

Air pollution is a significant health concern worldwide, contributing to various respiratory diseases. Advances in air quality mapping, driven by the emergence of smart cities and the proliferation of Internet-of-Things sensor devices, have led to an increase in available data, fueling momentum in air pollution forecasting. The objective of this study is to devise an integrated approach for predicting air quality using image data and subsequently assessing lung disease severity based on Air Quality Index (AQI). The aim is to implement an integrated approach by refining existing techniques to improve accuracy in predicting AQI and lung disease severity. The study aims to forecast additional atmospheric pollutants like AQI, PM10, O3, CO, SO2, NO2 in addition to PM2.5 levels. Additionally, the study aims to compare the proposed approach with existing methods to show its effectiveness. The approach used in this paper uses VGG16 model for feature extraction in images and neural network for predicting AQI. In predicting lung disease severity, Support Vector Classifier (SVC) and K-Nearest Neighbors (KNN) algorithms are utilized. The neural network model for predicting AQI achieved training accuracy of 88.54% and testing accuracy of 87.44%, which was measured using loss function, while the KNN model used for predicting lung disease severity achieved training accuracy of 98.4% and testing accuracy of 97.5%. In conclusion, the integrated approach presented in this study forecasts air quality and evaluates lung disease severity, achieving high testing accuracies of 87.44% for AQI and 97.5% for lung disease severity using neural network, KNN, and SVC models. The future scope involves implementing transfer learning and advanced deep learning modules to enhance prediction capabilities. While the current study focuses on India, the objective is to expand its scope to encompass global coverage.

5/8/2024

Urban Air Pollution Forecasting: a Machine Learning Approach leveraging Satellite Observations and Meteorological Forecasts

Giacomo Blanco, Luca Barco, Lorenzo Innocenti, Claudio Rossi

Air pollution poses a significant threat to public health and well-being, particularly in urban areas. This study introduces a series of machine-learning models that integrate data from the Sentinel-5P satellite, meteorological conditions, and topological characteristics to forecast future levels of five major pollutants. The investigation delineates the process of data collection, detailing the combination of diverse data sources utilized in the study. Through experiments conducted in the Milan metropolitan area, the models demonstrate their efficacy in predicting pollutant levels for the forthcoming day, achieving a percentage error of around 30%. The proposed models are advantageous as they are independent of monitoring stations, facilitating their use in areas without existing infrastructure. Additionally, we have released the collected dataset to the public, aiming to stimulate further research in this field. This research contributes to advancing our understanding of urban air quality dynamics and emphasizes the importance of amalgamating satellite, meteorological, and topographical data to develop robust pollution forecasting models.

5/31/2024

Forecasting Smog Clouds With Deep Learning

Valentijn Oldenburg, Juan Cardenas-Cartagena, Matias Valdenegro-Toro

In this proof-of-concept study, we conduct multivariate timeseries forecasting for the concentrations of nitrogen dioxide (NO2), ozone (O3), and (fine) particulate matter (PM10 & PM2.5) with meteorological covariates between two locations using various deep learning models, with a focus on long short-term memory (LSTM) and gated recurrent unit (GRU) architectures. In particular, we propose an integrated, hierarchical model architecture inspired by air pollution dynamics and atmospheric science that employs multi-task learning and is benchmarked by unidirectional and fully-connected models. Results demonstrate that, above all, the hierarchical GRU proves itself as a competitive and efficient method for forecasting the concentration of smog-related pollutants.

10/4/2024