Statistical Analysis and Forecast Modeling of PM2.5 Concentration Using Artificial Intelligence Based on Machine Learning in Mashhad (2016-2022)

Article Type:
Research/Original Article (holds an accredited ranking)
Abstract:
Background and Purpose

This study aims to forecast PM2.5 concentrations using four non-linear Machine Learning (ML) models.

Materials and Methods

The ML techniques employed include Light Gradient Boosting Machine (LGBM), Extreme Gradient Boosting Regressor (XGBR), Random Forest (RF), and Gradient Boosting Regressor (GBR). Meteorological and pollutant data were collected to predict the Air Quality Index (AQI) in Mashhad, Khorasan Razavi Province, Iran, for the period from 2016 to 2022.

Results

The ML models performed exceptionally well in predicting PM2.5 concentrations, with approximately 95% of their predictions falling within a factor of the observed values. Additionally, the predicted PM2.5 concentrations were compared with observed values to assess prediction accuracy. Among the four ML models, GBR demonstrated the best performance, achieving high accuracy metrics, including a coefficient of determination (R²) of 0.9802, a mean absolute error (MAE) of 0.54, a mean squared error (MSE) of 5.33, a root mean squared error (RMSE) of 2.31, and a mean absolute percentage error (MAPE) of 1.9%.
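The accuracy metrics named above follow standard definitions, and a minimal sketch of how they could be computed is given here. The function name `regression_metrics` and the sample values are hypothetical and are not taken from the study; the sketch uses only the Python standard library and is not the authors' implementation.

```python
import math


def regression_metrics(observed, predicted):
    """Return R², MAE, MSE, RMSE and MAPE (%) for paired observed/predicted values.

    Hypothetical helper for illustration; assumes no observed value is zero
    (MAPE is undefined there) and that the observed values are not all equal
    (R² is undefined there).
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("observed and predicted must be non-empty and of equal length")

    n = len(observed)
    errors = [o - p for o, p in zip(observed, predicted)]

    mae = sum(abs(e) for e in errors) / n           # mean absolute error
    mse = sum(e * e for e in errors) / n            # mean squared error
    rmse = math.sqrt(mse)                           # root mean squared error
    mape = 100.0 * sum(abs(e / o) for e, o in zip(errors, observed)) / n

    mean_obs = sum(observed) / n
    ss_res = sum(e * e for e in errors)             # residual sum of squares
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot                      # coefficient of determination

    return {"r2": r2, "mae": mae, "mse": mse, "rmse": rmse, "mape": mape}


# Hypothetical PM2.5 values, chosen only to exercise the function.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]
metrics = regression_metrics(observed, predicted)
```

Because RMSE is the square root of MSE, the two reported values can be checked against each other: the square root of 5.33 is approximately 2.31, which matches the RMSE reported for GBR.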

Conclusion

This study proposes a high-accuracy PM2.5 prediction method using ML, which can be beneficial for global air quality monitoring and improving acute exposure assessments in epidemiological research.

Open Access Policy: This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited. To view a copy of this licence, visit https://creativecommons.org/licenses/by/4.0/

Language:
Persian
Published:
Journal of Research in Environmental Health, Volume:10 Issue: 4, 2025
Pages:
22 to 35
https://www.magiran.com/p2835666