Prediction of PM2.5 using a hybrid network (ANN-GA) Case study: Urmia city

Article Type:
Case Study (no valid ranking)
Abstract:
Introduction

Over the last 50 years, urbanization, industrialization, and population growth have made air pollution a significant and inseparable part of our lives. Air pollution can be defined as the presence of chemicals or toxic compounds in the air to the extent that they pose a health risk. Emissions from cars, plant chemicals, dust, pollen, and mold spores are classed as particulate matter (PM). The World Health Organization reported that ambient air pollution causes 4.2 million deaths each year from strokes, heart disease, lung cancer, and chronic respiratory diseases. Of the various pollutants affecting air quality, particulate matter smaller than 2.5 microns (PM2.5) is the major air pollution problem (Ścibor et al., 2020). There is also growing evidence of the effects of PM10 and PM2.5 on cardiovascular disease (CVD) and respiratory disease (RD).
Forecasting air pollutants provides an opportunity to determine the intensity of air pollution in different areas and to prevent irreversible impacts. Forecasting models also allow decision-makers to make the right decisions and to prepare for the prevention or control of PM in the future. Models used in air pollution forecasting studies include the autoregressive integrated moving average (ARIMA), artificial neural networks (ANN), the Community Multiscale Air Quality model (CMAQ), the Weather Research and Forecasting model coupled with Chemistry (WRF-Chem), fuzzy models, grey models, and hybrid models. ANNs have been used extensively by scientists to provide rapid and parsimonious solutions for mitigating the negative impacts of air pollution worldwide. Neural networks have been applied successfully to air pollution forecasting and have produced accurate results on time series data that contain different types of noise and nonlinear structure.
Hybrid modeling approaches, in which several methods or attributes are merged to create a more sophisticated model with superior performance in certain scenarios, have a wide variety of applications.
Urmia is one of Iran's most polluted cities, owing to continuous traffic congestion, growing CO2 and PM levels, and a lack of knowledge about regulating and siting industrial manufacturing units. Region-specific sources of air pollution include dust from Iraq and temperature inversion, which occurs 90 days a year. In addition, the drying of Urmia Lake, which can cause salt storms, is a critical concern that is expected to lead to significant pollution in the near future.
In this study, ANN-GA with missing data imputation was used for short-term prediction of PM2.5 in Urmia, Iran, to demonstrate how data-gap filling and preprocessing methods can improve the performance of hybrid models.

Methodology

The concentrations of air pollutants (carbon monoxide, nitrogen dioxide, and sulfur dioxide) and meteorological data (temperature, relative humidity, and wind velocity) were used as inputs to predict PM2.5. Air pollution concentrations and meteorological data over a two-year period were obtained from Monitoring Station No. 3, Urmia municipality, and from Iran's meteorology website (Data.irimo.ir). The data were then preprocessed with the Savitzky-Golay filter before being fed into the ANN and ANN-GA networks. Both the data with gaps and the imputed data (KNN and SPLINE methods) were used as input to each network, and the results were compared.
In this study, each network contains two hidden layers and one output layer. The data were introduced to the network as a time series and divided into three parts: 70% for training, 15% for validation, and 15% for testing. Two data input scenarios were defined: the first used no imputation, while the second used SPLINE and KNN to fill in data gaps. A sigmoid transfer function (logsig) was used for the hidden layers and a linear transfer function (purelin) for the output layer. The Levenberg-Marquardt algorithm was chosen as the learning algorithm based on the type of problem and its speed of convergence. To improve the results, the number of neurons, the repetition parameters, the number of permitted evaluations, the Levenberg-Marquardt algorithm parameters, and the reliability were all adjusted by trial and error.
A new ANN-GA network was also used in this study, with GA as the training function. After the data were introduced as a time series and the share of data for training, validation, and testing was selected, the structure and number of network layers were created with the "newff" function. The main difference is that the genetic learning process was used instead of the "train" function.
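The preprocessing steps described above (gap filling, Savitzky-Golay smoothing, and the 70/15/15 split) can be illustrated with a minimal, dependency-free Python sketch. It is not the authors' implementation. The function names, the 5-point quadratic filter window, the order of the steps (impute first, then smooth), and the unshuffled chronological split are all assumptions made for illustration, and only KNN imputation is shown. In practice, library routines such as `scipy.signal.savgol_filter` and `sklearn.impute.KNNImputer` would likely be used instead.

```python
import math

# 5-point quadratic Savitzky-Golay coefficients (window and order are assumptions).
SG_COEFFS = (-3 / 35, 12 / 35, 17 / 35, 12 / 35, -3 / 35)


def knn_impute(rows, k=2):
    """Fill None entries with the column mean over the k nearest complete rows."""
    complete = [r for r in rows if None not in r]
    if not complete:
        raise ValueError("KNN imputation needs at least one complete row")
    filled = []
    for row in rows:
        if None not in row:
            filled.append(list(row))
            continue
        observed = [j for j, v in enumerate(row) if v is not None]
        # Euclidean distance measured only over the columns this row has.
        nearest = sorted(
            complete,
            key=lambda c: math.sqrt(sum((c[j] - row[j]) ** 2 for j in observed)),
        )[:k]
        filled.append([
            v if v is not None else sum(n[j] for n in nearest) / len(nearest)
            for j, v in enumerate(row)
        ])
    return filled


def savgol_smooth(series):
    """Smooth interior points; the two points at each edge are left unchanged."""
    out = list(series)
    for i in range(2, len(series) - 2):
        out[i] = sum(c * series[i + o] for c, o in zip(SG_COEFFS, range(-2, 3)))
    return out


def chronological_split(samples, train_pct=70, val_pct=15):
    """Split into train/validation/test without shuffling (an assumption)."""
    n_train = len(samples) * train_pct // 100
    n_val = len(samples) * val_pct // 100
    return (samples[:n_train],
            samples[n_train:n_train + n_val],
            samples[n_train + n_val:])


# Toy usage: rows are hypothetical (temperature, CO) pairs with one gap.
raw = [[1.0, 10.0], [2.0, None], [3.0, 30.0], [4.0, 40.0]]
filled = knn_impute(raw, k=2)
smoothed_co = savgol_smooth([row[1] for row in filled])
train, val, test = chronological_split(list(range(20)))
```

Running the no-imputation scenario would mean skipping `knn_impute` and dropping incomplete rows instead, which is one way to compare the two input scenarios on equal terms.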
It is worth noting that the network layer characteristics were the same in both methods. To carry out training, the new learning function requires several supporting steps: cost function creation, selection, crossover, and mutation. Three selection methods were used: roulette wheel, tournament, and random selection. To define the cost function, the weights were taken from those created by the "newff" function. Different values were assigned to the initial population variables, the maximum mutation number, and the selection pressure coefficient by trial and error. Moreover, two data input scenarios were defined.
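The idea of replacing gradient-based training with a genetic search over the network weights can be sketched in plain Python as follows. This is a hedged illustration rather than the study's code. The layer sizes, population size, mutation settings, single-point crossover, elitism, and random weight initialization (standing in for the weights produced by "newff") are assumptions, and only tournament selection is shown out of the three selection methods mentioned.

```python
import math
import random

SIZES = (2, 3, 3, 1)  # inputs, hidden 1, hidden 2, output (illustrative sizes)
N_WEIGHTS = sum((a + 1) * b for a, b in zip(SIZES, SIZES[1:]))  # +1 for each bias


def logsig(x):
    """Logistic sigmoid, clamped to avoid overflow in math.exp."""
    return 1.0 / (1.0 + math.exp(-max(-60.0, min(60.0, x))))


def predict(weights, x):
    """Forward pass: logsig hidden layers, linear (purelin-style) output."""
    pos, act = 0, list(x)
    last = len(SIZES) - 2
    for layer, (n_in, n_out) in enumerate(zip(SIZES, SIZES[1:])):
        nxt = []
        for _ in range(n_out):
            z = weights[pos + n_in] + sum(
                w * a for w, a in zip(weights[pos:pos + n_in], act))
            pos += n_in + 1
            nxt.append(z if layer == last else logsig(z))
        act = nxt
    return act[0]


def mse(weights, data):
    """Cost function: mean squared error over (inputs, target) pairs."""
    return sum((predict(weights, x) - y) ** 2 for x, y in data) / len(data)


def tournament(pop, costs, rng, size=3):
    """Pick the lowest-cost individual among `size` random candidates."""
    best = min(rng.sample(range(len(pop)), size), key=lambda i: costs[i])
    return pop[best]


def train_ga(data, pop_size=30, generations=60, mut_rate=0.1, seed=0):
    rng = random.Random(seed)
    pop = [[rng.uniform(-1, 1) for _ in range(N_WEIGHTS)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        costs = [mse(ind, data) for ind in pop]
        elite = pop[min(range(pop_size), key=lambda i: costs[i])]
        history.append(min(costs))
        children = [elite[:]]  # elitism: the best individual survives unchanged
        while len(children) < pop_size:
            a, b = tournament(pop, costs, rng), tournament(pop, costs, rng)
            cut = rng.randrange(1, N_WEIGHTS)  # single-point crossover
            child = a[:cut] + b[cut:]
            # Gaussian mutation applied gene by gene.
            child = [w + rng.gauss(0, 0.3) if rng.random() < mut_rate else w
                     for w in child]
            children.append(child)
        pop = children
    costs = [mse(ind, data) for ind in pop]
    best = min(range(pop_size), key=lambda i: costs[i])
    history.append(costs[best])
    return pop[best], history
```

Because the best individual is copied into every new generation, the best cost recorded in `history` can never get worse from one generation to the next. Swapping `tournament` for a roulette-wheel or purely random picker would change only the selection step.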

Conclusion

Forecasting methods are considered an important tool in air pollution research. Among the various pollutants that influence air quality, particles with an aerodynamic diameter of less than 2.5 micrometers (PM2.5) are one of the key issues in air pollution control management. In this study, a model for predicting future concentrations of PM2.5 was developed using a hybrid network (ANN-GA). Two data imputation methods (KNN and SPLINE) were used to minimize training issues and improve network accuracy. PM10, PM2.5, nitrogen dioxide, sulfur dioxide, carbon monoxide, and weather data were used for the predictions. The results show that multilayer neural networks are relatively efficient for predictive purposes but lack sufficient accuracy on their own. The ANN produced an MSE of 0.023 and a correlation coefficient (R) of 0.543, and only when data gap filling methods were applied. To improve R and reduce network error, a genetic algorithm was combined with the multilayer neural network (ANN-GA). The hybrid network improved both measures (R = 0.91 and MSE = 0.001). Compared to ANN, R increased by 40 percent and the MSE improved by 95 percent. Thus, it can be concluded that ANN-GA can be used as a powerful and reliable tool for predicting air pollution.
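For readers who want to see how the two reported measures and the percentage comparisons relate, the following Python sketch gives standard definitions of MSE and the Pearson correlation coefficient and applies a simple change calculation to the figures quoted above. The abstract does not state how its percentages were derived, so the formulas below are assumptions. Under them, the MSE reduction comes out near the reported 95 percent, while the gain in R is about 0.37 in absolute terms and about 68 percent relative to the ANN value.

```python
import math


def mse(predicted, observed):
    """Mean squared error between two equal-length sequences."""
    return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed)


def pearson_r(predicted, observed):
    """Pearson correlation coefficient between predictions and observations."""
    n = len(observed)
    mp, mo = sum(predicted) / n, sum(observed) / n
    cov = sum((p - mp) * (o - mo) for p, o in zip(predicted, observed))
    sp = math.sqrt(sum((p - mp) ** 2 for p in predicted))
    so = math.sqrt(sum((o - mo) ** 2 for o in observed))
    return cov / (sp * so)


def relative_change(before, after):
    """Signed change expressed as a fraction of the starting value."""
    return (after - before) / before


# Figures quoted in the conclusion.
ann = {"mse": 0.023, "r": 0.543}
ann_ga = {"mse": 0.001, "r": 0.91}

mse_reduction = -relative_change(ann["mse"], ann_ga["mse"])  # about 0.957
r_gain_abs = ann_ga["r"] - ann["r"]                          # about 0.367
r_gain_rel = relative_change(ann["r"], ann_ga["r"])          # about 0.676
```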

Language:
Persian
Published:
Journal of Environmental Science Studies, Volume:9 Issue: 3, 2024
Pages:
8898 to 8912
magiran.com/p2683540  