Dealing with the hazards caused by PM2.5 pollutant concentrations: using regression methods and spatio-temporal similarity to impute missing values in their time series (case study of Tehran)
Because pollutants have adverse effects on the environment and human health, the analysis of air quality data plays an important role in protecting the environment and tackling air pollution problems. Missing data in time series, especially in air pollution data, pose a particular challenge to this analysis, which makes imputation methods important for dealing with the problem. Missing values reduce the volume of data, distort temporal patterns and lead to inaccurate conclusions in data analysis. In this study, to estimate the missing values in time series of PM2.5 concentration from 12 pollution monitoring stations in Tehran, a hybrid algorithm is presented that is based on regression and accounts for spatial and temporal similarity and dependence by means of the dynamic time warping algorithm. To evaluate the performance of the single and multiple imputation models, missing values were simulated at rates of 10, 15 and 20%, following a pattern similar to that of the original data. The proposed method was then implemented in combination with different multiple imputation methods, such as classification and regression tree, random sample and predictive mean matching, and the results were compared with those of single imputation methods. The results indicated that the proposed method combined with regression tree and linear interpolation was superior to the other multiple and single imputation methods.
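The abstract does not spell out how the regression and the dynamic time warping (DTW) similarity are combined, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes that each neighboring station's series is complete, that similarity is measured by a plain DTW distance over the time steps where the target station has observations, and that a simple linear regression on the single most similar station is used to fill the gaps. The function names `dtw_distance` and `impute_from_similar_station` are hypothetical.

```python
import numpy as np


def dtw_distance(a, b):
    """Classic O(n*m) dynamic time warping distance between two 1-D sequences."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed warping steps.
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return float(cost[n, m])


def impute_from_similar_station(target, neighbors):
    """Fill NaNs in `target` by regressing on its DTW-closest neighbor.

    target:    1-D sequence with NaN marking missing values.
    neighbors: dict mapping a station name to a 1-D sequence of the same
               length (assumed complete; an assumption of this sketch).
    Returns the filled array and the name of the chosen neighbor.
    """
    target = np.asarray(target, dtype=float)
    observed = ~np.isnan(target)
    series = {k: np.asarray(v, dtype=float) for k, v in neighbors.items()}

    # Pick the neighbor whose series is closest to the target under DTW,
    # comparing only the time steps where the target was observed.
    best = min(series, key=lambda k: dtw_distance(target[observed], series[k][observed]))

    # Fit target ~ slope * neighbor + intercept on the observed time steps.
    x = series[best]
    slope, intercept = np.polyfit(x[observed], target[observed], 1)

    # Predict the missing values from the neighbor's readings.
    filled = target.copy()
    filled[~observed] = slope * x[~observed] + intercept
    return filled, best
```

Calling `impute_from_similar_station(target, {"station_a": a, "station_b": b})` returns the completed series together with the name of the station used as predictor. In a setup closer to the one described, the regression output would presumably feed into, or be combined with, a multiple imputation method such as regression trees or predictive mean matching rather than being used directly.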