Evaluating and Forecasting the Probability of Lightning Occurrence in Rasht City

Article Type:
Research/Original Article (accredited ranking)
Abstract:

Lightning is one of the most severe weather hazards and causes significant economic, social and environmental damage each year. Predicting lightning is very difficult because of the wide spatial and temporal variation of weather conditions, both physical and dynamical. Timely forecasting of lightning, and identifying the best-performing data mining model, can therefore help reduce damage. This research used data from the Rasht Meteorological Station for the 7 years from 2012 to 2018. The dependent variable was the occurrence or non-occurrence of lightning, and the independent variables were factors affecting lightning: temperature, relative humidity, cloudiness, wind speed, wind direction, air pressure and the previous day's lightning. After preprocessing, the data were modeled in SPSS Modeler version 20 using the Classification & Regression Tree (CART), Chi-squared Automatic Interaction Detector (CHAID) and C5 decision trees, the Radial Basis Function (RBF) and Multi-Layer Perceptron (MLP) neural networks, and the Support Vector Machine (SVM). The model results were compared using comparative evaluation criteria and the receiver operating characteristic (ROC) curve. According to the models, the probability of lightning occurrence is higher in May, June and July than in other months, and the rate of occurrence declines from spring to winter, reaching its minimum in winter. The CHAID tree, with a specificity of 0.794 and the lowest false positive rate (0.205), and the SVM model, with an accuracy of 0.773, a root mean square error of 0.475 and a precision of 0.855, performed best among the models.

Introduction

Lightning is the ionization of the atmosphere caused by an increased potential difference between cloud and ground, followed by a rapid discharge of electricity accompanied by light and sound. More intense lightning activity is associated with thunderstorms, heavy rain, floods and tornadoes. Rasht is the largest rice-growing city in the country and produces 11% of the rice the country requires. In recent years, lightning-related damage, including rice lodging and the risk of paddy disease, roads blocked by floods, traffic congestion, damage to buildings, bridge collapse and deaths from electric shock, has made lightning prediction far more important. The main purpose of this study was to use ground records of the occurrence and non-occurrence of lightning (binary data), together with related meteorological parameters (temperature, relative humidity, cloudiness, wind speed, wind direction, air pressure and the previous day's lightning), to estimate the probability of future lightning occurrence with data mining (tree and neural network models), and to evaluate the models and determine the optimal one in order to reduce future damage.

Materials and Methods

In this study, binary lightning data and atmospheric parameters (temperature, relative humidity, cloudiness, wind speed, wind direction, air pressure and the previous day's lightning) were obtained from the Rasht Meteorological Station for the years 2012-2018. The data were then normalized to the range zero to one according to Eq. (1), where $X_{\min}$ and $X_{\max}$ are the minimum and maximum of each variable, and the classes were balanced using the random undersampling (RUS) and random oversampling (ROS) algorithms in RapidMiner.
$X_n = \frac{X - X_{\min}}{X_{\max} - X_{\min}}$    Eq. (1)
The variables were then transformed in SPSS, after their statistical properties and correlations had been determined, to reduce errors. Finally, SPSS Modeler was used to predict the future occurrence and non-occurrence of lightning with the CART, CHAID and C5 trees and the Multi-Layer Perceptron (MLP), Radial Basis Function (RBF) and Support Vector Machine (SVM) models. The training set contained 70% of the data and the testing set 30%. The model outputs were then evaluated with the confusion matrix, comparative criteria and the ROC curve, based on Eqs. (2)-(9).
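$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$    Eq. (2)
$\text{Precision} = \frac{TP}{TP + FP}$    Eq. (3)
$\text{Sensitivity} = \frac{TP}{TP + FN}$    Eq. (4)
$\text{Harmonic Mean} = \frac{2 \cdot \text{Precision} \cdot \text{Sensitivity}}{\text{Precision} + \text{Sensitivity}}$    Eq. (5)
$\text{Specificity} = \frac{TN}{TN + FP}$    Eq. (6)
$\text{False Positive Rate} = \frac{FP}{FP + TN}$    Eq. (7)
$\text{False Negative Rate} = \frac{FN}{FN + TP}$    Eq. (8)
$\text{RMSE} = \sqrt{\frac{1}{N} \sum (P - O)^2}$    Eq. (9)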
Here O is the observed value, P is the predicted value, N is the number of data points, and TP, TN, FP and FN are the numbers of true positives, true negatives, false positives and false negatives, respectively.
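The normalization in Eq. (1) and the evaluation measures in Eqs. (2)-(9) can be illustrated with a short Python sketch. This is not the authors' workflow, which used RapidMiner and SPSS Modeler; the function names and the small label lists below are hypothetical and serve only to show how the formulas are applied to binary lightning labels (1 = lightning, 0 = no lightning).

```python
import math


def min_max_normalize(values):
    """Scale a list of numbers to the range [0, 1], as in Eq. (1)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant column is mapped to 0.0 to avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def confusion_counts(observed, predicted):
    """Count TP, TN, FP and FN for binary labels (1 = lightning, 0 = none)."""
    pairs = list(zip(observed, predicted))
    tp = sum(1 for o, p in pairs if o == 1 and p == 1)
    tn = sum(1 for o, p in pairs if o == 0 and p == 0)
    fp = sum(1 for o, p in pairs if o == 0 and p == 1)
    fn = sum(1 for o, p in pairs if o == 1 and p == 0)
    return tp, tn, fp, fn


def evaluation_metrics(tp, tn, fp, fn):
    """Eqs. (2)-(8); assumes every denominator is non-zero."""
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "precision": precision,
        "sensitivity": sensitivity,
        "harmonic_mean": 2 * precision * sensitivity / (precision + sensitivity),
        "specificity": tn / (tn + fp),
        "false_positive_rate": fp / (fp + tn),
        "false_negative_rate": fn / (fn + tp),
    }


def rmse(observed, predicted):
    """Root mean square error between predictions and observations, Eq. (9)."""
    n = len(observed)
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(observed, predicted)) / n)


# Made-up example labels, not data from the study.
observed = [1, 1, 1, 0, 0, 1, 0, 1]
predicted = [1, 0, 1, 0, 1, 1, 0, 1]

tp, tn, fp, fn = confusion_counts(observed, predicted)
for name, value in evaluation_metrics(tp, tn, fp, fn).items():
    print(f"{name}: {value:.3f}")
print(f"rmse: {rmse(observed, predicted):.3f}")
print(min_max_normalize([10.0, 20.0, 30.0]))
```

When the predictions are hard 0/1 labels, the squared RMSE equals the share of misclassified cases, that is, 1 minus the accuracy. If the study computed RMSE this way, which is an assumption, the reported SVM values are mutually consistent: 0.475 squared is about 0.226, close to 1 - 0.773 = 0.227.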

Results and Discussion

Lightning is one of the most important environmental hazards, and data mining is a suitable method for predicting it. The results show that prediction using data mining techniques is feasible and effective. The probability of lightning occurrence is highest in spring (May and June) and summer (July); it then declines and reaches its minimum in winter. Overall, the models indicate that the occurrence of lightning is more probable than its non-occurrence. Among the three trees (CART, CHAID and C5), CART and C5 had less satisfactory indices and did not achieve the highest accuracy or precision in predicting future lightning. The CHAID tree, in contrast, predicted correctly in a proportion of 0.76 of cases with a precision of 0.85, and predicted a lightning occurrence rate of 0.54, which is close to the observed value of 0.62. Among the artificial network models, the Support Vector Machine (SVM), with an accuracy of 0.77, a precision of 0.85 and a predicted lightning occurrence probability of 0.60, outperformed the Radial Basis Function (RBF) and Multi-Layer Perceptron (MLP) models. In terms of classification performance and the area under the ROC curve (AUC), the CHAID tree (0.829) was the best of the trees and the SVM (0.853) the best of the artificial network models. The closeness of these predictions to the observed values shows that trees and artificial networks are effective in predicting the probability of future lightning, and that the CHAID tree and the SVM perform best among the models considered.
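For readers unfamiliar with the AUC values quoted above, the following Python sketch shows one standard way to compute the area under the ROC curve: as the fraction of (lightning, no-lightning) pairs in which the lightning case receives the higher predicted score, with ties counted as one half. The function name and the example scores are hypothetical; the study obtained its AUC values from SPSS Modeler, not from this code.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels: binary outcomes (1 = lightning, 0 = no lightning)
    scores: predicted probabilities or scores, higher meaning more likely lightning
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


# Made-up example, not data from the study.
labels = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.4, 0.7, 0.3, 0.2, 0.6]
print(f"AUC = {roc_auc(labels, scores):.3f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, so the reported values of 0.829 and 0.853 both indicate that lightning days are ranked well above non-lightning days.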

Conclusion

The model outputs indicate that the probability of lightning occurring in Rasht city is very high. The models show a similar trend for April, but the maximum lightning occurrence is in spring (May and June), owing to unstable weather conditions, and occurrence in summer (July) is higher than in autumn and winter. Occurrence declines from spring to winter and reaches its minimum in winter. Comparing the CHAID tree with the Support Vector Machine (SVM) model, the SVM was identified, by a slight margin, as the optimal model for predicting future lightning, with accuracy = 0.773, precision = 0.855, harmonic mean = 0.813, root mean square error = 0.475 and false negative rate = 0.198. Given its reliable outputs, with the highest accuracy and precision and the lowest prediction error, the SVM model performs well and can be used to forecast the probability of lightning occurrence in Rasht City. According to the models, the most influential parameters for lightning occurrence are, in order of importance, the previous day's lightning, temperature, air pressure, relative humidity and cloudiness; the other parameters are less important. Using data mining techniques, and in particular forecasting the probability of future lightning occurrence with the SVM as the most accurate and precise model, supports more accurate meteorological forecasts and more effective action to reduce future damage.

Language:
Persian
Published:
Geography and Sustainability of Environment, Volume:10 Issue: 34, 2020
Pages:
21 to 35
magiran.com/p2139152  