Tabriz Daily Rainfalls Modeling via Hybridized Tree Based and Seasonal-Trend Component Bagging Method

Article Type:
Research/Original Article (accredited rank)
Abstract:
Introduction

Precipitation is one of the most important components of the water cycle. Accurate precipitation measurement is essential for flood forecasting and control, drought analysis, runoff modeling, sediment control and management, watershed management, agricultural irrigation planning, and water quality studies. Determining the correct amount of precipitation in urban and rural areas is also important for flood management. The precipitation process is strongly non-linear and random in both time and space. Because it depends on many climatic factors, simple linear models describe it poorly and can produce large errors. Various methods and models have therefore been proposed to evaluate and predict precipitation. This study aimed to estimate the daily precipitation of Tabriz from neighboring-station data using tree-based methods hybridized with the Bagging method.

Materials and Methods

In the present study, the 1986-2021 rainfall data of adjacent stations in the Urmia Lake basin (Sahand, Sarab, Urmia, Maragheh and Mahabad) were used to estimate the daily rainfall in Tabriz. About 70% of the data were used for calibration and 30% for validation. Input combinations were identified using the correlation matrix and the Relief algorithm. Modeling was performed with the tree-based data mining methods M5P, RT and REPT and with the Bagging method. The daily precipitation series of Tabriz was decomposed by the seasonal-trend analysis method, and the resulting trend, seasonal and residual components were used in different input scenarios to investigate their effect on the modeling results. Modeling performance was evaluated using the correlation coefficient, Root Mean Square Error, Nash-Sutcliffe Efficiency and the modified Willmott coefficient.
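
The four evaluation indices named above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `evaluate` and the sample values are hypothetical, and the modified Willmott coefficient is assumed here to be the absolute-value form usually written d1, since the abstract does not give the exact formula.

```python
import math

def evaluate(observed, predicted):
    """Return r, RMSE, NSE and the modified Willmott index (d1, assumed form)."""
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("need two series of equal length with at least 2 values")
    o_mean = sum(observed) / n
    p_mean = sum(predicted) / n
    ss_obs = sum((o - o_mean) ** 2 for o in observed)
    ss_pred = sum((p - p_mean) ** 2 for p in predicted)
    if ss_obs == 0:
        raise ValueError("observed series is constant; NSE is undefined")
    cov = sum((o - o_mean) * (p - p_mean) for o, p in zip(observed, predicted))
    sq_err = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    abs_err = sum(abs(o - p) for o, p in zip(observed, predicted))
    denom_r = math.sqrt(ss_obs * ss_pred)
    # Pearson correlation; undefined (NaN) when the predictions are constant
    r = cov / denom_r if denom_r > 0 else float("nan")
    rmse = math.sqrt(sq_err / n)
    # Nash-Sutcliffe Efficiency: 1 is a perfect fit, 0 matches the observed mean
    nse = 1 - sq_err / ss_obs
    # Modified Willmott index with absolute differences instead of squares
    d1 = 1 - abs_err / sum(
        abs(p - o_mean) + abs(o - o_mean) for o, p in zip(observed, predicted)
    )
    return {"r": r, "rmse": rmse, "nse": nse, "d1": d1}

# Hypothetical daily rainfall values in mm, for illustration only
observed_mm = [0.0, 2.5, 0.0, 10.0, 1.2]
predicted_mm = [0.1, 2.0, 0.0, 8.5, 1.5]
print(evaluate(observed_mm, predicted_mm))
```

Higher r, NSE and d1 and lower RMSE indicate a better fit, which is how the scenarios would be ranked under these assumptions.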

Results and Discussion

Using RT and REPT as the base algorithm of the Bagging method increased model accuracy and reduced error. This was not the case for M5P, whose bagged results were slightly weaker. Tabriz rainfall was also found to be closely related to Sahand rainfall: most models gave reliable estimates using the rainfall data from Sahand station, which can be explained by the high correlation between rainfall at Tabriz and Sahand. The best scenarios were the first scenario (Sahand) for the M5P, RT, REPT and B-M5P methods, the fifth scenario (Sahand, Sarab, Urmia, Maragheh and Mahabad) for the B-RT method, and the fourth scenario (Sahand, Sarab, Urmia and Mahabad) for the B-REPT method. The best performance was obtained with scenario 1 of the M5P decision tree model, followed by the Bagging method with the M5P base algorithm. In general, the Bagging method produced reliable results. Modeling without the decomposition components was then compared with modeling that included them. Adding the seasonal, trend and residual components to the input combinations significantly improved the accuracy of the results, and the Bagging method also increased modeling accuracy in most cases. The best scenarios were the first scenario (Sahand and residual) for the M5P and B-M5P methods, the tenth scenario (residual, trend, seasonal, Sahand and Sarab) for the RT, REPT and B-REPT methods, and the eighth scenario (residual, trend and Sahand) for the B-RT method. Among the stations, Sahand, owing to its proximity and high correlation, and Sarab, owing to its correlation, had a strong influence on the estimates of precipitation in Tabriz. Overall, the Bagging method with the M5P base algorithm (B-M5P) performed best, in the first scenario.
Thus, adding the precipitation decomposition components and using the Bagging method improved the modeling results of the tree-based data mining methods.
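
The Bagging idea discussed above (train the same base learner on bootstrap resamples, then average the predictions) can be sketched as follows. This is an illustrative stand-in under stated assumptions: M5P, RT and REPT are not reimplemented, a one-split regression stump takes the place of the base tree, the function names are hypothetical, and the rainfall values are invented for the example.

```python
import random

def fit_stump(xs, ys):
    """Fit a one-split regression stump: (threshold, left mean, right mean)."""
    overall = sum(ys) / len(ys)
    best_sse = sum((y - overall) ** 2 for y in ys)
    model = (None, overall, overall)  # no split: predict the overall mean
    # Candidate thresholds; the largest x is skipped so the right side is non-empty
    for t in sorted(set(xs))[:-1]:
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((y - left_mean) ** 2 for y in left)
               + sum((y - right_mean) ** 2 for y in right))
        if sse < best_sse:
            best_sse, model = sse, (t, left_mean, right_mean)
    return model

def predict_stump(model, x):
    threshold, left_mean, right_mean = model
    if threshold is None or x <= threshold:
        return left_mean
    return right_mean

def fit_bagging(xs, ys, n_estimators=25, seed=0):
    """Train one stump per bootstrap resample (sampling with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return models

def predict_bagging(models, x):
    """Average the predictions of all base learners."""
    return sum(predict_stump(m, x) for m in models) / len(models)

# Hypothetical daily rainfall in mm: a neighboring station (input) and the target
neighbor_rain = [0.0, 0.0, 1.0, 2.0, 4.0, 6.0, 9.0, 12.0]
target_rain = [0.0, 0.1, 0.8, 1.5, 3.0, 5.5, 7.0, 11.0]

models = fit_bagging(neighbor_rain, target_rain)
print(predict_bagging(models, 5.0))
```

Averaging over resamples mainly reduces the variance of unstable learners, which may be one reason the abstract reports gains for RT and REPT but not for M5P.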

Conclusion

Our results showed that the Bagging method provided acceptable results in most cases. In the first case, without decomposition components, the first scenario of the M5P method, which uses Sahand precipitation data, was selected as the best method and scenario. Sahand was thus the most effective station for estimating Tabriz rainfall, having the highest correlation with Tabriz and the shortest distance from it. In the second case, with the decomposition components, the accuracy of the results increased significantly. The Bagging method with the M5P base algorithm, using Sahand precipitation and the residual of Tabriz precipitation as inputs, was the best modeling approach. It can be concluded that combining the Bagging method and the decomposition components with data from the station closest to the target station gives the highest accuracy. Bagging models with tree-based base algorithms can therefore be considered simple and widely used methods.

Language:
Persian
Published:
Journal of Water and Soil, Volume 36, Issue 3, 2022
Pages:
407 to 420
magiran.com/p2485286  