Modeling the Flood Hazard Potential in the Aji Chai basin using Data Mining Algorithms

Article Type:
Research/Original Article (accredited rank)
Abstract:
Introduction

Because of its large area and the substantial rainfall it receives during the cold season and spring, the Aji Chai River and its tributaries flood at the onset of spring. This study prepares a flood hazard potential map of the Aji Chai basin using data mining algorithms. Eighteen parameters that influence flood occurrence were used: elevation, slope, aspect, topographic wetness index (TWI), sediment transport index (STI), stream power index (SPI), terrain curvature, rainfall, Normalized Difference Vegetation Index (NDVI), land use, distance to dam, distance to bridge, distance to river, river density, hydrological soil groups, drainage texture, geomorphology, and lithology. The information layers for all parameters were prepared in raster format in ArcGIS.

Research Methodology

The study area is the Aji Chai basin, which lies administratively in East Azerbaijan province. The basin is located east of Lake Urmia and covers about 10,985.9 km². Elevation ranges from 1255 meters at the basin outlet to 3816 meters on the slopes of Sabalan Mountain. The main river draining the basin's surface water is the Aji Chai. Four data mining algorithms were used: Random Forest, Random Subspace, Rotation Forest, and Dagging. The models were built from the locations of 274 past flood events. The map of flood points was prepared from records of the Regional Water Company of East Azerbaijan province, field surveys, and Landsat 8 OLI-TIRS satellite imagery. All models were implemented in WEKA, a machine learning software package developed at the University of Waikato in New Zealand. The accuracy of the flood hazard potential maps was evaluated with the receiver operating characteristic (ROC) curve and the area under the curve (AUC). In the ROC curve, the X-axis shows 1 − specificity, that is, the false positive rate (the percentage of non-flooded pixels incorrectly classified as flooded), and the Y-axis shows sensitivity (the percentage of flooded pixels correctly classified as flooded). The variance inflation factor (VIF) and tolerance (T) indices were used to detect multicollinearity among the independent variables, because collinearity among the selected parameters reduces the accuracy of the final maps.
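The ROC and AUC evaluation described above can be illustrated with a minimal sketch. It is not the authors' WEKA workflow: the function names `roc_points` and `auc_trapezoid` and the toy labels and scores are assumptions made for illustration. It follows the common convention of plotting the false positive rate (1 − specificity) on the X-axis and sensitivity on the Y-axis.

```python
import numpy as np

def roc_points(labels, scores):
    """Sweep a threshold over the scores and return (fpr, tpr) arrays.

    labels: 1 for flood points, 0 for non-flood points (hypothetical data).
    scores: model-predicted flood probability for each point.
    """
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(-scores, kind="mergesort")  # descending score
    labels, scores = labels[order], scores[order]
    tp = np.cumsum(labels)    # true positives at each cut
    fp = np.cumsum(~labels)   # false positives at each cut
    # Keep one point per distinct score so ties are handled together.
    last_of_tie = np.r_[np.diff(scores) != 0, True]
    tpr = np.r_[0.0, tp[last_of_tie] / labels.sum()]      # sensitivity
    fpr = np.r_[0.0, fp[last_of_tie] / (~labels).sum()]   # 1 - specificity
    return fpr, tpr

def auc_trapezoid(fpr, tpr):
    """Area under the ROC curve by the trapezoidal rule."""
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))

# Toy example: three flood and three non-flood points.
fpr, tpr = roc_points([1, 1, 0, 1, 0, 0], [0.9, 0.8, 0.7, 0.6, 0.3, 0.1])
print(round(auc_trapezoid(fpr, tpr), 3))  # 0.889
```

An AUC of 1.0 means every flood point scores higher than every non-flood point, while 0.5 corresponds to random ranking.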

Results

The multicollinearity analysis showed that, except for drainage texture, the independent variables selected for the flood hazard potential maps have low collinearity. Drainage texture was therefore excluded, and the remaining 17 parameters were used in flood hazard modeling with the data mining algorithms. The analysis of parameter importance showed that in the Random Subspace model the most important parameters were, in order, elevation classes, slope, distance to river, and lithology. In the Dagging model, the most important factors were elevation, hydrological soil groups, topographic wetness index, distance to river, and Normalized Difference Vegetation Index. In the Rotation Forest model, the most important factors were, in order, lithology, slope, rainfall, elevation, and distance to river. In the Random Forest (RF) model, the most important factors were, in order, elevation, distance to river, slope, and rainfall.
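As an illustration of the collinearity check, the sketch below computes the tolerance and VIF of each predictor by regressing it on all the others: tolerance is 1 − R² of that regression, and VIF is its reciprocal. The function name `vif_and_tolerance` and the synthetic data are assumptions; the abstract does not state which cut-off values the authors applied, so the commonly quoted rule of thumb (VIF above 10, tolerance below 0.1) is mentioned only as an assumption.

```python
import numpy as np

def vif_and_tolerance(X):
    """Return (vif, tolerance) arrays, one entry per column of X.

    X: 2-D array, rows are sample points and columns are predictors.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    tolerance = np.empty(p)
    for j in range(p):
        y = X[:, j]
        # Regress column j on an intercept plus all other columns.
        A = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        ss_res = resid @ resid
        ss_tot = np.sum((y - y.mean()) ** 2)
        tolerance[j] = ss_res / ss_tot  # equals 1 - R^2
    return 1.0 / tolerance, tolerance

# Synthetic example (hypothetical data): column 2 nearly duplicates column 0.
rng = np.random.default_rng(0)
a = rng.normal(size=200)
b = rng.normal(size=200)
c = a + 0.05 * rng.normal(size=200)
vif, tol = vif_and_tolerance(np.column_stack([a, b, c]))
for name, v, t in zip(["a", "b", "c"], vif, tol):
    flag = "collinear" if v > 10 else "ok"  # assumed rule of thumb
    print(f"{name}: VIF={v:.1f} tolerance={t:.3f} {flag}")
```

In this toy data, columns `a` and `c` are flagged and `b` is not, which mirrors the way a single layer can be singled out for removal before modeling.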

Discussion & Conclusions

Flood hazard potential maps based on the data mining algorithms were prepared in ArcGIS in five classes: very low, low, moderate, high, and very high potential. The final maps show the same spatial distribution pattern of hazard zones: in every map, high elevations and steep slopes have very low potential. The distribution of flood points across the slope classes shows that more than 80% of the floods occurred on slopes of 0-10%, which indicates the influence of this factor on flooding in the region. The area of each flood hazard class in the final maps shows that, in all maps, more than 30% of the basin area falls in the high and very high classes. The accuracy assessment using the ROC curve and the area under the curve showed that the Random Forest model outperformed the other models, with an AUC of 0.94.
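The five-class mapping and the per-class area shares discussed above can be sketched as follows. The abstract does not say how class breaks were chosen in ArcGIS, so the equal-interval breaks, the function names `classify_five` and `class_area_share`, and the random probability grid are all assumptions for illustration. Cells are assumed to have equal area and no-data cells are assumed to have been removed beforehand.

```python
import numpy as np

CLASS_NAMES = ["very low", "low", "moderate", "high", "very high"]

def classify_five(prob, breaks):
    """Map flood probabilities to class indices 0..4.

    breaks: four ascending thresholds separating the five classes.
    """
    return np.digitize(np.asarray(prob, dtype=float), breaks)

def class_area_share(classes, n_classes=5):
    """Percentage of cells falling in each class."""
    counts = np.bincount(np.asarray(classes).ravel(), minlength=n_classes)
    return 100.0 * counts / counts.sum()

# Hypothetical probability raster standing in for a model output.
rng = np.random.default_rng(1)
prob = rng.random((100, 100))
breaks = [0.2, 0.4, 0.6, 0.8]  # assumed equal-interval breaks
share = class_area_share(classify_five(prob, breaks))
for name, pct in zip(CLASS_NAMES, share):
    print(f"{name}: {pct:.1f}%")
print(f"high + very high: {share[3] + share[4]:.1f}%")
```

Summing the last two shares gives the kind of figure quoted above for the high and very high classes; a different break scheme would change the shares.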

Language:
Persian
Published:
Environmental Erosion Researches, Volume:14 Issue: 4, 2024
Pages:
19 to 38
https://www.magiran.com/p2811631  
Authors:
  • Rahimpour, Tohid
    Corresponding Author (1)
    Researcher in Geomorphology, University of Tabriz, Tabriz, Iran
  • Rezaei Moghaddam, Mohammad Hossein
    Author (2)
    Professor of Geomorphology, Dept. of Geomorphology, Faculty of Planning and Environmental Sciences, University of Tabriz, Tabriz, Iran