Modeling the Vertical Soil Calcium Carbonate Equivalent Variation by Machine Learning Algorithms in Qazvin Plain

Article Type:
Research/Original Article (accredited ranking)
Abstract:
Introduction

Calcium carbonate equivalent (CCE) is one of the key soil properties in arid and semi-arid regions. Studying its spatial variability in surface and subsurface layers is important for the sustainable land management of arable soils. This study aimed to model the spatial distribution of CCE percentage using three machine learning algorithms, Random Forest (RF), Decision Tree regression (DTr), and k-Nearest Neighbor (k-NN), at five standard depths: 0-5, 5-15, 15-30, 30-60, and 60-100 cm.

Material and Methods

The study area covers 60,000 ha and includes the major part of the Qazvin plain, located on the border of Qazvin and Alborz provinces. In the field and laboratory surveys, 278 representative profiles were excavated and described by horizon, and their physicochemical properties were determined. The studied soils are highly diverse in soil moisture regimes (Aridic, Xeric, and Aquic) and temperature regimes (Thermic). These variations have led to the formation of eight great groups of soils in the region according to the USDA soil classification system, of which Haploxerepts, Calcixerepts, and Haplocalcids were dominant. A total of 22 environmental covariates were prepared: 12 primary and secondary terrain attributes derived from a digital elevation model (DEM), six remote sensing (RS) indicators, two climatic parameters, and two soil covariates. The most appropriate covariates were then selected using principal component analysis (PCA) and expert knowledge. The CCE percentage data were randomly divided into two parts, 80% for training and 20% for testing, modeled with the three machine learning algorithms (RF, DTr, and k-NN), and evaluated using the coefficient of determination (R2), root mean square error (RMSE), and Bias.
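The split-and-evaluate workflow described above can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions, not the authors' implementation: the data are synthetic, the function names are hypothetical, R2 is computed as 1 - SS_res/SS_tot, and Bias is taken as the mean of predicted minus observed (the abstract does not state which definitions were used). Only the k-NN regressor is shown because it is the simplest of the three models to write from scratch.

```python
import math
import random


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle rows and split them into (train, test) lists."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def knn_predict(train, query, k=3):
    """Predict a target as the mean of the k nearest training targets.

    Each training row is (features, target); distance is Euclidean.
    """
    nearest = sorted(train, key=lambda row: math.dist(row[0], query))[:k]
    return sum(target for _, target in nearest) / len(nearest)


def evaluate(observed, predicted):
    """Return R2, RMSE and Bias (assumed: mean of predicted minus observed)."""
    n = len(observed)
    mean_obs = sum(observed) / n
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
        "bias": sum(p - o for o, p in zip(observed, predicted)) / n,
    }


# Synthetic stand-in data: two made-up covariates and a made-up CCE-like target.
demo_rng = random.Random(42)
demo_rows = []
for _ in range(100):
    x1, x2 = demo_rng.random(), demo_rng.random()
    demo_rows.append(((x1, x2), 10 + 20 * x1 - 5 * x2 + demo_rng.gauss(0, 1)))

demo_train, demo_test = train_test_split(demo_rows, test_fraction=0.2)
demo_observed = [target for _, target in demo_test]
demo_predicted = [knn_predict(demo_train, feats) for feats, _ in demo_test]
print(evaluate(demo_observed, demo_predicted))
```

In practice one would more likely use an established library for RF and DTr; the point here is only to make the 80/20 split and the three indices concrete.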

Results and Discussion

Harmonizing the CCE values from the genetic horizons to the standard depths showed that the spline depth function was highly efficient, providing acceptable estimates with minimum error and maximum agreement between observed and predicted values. PCA showed that the first five components, which explained more than 80% of the cumulative variance, were the covariates with the highest eigenvalues: Multi-Resolution Index of Valley Bottom Flatness (MrVBF), Mean Annual Temperature (MAT), Greenness index (Greenness), Probability of Calcic horizon (Cal.hr), and Wind Effect. In addition, Clay was selected based on expert knowledge. The relative importance (RI) of the environmental covariates showed that the spatial distribution of CCE was most affected by Clay, which explained more than 57%, 41.8%, and 45% of its variance at the three surface depths of 0-5, 5-15, and 15-30 cm, respectively, whereas Cal.hr had the highest impact among the predictors at the two depths of 30-60 and 60-100 cm, explaining 67.8% and 52.8%, respectively. Hence, using the calcic horizon probability map (Cal.hr) as a derived soil covariate made it possible to produce more appropriate final maps while preventing a loss of modeling accuracy in the subsoil. The remote sensing covariate, Greenness, showed no significant effect on the variation of CCE percentage at any of the studied depths. Unlike the remote sensing indices, and after the soil covariates, the topographic attribute MrVBF at the standard depths of 0-5 and 5-15 cm, MAT at 15-30 cm, and Wind Effect at 30-60 and 60-100 cm were the most effective in explaining the spatial variation of CCE%. The RF algorithm, with R2 values ranging from 0.83 to 0.76 and RMSE from 2.14% to 2.21%, gave the highest accuracy and minimum error. Although the DTr method yielded lower R2 values (0.52-0.39) than RF on the validation dataset, its spatial predictions were generally similar to those of the RF model from the surface to the subsurface and more stable than those of k-NN. In contrast to RF and DTr, k-NN did not perform acceptably in predicting CCE% at any of the standardized depths.
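To make the idea of harmonizing horizon data to the five standard depths concrete, the sketch below resamples horizon values with a thickness-weighted average. This is deliberately a simpler stand-in, not the spline depth function the study used: it assumes each horizon has a constant value, whereas a spline models a smooth change with depth. The function name and the example profile are hypothetical.

```python
# The five standard depth intervals named in the abstract, in cm.
STANDARD_DEPTHS = [(0, 5), (5, 15), (15, 30), (30, 60), (60, 100)]


def resample_to_standard_depths(horizons, intervals=STANDARD_DEPTHS):
    """Thickness-weighted mean of horizon values within each target interval.

    horizons: list of (top_cm, bottom_cm, value), assumed non-overlapping.
    Returns one value per interval, or None where no horizon overlaps it.
    """
    results = []
    for top, bottom in intervals:
        weighted_sum = 0.0
        covered = 0.0
        for h_top, h_bottom, value in horizons:
            # Thickness of this horizon that falls inside the target interval.
            overlap = min(bottom, h_bottom) - max(top, h_top)
            if overlap > 0:
                weighted_sum += overlap * value
                covered += overlap
        results.append(weighted_sum / covered if covered else None)
    return results


# Hypothetical profile: (top cm, bottom cm, CCE %), values invented for illustration.
example_profile = [(0, 20, 10.0), (20, 70, 25.0), (70, 120, 15.0)]
print(resample_to_standard_depths(example_profile))
# prints [10.0, 10.0, 20.0, 25.0, 17.5]
```

The 15-30 cm value (20.0) mixes 5 cm of the first horizon with 10 cm of the second, which shows how an interval that straddles a horizon boundary is handled.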

Conclusion

In general, understanding the spatial distribution of CCE is necessary because of its effect on soil moisture availability and plant nutrient uptake. The present study therefore introduces the RF machine learning algorithm, combined with environmental covariates selected by PCA and expert knowledge, as a superior model. The maps prepared by this approach have an acceptable level of reliability for agricultural and environmental management by managers, soil experts, and farmers.

Language:
Persian
Published:
Journal of Water and Soil, Volume 35, Issue 5, 2021
Pages:
719 to 734
magiran.com/p2369640  