Application of Genetic Expression Programming (GEP) to produce Intelligence Committee Fuzzy Logic model (CFLM) to predict arsenic concentration in water resources of the Sahand dam basin

Article Type:
Research/Original Article (accredited ranking)
Abstract:
Identifying and monitoring the quality of water resources in a basin is of particular importance for managing the quality of a dam reservoir. Today, most natural waters are polluted, so monitoring the distribution of pollutants in surface waters can help control and reduce water pollution and its effects. Such information can be obtained only through various analyses and pollution monitoring stations distributed across the study area. Arsenic is considered one of the most important pollutants because of its high toxicity. Pollution of natural waters caused by geological sources cannot easily be eliminated or prevented from spreading; therefore, it should be evaluated carefully. Various reports in recent years have indicated an arsenic anomaly, with concentrations above the international standard (0.01 mg/L), in the water resources of the Sahand Dam basin, which supplies the agricultural, industrial and drinking water demands of the area. Hence, the Geology Department of Tabriz University and the East Azerbaijan Regional Water Authority have carried out sampling and chemical analysis of the surface water and groundwater resources. Groundwater models may be used for optimization by one parameter or a combination of optimizations, and for the simulation of pollution and its management. Previous research showed a lack of adequate linear geostatistical models for predicting the total arsenic (III, V) concentration in the study area, so artificial intelligence models such as gene expression programming (GEP) and fuzzy logic (FL) were used. These models are inspired by nature and can estimate the parameters of natural phenomena with significant accuracy compared with other methods. The hydrochemical parameters that have the highest correlation with arsenic were used, with 60 data for the training stage and 20 data for the testing stage.
These parameters, including pH, SO₄²⁻, NO₃⁻, F, Fe and As, were used as input parameters for the Mamdani fuzzy logic (MFL), Larsen fuzzy logic (LSL) and Sugeno fuzzy logic (SFL) models to estimate the total arsenic concentration. A fuzzy system has three main steps: 1) fuzzification of the data by defining membership functions; 2) linking inputs and outputs through if-then rules; and 3) aggregation of the system results and defuzzification using fuzzy operators such as or, and, and not. Each of the fuzzy models has its own advantages and uncertainties, and the individual benefits of each can be exploited. As the results of the three fuzzy models are similar, the gene expression programming model was used to produce a committee fuzzy logic model (CFLM). This approach is based on the idea that combining the results of several models achieves a better overall result. To date, several studies using different artificial intelligence methods have demonstrated the strength of GEP. Like the genetic algorithm (GA) and genetic programming (GP), GEP uses a population of individuals, selects them according to fitness, and applies genetic changes to them using one or more genetic operators. The search starts from a randomly generated series of trees and leads to the production of an expression tree. This process continues until the maximum number of iterations or a specified error value is reached.
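The three fuzzy steps described above can be illustrated with a minimal Python sketch. It is not the study's model: the single input (pH), the two rules, the membership-function centres and spreads, the max aggregation and the centroid defuzzification are all assumptions chosen for illustration. The one standard point it shows is the difference between the two implication methods: Mamdani clips each output set with `min`, while Larsen scales it with a product.

```python
import math

def gauss(x, centre, sigma):
    """Gaussian membership function centred at `centre` with spread `sigma`."""
    return math.exp(-0.5 * ((x - centre) / sigma) ** 2)

# Hypothetical output universe for arsenic concentration, 0 to 0.5 mg/L.
Y_GRID = [i * 0.001 for i in range(501)]

# Two hypothetical rules, each as ((input centre, sigma), (output centre, sigma)):
#   IF pH is low  THEN As is low
#   IF pH is high THEN As is high
RULES = [((6.5, 0.5), (0.05, 0.03)),
         ((8.0, 0.5), (0.30, 0.05))]

def infer(ph, implication="mamdani"):
    """Return a crisp arsenic estimate for one pH value."""
    if implication == "mamdani":
        shape = min                        # clip the output set at the firing strength
    elif implication == "larsen":
        shape = lambda w, mu: w * mu       # scale the output set by the firing strength
    else:
        raise ValueError("implication must be 'mamdani' or 'larsen'")

    aggregated = [0.0] * len(Y_GRID)
    for (in_c, in_s), (out_c, out_s) in RULES:
        strength = gauss(ph, in_c, in_s)   # step 1: fuzzification
        for i, y in enumerate(Y_GRID):     # step 2: apply the if-then rule
            shaped = shape(strength, gauss(y, out_c, out_s))
            aggregated[i] = max(aggregated[i], shaped)  # step 3a: max aggregation

    # Step 3b: centroid defuzzification (an assumed choice).
    total = sum(aggregated)
    return sum(y * m for y, m in zip(Y_GRID, aggregated)) / total
```

Calling `infer(7.2, "mamdani")` and `infer(7.2, "larsen")` on the same input shows how the two implication methods can produce slightly different crisp estimates from identical rules, which is the kind of model-to-model variation a committee model tries to exploit.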
The fuzzy model was constructed using an optimal radius of 0.4, selected on the basis of the lowest RMSE. The data were divided into 8 categories, and 7 if-then rules were determined. The fuzzy membership functions used for modeling the arsenic values were Gaussian functions fitted to the classified data. The output membership function of the Sugeno model was linear and was built from the inputs. The FCM clustering method was used in the Mamdani and Larsen models. For these models, an optimal number of 12 categories was determined on the basis of minimum RMSE values of 0.11 and 0.12 mg/L, respectively, with Gaussian input and output membership functions. The R² values for the training stage of the Mamdani and Larsen models were 0.94 and 0.91, respectively. All three fuzzy models gave acceptable results, but the Mamdani model performed somewhat better than the other two. Because each of these models has its own strengths, the committee model was used to exploit the advantages of all three simultaneously. The outputs of the three fuzzy models were used as input data for the GEP model, and the data were selected so that the minimum and maximum values were included in the testing stage. The initial population was generated with 20 chromosomes, a head size of 7, 3 genes, and 2 constants per gene. A mathematical operator was selected as the linking function between subtrees. To compare results, three function sets were tested as the main operators. The F3 function set, which includes the default operators, gave the best fit and was selected as the main function set. By providing an explicit relationship between input and output, and more accurate results in the training and testing stages (R² of 0.97 and 0.92, respectively), the GEP model was evaluated as the most appropriate model for estimating arsenic values in the region.
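To make the chromosome settings above concrete, the following Python sketch shows how a GEP gene with a head size of 7 could be decoded and evaluated. It rests on several assumptions that are not in the text: the function set is limited to the binary operators `+`, `-` and `*`, the terminals `M`, `L` and `S` are hypothetical symbols for the Mamdani, Larsen and Sugeno outputs, the per-gene constants are omitted, and the linking function is passed in by the caller because the text does not name it. With binary functions, the standard GEP tail-length rule t = h(n - 1) + 1 gives a tail of 8 and a gene length of 15.

```python
import operator
from functools import reduce

# Assumed function set: symbol -> (callable, arity). Not the study's F3 set.
FUNCTIONS = {"+": (operator.add, 2),
             "-": (operator.sub, 2),
             "*": (operator.mul, 2)}

HEAD_SIZE = 7                                   # from the text
MAX_ARITY = 2                                   # assumption: binary operators only
TAIL_SIZE = HEAD_SIZE * (MAX_ARITY - 1) + 1     # standard GEP rule
GENE_LENGTH = HEAD_SIZE + TAIL_SIZE

def evaluate_gene(gene, terminals):
    """Decode one Karva-notation gene breadth-first and evaluate it."""
    if len(gene) != GENE_LENGTH:
        raise ValueError("gene must have exactly GENE_LENGTH symbols")

    # Walk the gene left to right, assigning child positions level by level.
    children = []
    next_free = 1          # position of the next unassigned symbol
    i = 0
    while i < next_free:
        arity = FUNCTIONS[gene[i]][1] if gene[i] in FUNCTIONS else 0
        children.append(list(range(next_free, next_free + arity)))
        next_free += arity
        i += 1

    # Children always sit to the right of their parent, so evaluate backwards.
    values = [None] * next_free
    for i in range(next_free - 1, -1, -1):
        symbol = gene[i]
        if symbol in FUNCTIONS:
            func = FUNCTIONS[symbol][0]
            values[i] = func(*(values[k] for k in children[i]))
        else:
            values[i] = terminals[symbol]
    return values[0]

def evaluate_chromosome(genes, terminals, link):
    """Evaluate each gene and combine the sub-results with the linking function."""
    return reduce(link, (evaluate_gene(g, terminals) for g in genes))
```

For example, the gene `"+*SMLMSMLSMLSML"` decodes to the expression `(M * L) + S`; the symbols after the fifth position are unused padding that mutation and recombination may later bring into the expression. A three-gene chromosome would be evaluated with `evaluate_chromosome(genes, terminals, link)`, where `link` is whichever operator is chosen as the linking function.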
In this study, GEP, with its practical features and its production of gene expression trees, made it possible to evaluate complicated, non-linear models. The model also provided explicit, highly accurate solutions from which the relationship between input and output variables can be determined. Given the suitability and similarity of the results of the three fuzzy models, gene expression programming was used to build a committee model from the results of the three individual models. Owing to the benefits of gene expression programming, the resulting committee model is more accurate than the three individual fuzzy models. Because spatial statistical models do not adequately estimate arsenic in the study area, the proposed model can be appropriate for determining it accurately.
Language:
Persian
Published:
Iranian Water Research Journal, Volume: 11, Issue: 27, 2018
Page:
85
magiran.com/p1859295  