Multivariate inlier and outlier detection using data mining algorithms. Case study: Geochemical data from the 1:100,000 Roum sheet, South Khorasan
In this paper, four data mining algorithms, namely kernel density estimation, local outlier factor (LOF), OPTICS-OF, and support vector data description (SVDD), are used to detect multivariate outliers. The data are stream sediment geochemical data from the 1:100,000 Roum sheet, arranged as a 902 × 41 matrix. Pre-processing consists of replacing censored values, converting the data set to an open number system, and finally standardizing it. The results show that, in the error-sample detection approach, the 10 samples that have the highest outlier probability and appear equally in the results of all four algorithms can be considered for further study through replicate sampling. In the non-normal sample detection approach, of 150 selected samples, 74.5% are detected as outliers by all four algorithms, while 16.1% and 9.4% are recognized as outliers by one and two of the algorithms, respectively. Suggested applications of these algorithms are selecting samples for replicate sampling, calculating location and scatter matrices in robust multivariate statistics after eliminating non-normal samples, and detecting geochemical anomalies.
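The workflow summarized above (pre-process, score each sample with several detectors, then count how many detectors agree) can be illustrated with a minimal sketch. It is not the authors' implementation. Several choices are assumptions made only for illustration: censored values are replaced by 0.75 times a hypothetical detection limit, the centred log-ratio transform is used to open the data, only two of the four detectors are shown (a Gaussian kernel density score and a simplified LOF), and the data are synthetic rather than the Roum sheet measurements. All function names are hypothetical.

```python
import math
import random

def replace_censored(value, detection_limit, factor=0.75):
    # Assumption: censored values are encoded as None and replaced by
    # factor * detection limit; the paper does not state its rule.
    return factor * detection_limit if value is None else value

def clr(row):
    # Centred log-ratio transform (an assumed choice of "opening" transform).
    logs = [math.log(v) for v in row]
    mean_log = sum(logs) / len(logs)
    return [x - mean_log for x in logs]

def standardize(rows):
    # Column-wise z-scores; a zero standard deviation falls back to 1.
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(r, means, stds)] for r in rows]

def kde_scores(rows, bandwidth=1.0):
    # Negative leave-one-out Gaussian kernel density (unnormalised):
    # a higher score means a sparser neighbourhood.
    n = len(rows)
    scores = []
    for i, a in enumerate(rows):
        dens = sum(math.exp(-math.dist(a, b) ** 2 / (2 * bandwidth ** 2))
                   for j, b in enumerate(rows) if j != i) / (n - 1)
        scores.append(-dens)
    return scores

def lof_scores(rows, k=5):
    # Simplified local outlier factor using exactly k neighbours (ties ignored).
    n = len(rows)
    d = [[math.dist(a, b) for b in rows] for a in rows]
    neigh = [sorted((j for j in range(n) if j != i), key=lambda j: d[i][j])[:k]
             for i in range(n)]
    kdist = [d[i][neigh[i][-1]] for i in range(n)]
    lrd = []
    for i in range(n):
        # Mean reachability distance from sample i to its neighbours.
        reach = sum(max(kdist[j], d[i][j]) for j in neigh[i]) / k
        lrd.append(1.0 / max(reach, 1e-12))
    return [sum(lrd[j] for j in neigh[i]) / (k * lrd[i]) for i in range(n)]

def vote_counts(score_lists, m):
    # For each detector, take the m highest-scoring samples and count
    # how many detectors flagged each sample.
    counts = {}
    for scores in score_lists:
        top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:m]
        for i in top:
            counts[i] = counts.get(i, 0) + 1
    return counts

# Synthetic stand-in data: 60 samples x 4 positive variables.
random.seed(0)
raw = [[math.exp(random.gauss(0.0, 0.2)) for _ in range(4)] for _ in range(60)]
raw[3][0] = None                    # one censored measurement
raw[7] = [50.0, 0.02, 1.0, 1.0]     # one planted gross outlier
DETECTION_LIMIT = 0.9               # hypothetical detection limit

filled = [[replace_censored(v, DETECTION_LIMIT) for v in row] for row in raw]
opened = [clr(row) for row in filled]
z = standardize(opened)
votes = vote_counts([kde_scores(z), lof_scores(z)], m=3)
print(sorted(votes.items(), key=lambda kv: -kv[1]))
```

Counting votes across detectors mirrors the agreement figures reported in the abstract: samples flagged by every detector are the strongest candidates, while samples flagged by only one or two are weaker ones. Extending the sketch to OPTICS-OF and SVDD would add two more score lists to `vote_counts`.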