Ensemble Bayesian Classification Using Genetic Algorithm Wrapper Feature Selection in Spam Detection

Message:
Article Type:
Research/Original Article (دارای رتبه معتبر)
Abstract:
The role of email in communication is seriously threatened by a phenomenon called spam. So far, many methods have been proposed to deal with this phenomenon, one of the most important of which is to classify emails based on their content into two categories: spam and non-spam. Content-based classification mechanisms use the words as features, where applying an efficient feature selection mechanism is critical due to the large number of features. Therefore, the main focus of this paper is to select useful features via proposing a wrapper feature selection approach based on a powerful genetic algorithm. We then apply a Bayesian classifier, which has demonstrated a high efficiency in text classification. The main steps of the proposed method is as follows: first, an initial feature vector is chosen, then it is optimized by multiplying the vector in a matrix called the transformation matrix made by the genetic algorithm, and finally, a set of k feature vectors is generated. An ensemble classification approach composed of k Bayesian classifiers is applied to the feature vectors, and the ultimate class label is determined by voting among ensemble members. The proposed method is implemented on two datasets PU1 and PU2. The results show that the classification accuracy of the proposed method with k=7 reaches 87.86 and 87.91 in PU1 and PU2, rspectively. The results also indicate the efficiency of the proposed method compared to naïve Bayes and two well-known classifiers SVM and KNN.
Language:
Persian
Published:
Information management, Volume:6 Issue: 2, 2021
Pages:
250 to 277
magiran.com/p2332917  
دانلود و مطالعه متن این مقاله با یکی از روشهای زیر امکان پذیر است:
اشتراک شخصی
با عضویت و پرداخت آنلاین حق اشتراک یک‌ساله به مبلغ 1,390,000ريال می‌توانید 70 عنوان مطلب دانلود کنید!
اشتراک سازمانی
به کتابخانه دانشگاه یا محل کار خود پیشنهاد کنید تا اشتراک سازمانی این پایگاه را برای دسترسی نامحدود همه کاربران به متن مطالب تهیه نمایند!
توجه!
  • حق عضویت دریافتی صرف حمایت از نشریات عضو و نگهداری، تکمیل و توسعه مگیران می‌شود.
  • پرداخت حق اشتراک و دانلود مقالات اجازه بازنشر آن در سایر رسانه‌های چاپی و دیجیتال را به کاربر نمی‌دهد.
In order to view content subscription is required

Personal subscription
Subscribe magiran.com for 70 € euros via PayPal and download 70 articles during a year.
Organization subscription
Please contact us to subscribe your university or library for unlimited access!