A Malware Classification Method Using visualization and Word Embedding Features
With the explosive growth of threats to Internet security, malware visualization in malware classification has become a promising study area in security and machine learning. This paper proposes a visualization method for malware analysis based on word embedding features of byte sequences.Based on some assistant information such as word embedding, the basic to a strong malware classification approach is to transfer the learned information from the malware domain to the image domain, which needs correlation modeling between these domains. However, most current methods neglect to model the relationships in an embedding way, ensue in low performance of malware classification. To catch this challenge, we consider the Word Embeddings duty as a Semantic Information Extraction. Our Proposed method aims to learn effective representations of malware families, which takes as input a set of embedded vectors corresponding to the malware. Word embedding is designed to generate features of a malware sample by leveraging its malware semantics. Our results show that visual models in the domain of images can be used for efficient malware classification. We evaluated our method on the kaggle dataset of Windows PE file instances, obtaining an average classification accuracy of 0.9896%.
- حق عضویت دریافتی صرف حمایت از نشریات عضو و نگهداری، تکمیل و توسعه مگیران میشود.
- پرداخت حق اشتراک و دانلود مقالات اجازه بازنشر آن در سایر رسانههای چاپی و دیجیتال را به کاربر نمیدهد.