Reverse engineering of network applications especially from the security point of view is of high importance and interest. Many network applications use proprietary protocols which specifications are not publicly available. Reverse engineering of such applications could provide us with vital information to understand their embedded unknown protocols. This could facilitate many tasks including deep protocol inspection in next generation firewalls and analysis of suspicious binary codes.
The goal of protocol reverse engineering is to extract the protocol format and the protocol state machine. The protocol format describes the structure of all messages in protocol and the protocol state machine describes the sequence of messages that the protocol accept. Recently, there has been rising interest in automatic protocol reverse engineering. These works are divided into activities that extract protocol format and activities that extract protocol state machine. They can also be divided into those uses as input network traffic and those uses as input program implements the protocol. However, although there are some researches in this field, they mostly focused on extracting syntactic structure of the protocol messages.
In this paper, some new techniques are presented to improve extracting the format (both the syntax and semantics) of protocol messages via reverse engineering of binary codes of network applications. To do the research, an integration of dynamic and static binary code analysis are used. The field extraction approach first detects length fields and separators and then by applying rules based on compiler principles locates all the fields in the messages. The semantic extraction approach is based on the semantic information available in the program implements of the protocol and also information exists in the environment of the program.
For evaluating the proposed approach, four different network applications including DNS, eDonkey, Modbus, and STUN were analyzed. Experimental results show that the proposed techniques not only could extract more complete syntactic structure of messages than similar works, but also it could extract a set of advantageous semantic information about the protocol messages that are not achievable in previous works.
- حق عضویت دریافتی صرف حمایت از نشریات عضو و نگهداری، تکمیل و توسعه مگیران میشود.
- پرداخت حق اشتراک و دانلود مقالات اجازه بازنشر آن در سایر رسانههای چاپی و دیجیتال را به کاربر نمیدهد.