Extractive Automatic Text Summarization using integrated set of algorithms and Sa-TRB method

Article Type:
Research/Original Article (accredited ranking)
Abstract:

Extractive text summarization is an essential technique in natural language processing that produces compact versions of a text by extracting its most important sentences. Because shortening and summarizing a document by hand is time-consuming and tiring, an automatic system for producing these short versions is needed. In extractive summarization, sentences that contain useful and relevant information are selected for the final summary. Several algorithms exist for identifying these sentences, and their performance and the summaries they produce vary with the type and domain of the text and the required summary size. This article presents a method called Sa-TRB, which is derived from two algorithms, TextRank and BERT. In addition to using these two methods, it also uses the sentences that other algorithms select in common, in order to achieve high accuracy in selecting the final summary sentences. The most important criterion for evaluating these algorithms is the quality of their final summary: the more closely a generated summary resembles a human-written one, the higher its quality. ROUGE metrics are used to measure this similarity. Finally, experiments on the cnn-dailymail dataset with different summary sizes show that, as the required summary size increases, the proposed method achieves higher precision and score, and consequently higher-quality final summaries, despite a decrease in recall. In the last two tests, the score of the proposed method reached 24.68% and 23.34%, which is almost one percent better than the best of the other tested methods.
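
The abstract reports ROUGE recall, precision, and score. As a rough illustration of how such values are obtained, the sketch below computes a simplified ROUGE-N from n-gram overlap between a generated summary and a human reference. It is not the evaluation code used in the article: the function name `rouge_n` is hypothetical, tokenization is plain lowercased whitespace splitting, and no stemming or stop-word handling is applied, so its numbers will not match an official ROUGE toolkit exactly.

```python
from collections import Counter


def rouge_n(candidate: str, reference: str, n: int = 1) -> dict:
    """Simplified ROUGE-N: clipped n-gram overlap between two texts.

    Assumption: whitespace tokenization and lowercasing only.
    """

    def ngrams(text: str) -> Counter:
        tokens = text.lower().split()
        return Counter(
            tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)
        )

    cand = ngrams(candidate)
    ref = ngrams(reference)

    # Counter intersection keeps the minimum count of each shared n-gram,
    # so a repeated n-gram is not credited more often than it occurs
    # in the other text.
    overlap = sum((cand & ref).values())

    # Precision: share of the candidate's n-grams found in the reference.
    precision = overlap / sum(cand.values()) if cand else 0.0
    # Recall: share of the reference's n-grams found in the candidate.
    recall = overlap / sum(ref.values()) if ref else 0.0
    # F-score: harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0

    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    scores = rouge_n("the cat sat on the mat", "the cat lay on the mat")
    print(scores)
```

Under this definition, recall is normalized by the reference length and precision by the candidate length, which is why the two can move in opposite directions when the requested summary size changes.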

Language:
Persian
Published:
Journal of Applied and Basic Machine Intelligence Research, Volume:1 Issue: 2, 2023
Pages:
145 to 159
https://www.magiran.com/p2661112  
Authors:
  • Feizi Derakhshi, Mohammad Reza
    Author (2)
    Professor, Computer Engineering, University of Tabriz, Tabriz, Iran