Classification of Persian news text with logistic regression algorithm

Article Type:
Research/Original Article (no recognized ranking)
Abstract:

As the overall volume of data keeps increasing, textual data is growing rapidly as well, and extracting information from it has become a necessity in today's information-driven world. Text classification is one way to obtain information from such large collections. In this study we used a standard Persian news dataset, with five features and more than 86,000 news items, to evaluate the logistic regression algorithm for Persian text classification and to compare it with similar work. Following the steps of building a text classifier, we describe the method used for vectorization and discuss the importance of pre-processing, in particular the method used for labeling and for mapping sub-tags to main tags. In the final evaluation, after tuning the algorithm's parameters and revising the news tags, we reached the desired result of 95% accuracy on the Persian news dataset.
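
The pipeline named in the abstract (mapping sub-tags to main tags, vectorizing the text, then fitting logistic regression) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the tag mapping `SUB_TO_MAIN`, the toy English documents in `DOCS`, the use of TF-IDF as the vectorization method, and the hyperparameters are all hypothetical and are not taken from the paper or its dataset.

```python
import math
from collections import Counter

# Hypothetical sub-tag -> main-tag mapping (not the paper's actual tag set).
SUB_TO_MAIN = {"football": "sport", "volleyball": "sport",
               "stock": "economy", "bank": "economy"}

# Made-up, pre-tokenized documents standing in for Persian news text.
DOCS = [
    ("team goal match league", "football"),
    ("team serve net match", "volleyball"),
    ("market share price index", "stock"),
    ("loan interest rate price", "bank"),
    ("league goal team win", "football"),
    ("market rate loan index", "bank"),
]

def tfidf_vectors(texts):
    """Return (vocab, rows); each row is a dense, L2-normalized TF-IDF vector."""
    tokenized = [t.split() for t in texts]
    vocab = sorted({w for toks in tokenized for w in toks})
    n = len(tokenized)
    df = Counter(w for toks in tokenized for w in set(toks))
    idf = {w: math.log((1 + n) / (1 + df[w])) + 1 for w in vocab}  # smoothed IDF
    rows = []
    for toks in tokenized:
        tf = Counter(toks)
        vec = [tf[w] * idf[w] for w in vocab]
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0
        rows.append([v / norm for v in vec])
    return vocab, rows

def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def class_scores(W, b, x):
    return [sum(w * v for w, v in zip(W[k], x)) + b[k] for k in range(len(W))]

def train(X, y, n_classes, lr=0.5, epochs=200, l2=0.01):
    """Multinomial logistic regression fitted by batch gradient descent."""
    n_feat = len(X[0])
    W = [[0.0] * n_feat for _ in range(n_classes)]
    b = [0.0] * n_classes
    for _ in range(epochs):
        gW = [[0.0] * n_feat for _ in range(n_classes)]
        gb = [0.0] * n_classes
        for x, label in zip(X, y):
            p = softmax(class_scores(W, b, x))
            for k in range(n_classes):
                err = p[k] - (1.0 if k == label else 0.0)
                gb[k] += err
                for j, v in enumerate(x):
                    gW[k][j] += err * v
        for k in range(n_classes):
            b[k] -= lr * gb[k] / len(X)
            for j in range(n_feat):
                W[k][j] -= lr * (gW[k][j] / len(X) + l2 * W[k][j])
    return W, b

def predict(W, b, x):
    scores = class_scores(W, b, x)
    return scores.index(max(scores))

# Pre-processing step: collapse each sub-tag into its main tag before training.
main_labels = [SUB_TO_MAIN[sub] for _, sub in DOCS]
classes = sorted(set(main_labels))
vocab, X = tfidf_vectors([text for text, _ in DOCS])
y = [classes.index(lbl) for lbl in main_labels]

W, b = train(X, y, len(classes))
predictions = [classes[predict(W, b, x)] for x in X]
train_accuracy = sum(p == t for p, t in zip(predictions, main_labels)) / len(DOCS)
print(classes, train_accuracy)
```

The sketch measures accuracy on its own training documents only, so the number it prints says nothing about the 95% figure reported for the real dataset. It is meant only to show why collapsing sub-tags reduces the number of classes the model has to separate.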

Language:
Persian
Published:
Journal of New Achievements in Electrical, Computer and Technology, Volume:3 Issue: 6, 2023
Pages:
24 to 37
https://www.magiran.com/p2576283