Optimizing the organization of Persian text documents using clustering technique

Article Type:
Research/Original Article (with accredited ranking)
Abstract:
The present study aimed to design a method for organizing Persian text documents using the clustering technique. The statistical population was a data set of theses and dissertations comprising 2943 research works. Data were collected from a set of data related to scientific research, which included 5,000 research works in Excel format. After the data were converted into a structured format, preprocessing operations were applied. In the processing stage, the clustering technique was used to develop the proposed algorithm for organizing Persian text documents; the algorithm was introduced by improving the K-means algorithm for document clustering. Evaluation based on external criteria showed that the proposed algorithm improved the clustering quality of documents compared with the K-means and K-means++ algorithms: the research works of each designated category were uniformly distributed in the related subject cluster, which achieved the purpose of the present study. In the category/cluster tables obtained from K-means and K-means++, by contrast, research works were distributed non-uniformly across clusters, so the evaluation based on internal criteria was affected by differing cluster densities and inter-cluster similarity. The size of the dataset was also not affected by the proposed solutions for selecting the final dataset and the research process, so the proposed algorithm works well with high-dimensional features.
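As a rough illustration of evaluation with an external criterion, the sketch below builds a category/cluster table and computes purity from it. The abstract does not name the external criteria used, so purity is an assumption here, and the function names and toy labels are hypothetical rather than taken from the study.

```python
from collections import Counter, defaultdict


def category_cluster_table(categories, clusters):
    """Count how many documents of each category fall into each cluster."""
    table = defaultdict(Counter)
    for category, cluster in zip(categories, clusters):
        table[cluster][category] += 1
    return table


def purity(categories, clusters):
    """Share of documents that belong to the majority category of their cluster.

    Purity is one common external criterion; it is assumed here for
    illustration and may differ from the criteria used in the study.
    """
    table = category_cluster_table(categories, clusters)
    majority_total = sum(max(counts.values()) for counts in table.values())
    return majority_total / len(categories)


# Hypothetical toy data: known subject categories and two cluster assignments.
categories = ["law", "law", "law", "medicine", "medicine", "medicine"]
clusters_a = [0, 0, 0, 1, 1, 1]  # each category sits in its own cluster
clusters_b = [0, 0, 1, 1, 1, 1]  # one "law" document lands in the wrong cluster

table_b = category_cluster_table(categories, clusters_b)
purity_a = purity(categories, clusters_a)
purity_b = purity(categories, clusters_b)
```

With these toy labels, the first assignment keeps every category in a single cluster, while the second mixes categories in one cluster and therefore scores lower.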
Language:
Persian
Published:
Journal of Information Processing and Management, Volume:38 Issue: 3, 2023
Pages:
981 to 1010
magiran.com/p2555088  