Presenting a Thematic Model of Health Scientific Productions Using Text-Mining Methods

Author(s):

Mahboobeh Shokouhian, Asefe Asemi*, Ahmad Shabani, Mozaffar Cheshmesohrabi

Article Type:

Research/Original Article (holds an accredited ranking)

Abstract:

With the proliferation of the Internet and the rapid growth of electronic articles, text categorization has become a key tool for organizing and managing data. In text categorization, the system is given a set of labeled training examples; by learning from this set, it assigns each new input document to one of the subject groups. In the health literature, the wide variety of topics makes preparing such an initial training set a very time-consuming and costly task. The purpose of this article is to present a hybrid learning model (supervised and unsupervised) for the subject classification of health scientific productions that performs the classification without the need for an initial labeled set. To extract the thematic model of health science texts indexed in the PubMed database from 2009 to 2019, data mining and text mining were performed using machine learning. The data were analyzed with the Latent Dirichlet Allocation model, and then a Support Vector Machine was used to classify the texts. The study presents the model in three main steps. In the first step, the dataset was preprocessed to eliminate unnecessary words and thereby increase the accuracy of the proposed model. In the second step, the themes in the texts were extracted using the Latent Dirichlet Allocation method. In the third step, these topics served as the initial training set for the Support Vector Machine algorithm, with which the classifier was trained. Finally, the trained classifier identified the subject of each document. The results showed that the proposed model can build a better classification by combining unsupervised clustering properties and prior knowledge of the samples.
Clustering the labeled samples with a specific similarity criterion merges related texts with prior knowledge; the learning algorithm then learns the classification in a supervised manner. Combining categorization and clustering can increase the accuracy of categorization of health texts.
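
The three steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes scikit-learn is available, uses a small invented corpus in place of the PubMed texts, and picks a linear SVM (`LinearSVC`) and two topics purely for illustration. The function names `label_with_lda` and `train_classifier` are hypothetical.

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.svm import LinearSVC

# Invented toy corpus standing in for PubMed abstracts (assumption).
docs = [
    "diabetes insulin glucose blood sugar patients",
    "insulin therapy glucose control diabetes",
    "blood glucose monitoring diabetes insulin",
    "tumor cancer chemotherapy radiation oncology",
    "cancer tumor growth chemotherapy treatment",
    "oncology radiation therapy tumor cancer",
]


def label_with_lda(texts, n_topics=2, seed=0):
    """Steps 1 and 2: preprocess, then assign each text its dominant LDA topic."""
    # Step 1: lowercasing and English stop-word removal as basic preprocessing.
    vectorizer = CountVectorizer(lowercase=True, stop_words="english")
    counts = vectorizer.fit_transform(texts)
    # Step 2: unsupervised topic extraction; no labeled data is needed.
    lda = LatentDirichletAllocation(n_components=n_topics, random_state=seed)
    doc_topics = lda.fit_transform(counts)  # shape: (n_docs, n_topics)
    # The most probable topic of each document becomes its pseudo-label.
    return doc_topics.argmax(axis=1).tolist()


def train_classifier(texts, labels):
    """Step 3: train a linear SVM on the topic-derived pseudo-labels."""
    tfidf = TfidfVectorizer(lowercase=True, stop_words="english")
    features = tfidf.fit_transform(texts)
    classifier = LinearSVC()
    classifier.fit(features, labels)
    return tfidf, classifier


pseudo_labels = label_with_lda(docs)

# On a tiny corpus LDA may place every document in one topic, and an SVM
# needs at least two classes, so training is guarded here.
if len(set(pseudo_labels)) > 1:
    tfidf, classifier = train_classifier(docs, pseudo_labels)
    new_doc = ["insulin dosage and glucose levels"]
    print(classifier.predict(tfidf.transform(new_doc)))
```

The topic indices produced by LDA are arbitrary, so in practice each topic would still need to be inspected and mapped to a human-readable subject heading before the classifier's output is interpreted.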

Keywords:

Scientific Productions, Text Classification, Health, Text Mining, Latent Dirichlet Allocation Model, Thematic Model, Support Vector Machine, Machine Learning

Language:

Persian

Published:

Journal of Information Processing and Management, Volume 35, Issue 2, 2020

Pages:

553 to 574

https://www.magiran.com/p2103544


Author (1)

Shokouhian, Mahboobeh

Ph.D., Information Science and Knowledge, University of Isfahan, Isfahan, Iran
Author (3)

Shabani, Ahmad

Professor, Information and Library Science, Faculty of Education and Psychology, University of Isfahan, Isfahan, Iran
Author (4)

Cheshmehsohrabi, Mozaffar

Professor, Department of Knowledge and Information Science, University of Isfahan, Isfahan, Iran

Approved scientific journal

Journal of Information Processing and Management (پژوهشنامه پردازش و مدیریت اطلاعات)

Humanities quarterly

ISSN: 2251-8223 eISSN: 2251-8231

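Published until autumn 1384 (Solar Hijri calendar) under the title «علوم اطلاع رسانی» (Information Sciences).

Proprietor:

Iranian Research Institute for Information Science and Technology (IranDoc)

Managing Director:

Dr. Mohammad Hassanzadeh

Editor-in-Chief:

Dr. Seyed Rahmatollah Fattahi

Journal telephone: 021-66494980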
