The Role of Ontology and Knowledge Graph in Text Document Classification: A Review of Studies

Author(s):

Saiede Khalilian, Mitra Pashootanizade *, Ali Mansouri, Hamidreza Baradaran Kashani

Article Type:

Review Article (accredited ranking)

Abstract:

Purpose

With the growing use of the internet and the increasing volume of electronically accessible documents on the web, automatic text classification has become a critical method for improving information retrieval and managing digital text collections. Text classification lets people search for and retrieve information more accurately and quickly. Automatic document classification assigns documents to predefined classes, drawing on semantic relationships, so that documents within a class are as similar as possible to one another and as dissimilar as possible from documents in other classes. This study investigates the application of ontologies and knowledge graphs in automatic text document classification.

Method

This study reviewed research and documents on the application of semantic tools, such as ontologies and knowledge graphs, in text document classification. To collect texts, three domestic databases (the "National Journal Database," the "Scientific Information Database of Jihad University," and "Marefate Danesh," that is, "Magiran," "SID," and "Civilica") and three external citation databases ("Web of Science," "Scopus," and "Google Scholar") were searched, with no restriction on publication period.

Findings

The reviewed texts show that the vector space model does not consider the semantic relationships between words and disregards word order in sentences. Neglecting the semantic and syntactic relationships between words in natural language yields a representation of documents that differs from their intended meaning. Ontologies and knowledge graphs, by contrast, help strengthen machine learning models by capturing the meaning of entities and classes. These tools act as an external reference during classification and supply domain knowledge to classification models. In general, they allow machines to interpret the meaning of the data they work with.
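
As a rough illustration of the two limitations named above, the following minimal Python sketch builds bag-of-words vectors (one simple form of the vector space model) and compares them with and without a lookup in a tiny hand-made concept table. The table `TOY_ONTOLOGY`, the helper names, and the example sentences are all hypothetical and are not taken from the reviewed studies; a real ontology or knowledge graph would be far richer.

```python
import math
from collections import Counter

# Hypothetical toy "ontology": maps surface terms to a shared concept label.
# This is an illustrative assumption, not a resource from the reviewed studies.
TOY_ONTOLOGY = {
    "car": "vehicle",
    "automobile": "vehicle",
}


def bag_of_words(text):
    """Term-frequency vector; word order is discarded."""
    return Counter(text.lower().split())


def with_concepts(text):
    """Term-frequency vector after replacing known terms with their concept."""
    return Counter(TOY_ONTOLOGY.get(tok, tok) for tok in text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Word order is lost: both sentences produce the same vector.
same_bag = bag_of_words("dog bites man") == bag_of_words("man bites dog")

# Synonyms are treated as unrelated terms unless external knowledge links them.
doc_a, doc_b = "the car stopped", "the automobile stopped"
plain = cosine(bag_of_words(doc_a), bag_of_words(doc_b))
enriched = cosine(with_concepts(doc_a), with_concepts(doc_b))

print(same_bag, round(plain, 2), round(enriched, 2))
```

In this sketch the concept lookup plays the role of the external reference: the two sentences about a car and an automobile score higher after mapping than before, because both terms resolve to the same concept.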

Conclusion

The application of ontologies and knowledge graphs to the classification of textual documents can strengthen the results of machine learning algorithms by supplying background knowledge. These tools can help disambiguate word meanings in ambiguous sentences and address problems that arise from natural language. Using ontologies and knowledge graphs can help classify textual documents effectively and improve the accuracy and efficiency of classification models. However, constructing and integrating ontologies and knowledge graphs is tedious, time-consuming, and complex, which limits the feasibility and practical application of these tools. For Persian, these general problems are compounded by language-specific writing features and by technical limitations. The use of ontologies and knowledge graphs in text document classification therefore requires attention to linguistic limitations and technical complexity, and further development is needed, especially for Persian.

Keywords:

Automatic Classification, Text Documents, Knowledge Graph, Ontology, Domain Knowledge

Language:

Persian

Published:

Librarianship and Information Organization Studies, Volume 35, Issue 2, 2024

Pages:

167 to 196

https://www.magiran.com/p2778318

Author (3)

Mansouri, Ali

Associate Professor, Knowledge and Information Science, University of Isfahan, Isfahan, Iran

Approved scientific journal

Librarianship and Information Organization Studies

Humanities quarterly

eISSN: 2783-4646

License holder:

National Library and Archives of the Islamic Republic of Iran

Managing director:

Reza Shahrabi Farahani

Editor-in-chief:

Dr. Fahimeh Babalhavaeji

Journal phone: 021-81623282
