Implementation of Experts' Retrieval Model Using Latent Semantic Indexing (LSA) Method and Temporal Graph

Article Type:
Research/Original Article (accredited ranking)
Abstract:
Introduction

Expert retrieval is a branch of information retrieval that aims to rank people by their knowledge of a particular field. Automating expert finding is challenging because of the abundance of expertise information and data sources. Many expert-finding approaches have been proposed in both industry and academia, drawing on techniques from information retrieval, data mining, knowledge discovery, statistical modeling, probabilistic modeling, and complex networks. These approaches estimate the relationship between a query and an expert candidate's supporting documents from the occurrence of query words in those documents. Although this body of work is important, models built on keyword matching cannot capture semantic relationships. This research therefore adopted a document-oriented method that combines the LSA retrieval model with a temporal graph.

Methodology

The research is experimental, supplemented by survey and library methods. Documents were retrieved with the latent semantic analysis (LSA) model, using a test collection built from Web of Science. The collection consists of English-language articles published from 1989 to 2018 and indexed on Web of Science under the category of information science and librarianship. It contains 126,924 articles, and the users' queries were run against all of them. Study participants judged the relevance of the retrieved documents, and the performance of the retrieval model was then measured with standard evaluation measures for information retrieval systems. Each measure was compared with its value for the base model. A temporal graph was used to incorporate the time factor. The authors with the most relevant works and the highest values on the micro-level social network indicators were then identified as experts. Finally, ten queries from the proposed model and from the base model were selected at random and given to eight people drawn from the study's second statistical population for judgment, and the results were compared.
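The evaluation step described above can be sketched in a few lines. The following is a minimal illustration, not the authors' code: it computes precision at k, average precision, and reciprocal rank for each query and averages them over queries. The function names and the `(ranked, relevant)` input layout are assumptions made for this example, and average precision is normalized by the number of documents judged relevant, which is one common convention.

```python
from typing import Dict, Hashable, List, Sequence, Set, Tuple


def precision_at_k(ranked: Sequence[Hashable], relevant: Set[Hashable], k: int) -> float:
    """Fraction of the first k retrieved items that were judged relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    return sum(1 for doc in ranked[:k] if doc in relevant) / k


def average_precision(ranked: Sequence[Hashable], relevant: Set[Hashable]) -> float:
    """Sum of precision at the rank of each relevant hit, divided by the
    number of relevant documents (assumed normalization)."""
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant)


def reciprocal_rank(ranked: Sequence[Hashable], relevant: Set[Hashable]) -> float:
    """1 / rank of the first relevant item, or 0.0 if none was retrieved."""
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            return 1.0 / rank
    return 0.0


def evaluate(runs: List[Tuple[Sequence[Hashable], Set[Hashable]]]) -> Dict[str, float]:
    """Average P@5, AP, and RR over queries; each run is (ranked list, relevant set)."""
    n = len(runs)
    return {
        "P@5": sum(precision_at_k(r, rel, 5) for r, rel in runs) / n,
        "MAP": sum(average_precision(r, rel) for r, rel in runs) / n,
        "MRR": sum(reciprocal_rank(r, rel) for r, rel in runs) / n,
    }
```

Running `evaluate` once on the proposed model's rankings and once on the base model's rankings, with the same relevance judgments, gives the kind of side-by-side comparison the study describes.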

Findings

The first innovation of this research was the use of the latent semantic analysis retrieval model, whose results were ultimately used to retrieve expert authors. On the information retrieval measures, namely precision at the first five results (P@5), mean average precision (MAP), and mean reciprocal rank (MRR), the LSA retrieval model scored 0.895, 0.839, and 0.909, respectively, and outperformed the base model. This improvement is attributed to retrieval by dimensionality reduction performing better than keyword matching. The method uses latent semantic indexing, a form of conceptual indexing that is derived statistically through a least-squares method. A concept can be expressed by many different words (synonymy), so query words may not match the words in a relevant document. In addition, most words have more than one meaning (polysemy), so retrieving information by the concepts and meaning of a document is a better approach. LSI assumes that word usage has an underlying latent structure that is partly obscured by variability in word choice, and singular value decomposition (SVD) is used to estimate this structure. The resulting vectors are statistically more reliable indicators of meaning than individual words. Other studies also indicate that retrieving documents by matching query keywords against document text is a comparatively weak method. The LSA retrieval model also performs better on a large document collection than on a small one.
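To make the synonymy argument concrete, the sketch below applies a truncated SVD to a toy term-document matrix and ranks documents by cosine similarity in the reduced space. It is an illustration under stated assumptions rather than the study's implementation: the four-document corpus, the raw term counts (no tf-idf or other weighting), the choice k=2, and the function names are all hypothetical, and NumPy is assumed to be available.

```python
import numpy as np

# Hypothetical toy corpus: two documents about vehicles, two about libraries.
DOCS = [
    "car engine repair",
    "automobile engine repair",
    "library catalog indexing",
    "library catalog retrieval",
]


def term_document_matrix(docs):
    """Return (term -> row index, matrix of raw counts); columns are documents."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    A = np.zeros((len(vocab), len(docs)))
    for j, d in enumerate(docs):
        for w in d.lower().split():
            A[index[w], j] += 1.0
    return index, A


def lsa_similarities(docs, query, k=2):
    """Cosine similarity between the query and each document in a k-dim latent space."""
    index, A = term_document_matrix(docs)
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    U_k = U[:, :k]
    # Document coordinates: columns of S_k V_k^T, which equal U_k^T A.
    doc_vecs = (s[:k, None] * Vt[:k, :]).T
    # Fold the query into the same space with U_k^T q.
    q = np.zeros(A.shape[0])
    for w in query.lower().split():
        if w in index:
            q[index[w]] += 1.0
    q_vec = U_k.T @ q
    denom = np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q_vec)
    denom = np.where(denom == 0.0, 1.0, denom)  # avoid division by zero
    return doc_vecs @ q_vec / denom


if __name__ == "__main__":
    for doc, sim in zip(DOCS, lsa_similarities(DOCS, "automobile")):
        print(f"{sim:6.3f}  {doc}")
```

In this toy corpus the query "automobile" shares no word with "car engine repair", so keyword matching would miss that document. In the latent space the document scores as high as the one that does contain the query word, because both words co-occur with "engine" and "repair".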
The second innovation was the inclusion of the time factor in expert search. Together with the social network indicators and the final relevance judgment, the results showed that this method performs significantly better than the base model. The time factor was included so that people who are no longer alive, or whose last publication in a given field appeared long ago, are not retrieved. Given the useful life of publications in knowledge and information science, a ten-year window was applied. With publication time as the first determining factor, the number of relevant works an author had published was taken as the next factor, followed by micro-level social network indicators such as degree centrality, betweenness centrality, closeness centrality, and eigenvector centrality, which are widely used in scientometric research and, more recently, in expert retrieval research. The ten queries were sent to the eight people who made up the second statistical population of the research, and their judgments indicated that expert finding performed better when the temporal graph was combined with the number of relevant published works and the micro-level social network indicators.
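The ordering of factors described above (recency first, then number of relevant works, then network indicators) can be sketched as follows. This is a simplified illustration and not the authors' procedure: the input format, the function name `rank_experts`, the author labels, and the use of degree centrality alone (standing in for the full set of indicators) are assumptions. The input is taken to be the publications already retrieved as relevant for one query, and co-authorship edges are counted only inside the time window.

```python
from collections import defaultdict
from itertools import combinations


def rank_experts(publications, current_year, window=10):
    """Rank authors of relevant publications.

    publications: iterable of (list_of_authors, year) for one query.
    Authors whose latest relevant publication is older than the window are dropped.
    Remaining authors are sorted by relevant-work count, then degree centrality.
    """
    cutoff = current_year - window
    last_year = {}
    n_relevant = defaultdict(int)
    neighbours = defaultdict(set)  # co-authorship edges dated inside the window

    for authors, year in publications:
        unique_authors = sorted(set(authors))
        for a in unique_authors:
            last_year[a] = max(last_year.get(a, year), year)
            n_relevant[a] += 1
        if year >= cutoff:
            for a, b in combinations(unique_authors, 2):
                neighbours[a].add(b)
                neighbours[b].add(a)

    active = [a for a, y in last_year.items() if y >= cutoff]
    n_nodes = len(active)

    def degree_centrality(author):
        # Normalized degree in the windowed co-authorship graph.
        if n_nodes < 2:
            return 0.0
        return len(neighbours.get(author, ())) / (n_nodes - 1)

    rows = [(a, n_relevant[a], degree_centrality(a)) for a in active]
    return sorted(rows, key=lambda r: (-r[1], -r[2], r[0]))


# Hypothetical data: authors D and E have not published in the field since 2003.
PUBLICATIONS = [
    (["A", "B"], 2020),
    (["A", "C"], 2018),
    (["A"], 2005),
    (["B"], 2016),
    (["B"], 2012),
    (["D", "E"], 2003),
    (["D"], 2001),
]

if __name__ == "__main__":
    for author, works, centrality in rank_experts(PUBLICATIONS, current_year=2023):
        print(author, works, centrality)
```

Betweenness, closeness, and eigenvector centrality could be added as further sort keys in the same way; they are omitted here only to keep the sketch short.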

Conclusion

LSI assumes that word usage has an underlying latent structure that is partly obscured by variability in word choice, and SVD is used to estimate this structure. The resulting vectors are more reliable indicators of meaning than individual words, whereas matching query keywords against documents is a comparatively weak retrieval method. The LSA retrieval model also performs better on a large document collection than on a small one. Including the time factor in expert search, together with the social network indicators and the final relevance judgment, produced significantly better performance than the base model. The time factor, applied as a ten-year window reflecting the useful life of publications in knowledge and information science, prevents the retrieval of people who are no longer alive or whose last publication in a field appeared long ago. The number of relevant works and the micro-level social network indicators (degree, betweenness, closeness, and eigenvector centrality) served as the subsequent determining factors. Overall, the LSA model outperformed the base model in retrieving relevant documents, and the use of the temporal graph also outperformed the base model.

Language:
Persian
Published:
Library and Information Science Research, Volume:13 Issue: 1, 2023
Pages:
226 to 245
magiran.com/p2593310  