Comparison of the Textual Similarity of Representation Elements (Title, Abstract, and Keyword) of Articles in the Citation Network of References with a Research Proposal

Article Type:
Research/Original Article (accredited rank)
Abstract:
Purpose

This study compares the representative elements (title, abstract, and keywords) of articles in the citation network of proposal references with the elements of the proposals themselves. A second goal is to calculate a weighted average for each representative element (title, abstract, and keywords) in terms of textual similarity.

Methodology

This is an applied, quantitative study that uses citation analysis and content analysis. The sample comprises 3019 articles extracted from the citation networks of 31 graduate students' proposals (M.Sc. and Ph.D.) in Chemistry at Shiraz University. The titles of all English-language articles in the proposals' references were searched in the Web of Science database, and the record of each article, together with the records of all articles in its citation network, was saved in Excel format. All retrieved files were then merged into one file and sorted by citation count to produce a single citation network for each proposal. Because some proposals had extended citation networks of more than a thousand articles, only the 100 most-cited articles of each network were analyzed, to keep the proposals' citation networks uniform and balanced. Next, the textual similarity of the representative elements of these 100 most-cited articles was calculated against the proposal's title, the proposal's text, and the titles of the proposal's references. Textual similarity was measured as cosine similarity, using purpose-built software written in Python.
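The selection and similarity steps described above can be illustrated with a minimal Python sketch. It is not the authors' software: the abstract does not say how texts were tokenized or weighted, so plain term counts are assumed here, and the function names (`tokenize`, `cosine_similarity`, `top_cited`), the record fields, and the sample records are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens (an assumed, simple scheme)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(text_a, text_b):
    """Cosine similarity between two texts represented as raw term-count vectors."""
    vec_a = Counter(tokenize(text_a))
    vec_b = Counter(tokenize(text_b))
    # Dot product over the terms the two texts share.
    dot = sum(vec_a[term] * vec_b[term] for term in vec_a.keys() & vec_b.keys())
    norm_a = math.sqrt(sum(count * count for count in vec_a.values()))
    norm_b = math.sqrt(sum(count * count for count in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text has no direction to compare
    return dot / (norm_a * norm_b)


def top_cited(records, limit=100):
    """Keep the `limit` most-cited records of one citation network."""
    return sorted(records, key=lambda rec: rec["citations"], reverse=True)[:limit]


# Hypothetical records standing in for a merged Web of Science export.
network = [
    {"title": "Palladium catalysts for cross coupling", "citations": 120},
    {"title": "Survey of polymer membranes", "citations": 15},
    {"title": "Cross coupling with nickel catalysts", "citations": 87},
]
proposal_title = "New catalysts for cross coupling reactions"

for record in top_cited(network, limit=2):
    score = cosine_similarity(record["title"], proposal_title)
    print(f"{record['title']}: {score:.2f}")
```

The same `cosine_similarity` call would be repeated for each representative element (title, abstract, keywords) against each proposal element (title, text, reference titles), giving the scores that the Findings compare.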

Findings

The Kruskal-Wallis test showed a significant difference among the articles' representative elements in their textual similarity to the proposals' title, text, and reference titles. In all three cases, the articles' abstracts were most similar to the proposal elements, followed by the titles and then the keywords of the articles in the citation network. In addition, the weighted average of each representative element was calculated: 0.62 for the abstract, 0.5 for the title, and 0.22 for the keywords.
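For readers unfamiliar with the Kruskal-Wallis test, the sketch below computes its H statistic in plain Python for three groups of similarity scores. The scores are invented toy values for illustration only, not data from the study, and the function names are hypothetical. The study's own statistical software is not named in the abstract.

```python
from collections import Counter


def average_ranks(values):
    """Rank values from 1..N, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic with the standard correction for ties."""
    pooled = [value for group in groups for value in group]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total = 0.0
    start = 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        total += rank_sum ** 2 / len(group)
        start += len(group)
    h = 12 / (n * (n + 1)) * total - 3 * (n + 1)
    ties = sum(count ** 3 - count for count in Counter(pooled).values())
    correction = 1 - ties / (n ** 3 - n)
    return h / correction if correction > 0 else 0.0


# Toy similarity scores (invented for illustration, NOT the study's data).
abstract_scores = [0.41, 0.38, 0.52, 0.47]
title_scores = [0.30, 0.27, 0.35, 0.33]
keyword_scores = [0.12, 0.18, 0.09, 0.15]

h_stat = kruskal_wallis_h([abstract_scores, title_scores, keyword_scores])
print(f"H = {h_stat:.3f}")
```

Under the null hypothesis, H is compared against a chi-square distribution with k - 1 degrees of freedom (here k = 3 groups); a large H suggests that at least one element's similarity scores differ from the others.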

Conclusion

Despite the use of different platforms to measure the similarity between retrieved documents and the documents a user actually wants, a gap to the ideal level remains. Until now, no study had used the representative elements of articles in the citation network of proposal references to measure textual similarity with the proposal elements, or evaluated their capability for this purpose. The confirmed textual similarity between these representative elements and the proposals' elements indicates that a student's proposal can serve as a basis for recommending related articles. Hence, designers of scientific recommender systems, scientific information retrieval systems, digital libraries, and scientific social networks such as LinkedIn, Academia, and ResearchGate can use the elements of articles' citation networks to recommend related articles. Treating the articles' representative elements as independent units also matters beyond similarity measurement: it supports keyword expansion and suggesting an appropriate journal to authors for publishing their articles. Given the weights determined for the representative elements, and to increase the efficiency of information systems, designers of such systems are advised to measure similarity using the abstracts and titles of articles rather than the full texts as a whole. This saves time, resources, and energy and yields better results, so that users reach the information they want more easily and quickly. Likewise, when indexing articles in databases and search engines, abstracts and titles can be prioritized to save financial resources and energy.

Language:
Persian
Published:
Scientometric Research Journal, Volume 9, Issue 18, 2024
Pages:
23 to 44
magiran.com/p2682022  