A Benchmark for Analyzing Knowledge Graph Embedding for Link Prediction Problem in Low-Resource Languages
Author(s):
Article Type:
Research/Original Article (accredited ranking)
Abstract:
Link prediction in knowledge graphs is the task of predicting missing entities or relations, typically using knowledge graph embedding techniques. Training these models on low-resource-language knowledge graphs presents challenges that have not been thoroughly addressed in the literature, and there is correspondingly no benchmark for evaluating link prediction methods on such graphs. These knowledge graphs often have unusual topologies that stem from the characteristics of low-resource languages. Many knowledge graphs are derived from encyclopedias such as Wikipedia, which in low-resource languages may contain many statements about common subjects and facts but few about less common ones, and this imbalance carries over to the topology of the extracted knowledge graph. FarsBase, a knowledge graph for the Persian language, exemplifies these properties. Originating from Persian Wikipedia, it has some relation types with a large number of instances and others with very few. This paper introduces "FarsPishBin," a lightly pruned version of FarsBase, as a benchmark for low-resource-language knowledge graph embedding. The authors argue that translational models are likely to outperform other embedding models on this benchmark. To test this hypothesis, popular embedding models are evaluated on FarsPishBin, and the experiments show that translational models, as expected, perform best. The benchmark is intended to serve as a standard platform for future models that address link prediction in low-resource-language knowledge graphs.
Language:
Persian
Published:
Journal of Soft Computing and Information Technology, Volume:13 Issue: 4, 2025
Pages:
38 to 48
https://www.magiran.com/p2843111
Other articles by the author(s)
-
Noor-Vajeh: A Benchmark Dataset for Keyword Extraction from Persian Papers
Mohammadamin Taheri*, Mohammadebrahim Shenassa, Behrouz Minaei-Bidgoli, Sayyed Ali Hossayni
Signal and Data Processing, -
Study on Generative Adversarial Network in Discrete Data: A Survey
Alireza Mohammadi Gohar, Kambiz Rahbar *, Behrouz Minaei-Bidgoli, Ziaeddin Beheshtifard
Journal of Artificial Intelligence and Data Mining, Autumn 2024