Multilingual Language Models in Persian NLP Tasks: A Performance Comparison of Fine-Tuning Techniques

Article Type:
Research/Original Article (accredited journal ranking)
Abstract:

This paper evaluates the performance of various fine-tuning methods on Persian natural language processing (NLP) tasks. In low-resource languages such as Persian, which lack rich and sufficient data for training large models, it is crucial to select fine-tuning techniques that mitigate overfitting and prevent the model from learning weak or surface-level patterns. The main goal of this research is to compare the effectiveness of fine-tuning approaches such as Full-Finetune, LoRA, AdaLoRA, and DoRA on model learning and task performance. We apply these techniques to three Persian NLP tasks: sentiment analysis, named entity recognition (NER), and span question answering (QA). For this purpose, we conduct experiments on three Transformer-based multilingual models with different architectures and parameter scales: BERT-base multilingual (~168M parameters) with an encoder-only architecture, mT5-small (~300M parameters) with an encoder-decoder architecture, and mGPT (~1.4B parameters) with a decoder-only architecture. Each of these models supports Persian but differs in structure and computational requirements, which influences the effectiveness of the different fine-tuning approaches. Results indicate that fully fine-tuned BERT-base multilingual consistently outperforms the other models across all tasks on the basic metrics, particularly given the unique challenges of these embedding-based tasks. Additionally, lightweight fine-tuning methods such as LoRA and DoRA offer highly competitive performance while significantly reducing computational overhead, and they outperform the other configurations on the Performance-Efficiency Score introduced in the paper. This study contributes to a better understanding of fine-tuning methods, especially for Persian NLP, and offers practical guidance for applying Large Language Models (LLMs) to downstream tasks in low-resource languages.
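
To make the compared setup concrete, the following is a minimal sketch (not the authors' released code) of how one of the evaluated techniques, LoRA, could be attached to BERT-base multilingual for the Persian sentiment-analysis task using the Hugging Face peft library. The rank, scaling factor, dropout, and number of labels are illustrative assumptions, not values reported in the paper.

    # Minimal LoRA fine-tuning sketch; hyperparameters are assumed, not from the paper.
    from transformers import AutoModelForSequenceClassification, AutoTokenizer
    from peft import LoraConfig, get_peft_model, TaskType

    model_name = "bert-base-multilingual-cased"
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=3)

    # Low-rank adapters are injected into the attention projections; the base
    # weights stay frozen, so only a small fraction of parameters is trained.
    lora_config = LoraConfig(
        task_type=TaskType.SEQ_CLS,
        r=8,                                # assumed adapter rank
        lora_alpha=16,                      # assumed scaling factor
        lora_dropout=0.1,
        target_modules=["query", "value"],  # BERT attention projection layers
    )
    model = get_peft_model(model, lora_config)
    model.print_trainable_parameters()      # shows the reduced trainable-parameter count

The same pattern extends to the NER and span-QA tasks by swapping the task head and TaskType, and to AdaLoRA or DoRA by swapping the adapter configuration class, which is what makes the efficiency comparison in the paper possible.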

Language:
English
Published:
Journal of Artificial Intelligence and Data Mining, Volume 13, Issue 1, Winter 2025
Pages:
107 to 117
https://www.magiran.com/p2844103