Multilingual Language Models in Persian NLP Tasks: A Performance Comparison of Fine-Tuning Techniques
This paper evaluates the performance of various fine-tuning methods on Persian natural language processing (NLP) tasks. For low-resource languages such as Persian, which lack rich and sufficient data for training large models, it is crucial to select fine-tuning techniques that mitigate overfitting and prevent the model from learning weak, surface-level patterns. The main goal of this research is to compare the effectiveness of fine-tuning approaches such as full fine-tuning, LoRA, AdaLoRA, and DoRA on model learning and task performance. We apply these techniques to three Persian NLP tasks: sentiment analysis, named entity recognition (NER), and span-based question answering (QA). For this purpose, we conduct experiments on three Transformer-based multilingual models with different architectures and parameter scales: BERT-base multilingual (~168M parameters), an encoder-only model; mT5-small (~300M parameters), an encoder-decoder model; and mGPT (~1.4B parameters), a decoder-only model. Each of these models supports Persian but differs in structure and computational requirements, which influences the effectiveness of the fine-tuning approaches. Results indicate that fully fine-tuned BERT-base multilingual consistently outperforms the other models across all tasks on standard metrics, particularly given the unique challenges of these embedding-based tasks. In addition, lightweight fine-tuning methods such as LoRA and DoRA deliver highly competitive performance while substantially reducing computational overhead, and they lead on the Performance-Efficiency Score introduced in this paper. This study contributes to a better understanding of fine-tuning methods, especially for Persian NLP, and offers practical guidance for applying large language models (LLMs) to downstream tasks in low-resource languages.
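To make the comparison concrete, the sketch below shows how a LoRA-style adapter could be attached to BERT-base multilingual for the Persian sentiment task using the Hugging Face peft library. This is an illustrative assumption rather than the authors' actual training code: the rank, scaling factor, dropout, and target modules are placeholder values, and the same configuration object could be switched to DoRA (via the use_dora flag in recent peft releases) or replaced with AdaLoraConfig for AdaLoRA.

```python
# Minimal sketch (not the paper's code): LoRA adapters on bert-base-multilingual-cased
# for a 3-class Persian sentiment classifier. Hyperparameters are assumptions.
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from peft import LoraConfig, TaskType, get_peft_model

model_name = "bert-base-multilingual-cased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=3)

lora_config = LoraConfig(
    task_type=TaskType.SEQ_CLS,         # sequence-classification head for sentiment
    r=8,                                # adapter rank (assumed, not reported here)
    lora_alpha=16,                      # scaling factor for the low-rank update
    lora_dropout=0.1,
    target_modules=["query", "value"],  # BERT self-attention projection layers
    # use_dora=True,                    # recent peft versions expose DoRA this way
)

peft_model = get_peft_model(model, lora_config)
peft_model.print_trainable_parameters()  # only a small fraction of weights are trainable
```

Training then proceeds as usual (e.g., with the standard Trainer API), with only the adapter weights and the classification head updated, which is what yields the reduced computational overhead discussed above.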