Impact of marker density and reference population size on accuracy of imputation in simulated data
Author(s):
Article Type:
Research/Original Article (accredited ranking)
Abstract:
This study assessed the effect of reference population size and the number of missing single nucleotide polymorphisms (SNPs) on imputation accuracy. The QMSim software was used to create a reference database of 1000 simulated animals. Two datasets were created from this reference database: the first (A) included the original genotypes containing the missing SNPs (52,000 SNP markers), and the second (B) included the same genotypes without the missing data (37,000 SNP markers). For both datasets, reference populations of 100, 250, 500, and 750 animals were simulated. In both datasets, SNPs were deleted at random in proportions of 15%, 30%, 55%, 70%, and 95%. Accuracy was measured as the correlation between the original SNP values before deletion and their values after imputation. The results showed that imputation accuracy was influenced by the size of the reference population and the density of the deleted SNP markers. Increasing the reference population size from 100 to 750 animals raised the average imputation accuracy in both datasets. Accuracy was highest with the reference population of 750 animals, ranging from 0.89 to 0.98 in dataset A and from 0.90 to 0.99 in dataset B. Overall, the results show that when the reference population is sufficiently large, imputation accuracy changes little even when a large number of SNPs are missing.
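The accuracy measure described in the abstract (correlation between original and imputed SNP values at randomly deleted markers) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names `mask_snps` and `imputation_accuracy`, the 0/1/2 genotype coding, the use of Pearson correlation pooled over all deleted markers, and the toy matrix sizes are all assumptions made for illustration. The abstract does not name the imputation software, so the "imputed" matrix below is simply the true matrix with a few artificially introduced errors.

```python
import numpy as np

def mask_snps(n_snps, proportion, rng):
    """Randomly choose a given proportion of SNP columns to delete.

    Returns a boolean array of length n_snps; True marks a deleted SNP.
    (Hypothetical helper, not taken from the study.)
    """
    n_masked = int(round(proportion * n_snps))
    masked = np.zeros(n_snps, dtype=bool)
    masked[rng.choice(n_snps, size=n_masked, replace=False)] = True
    return masked

def imputation_accuracy(true_geno, imputed_geno, masked):
    """Pearson correlation between original and imputed genotypes,
    pooled over all deleted SNPs (an assumed reading of the abstract)."""
    t = true_geno[:, masked].ravel().astype(float)
    i = imputed_geno[:, masked].ravel().astype(float)
    return float(np.corrcoef(t, i)[0, 1])

# Toy data: 20 animals x 100 SNPs, genotypes coded 0/1/2 (assumed coding).
rng = np.random.default_rng(0)
true_geno = rng.integers(0, 3, size=(20, 100))
masked = mask_snps(true_geno.shape[1], 0.30, rng)

# Stand-in for an imputation result: copy the truth and corrupt a few
# genotypes at random, since no real imputation program is run here.
imputed_geno = true_geno.copy()
errors = rng.random(true_geno.shape) < 0.05
imputed_geno[errors] = rng.integers(0, 3, size=int(errors.sum()))

accuracy = imputation_accuracy(true_geno, imputed_geno, masked)
print(f"deleted SNPs: {masked.sum()}, accuracy: {accuracy:.3f}")
```

Repeating this calculation for each deletion proportion (15% to 95%) and each reference population size would produce a grid of accuracies of the kind summarized in the abstract, assuming a real imputation step replaces the stand-in above.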
Keywords:
Language:
Persian
Published:
Animal Production Research, Volume: 9, Issue: 2, 2020
Pages:
15 to 22
https://www.magiran.com/p2158980