Genomic comparison of Chios and Frizarta dairy sheep breeds for somatic cell count using FST unbiased estimator and hapFLK methods
The principal aim of the dairy sheep industry worldwide is to produce high-quality milk. Mastitis is an inflammatory disease of dairy animals that occurs in response to infectious agents and has a substantial impact on animal health and economic profitability. Somatic cell count has been used as an indirect indicator for controlling mastitis. Genetic resistance to mastitis involves interrelated biological mechanisms, arising from differences in the response to mastitis, that activate and regulate different levels of the immune response. Over the last decade, interest in identifying genes or genomic regions targeted by selection has grown. Identifying selection signatures can provide valuable insight into the genes or genomic regions that are or have been under selection pressure, which in turn leads to a better understanding of genotype-phenotype relationships. This study aimed to identify candidate genes and genomic regions under positive selection in two sheep breeds using selection signature methods. For this purpose, FST and hapFLK analyses were performed using genome-wide single nucleotide polymorphisms (SNPs).
In this study, data from 585 sheep of different breeds were used to identify genomic regions under selection related to somatic cell count in milk. The samples were genotyped with the Illumina Ovine 50K BeadChip. The genotype data for the sheep breeds were obtained from the Zenodo database. Quality control of the genotyped samples was performed using PLINK v1.9. SNPs with a minor allele frequency (MAF) below 0.02 and those with a call rate below 0.97 were excluded. In addition, individuals with more than 10% missing genotype data were removed. SNPs that deviated from Hardy–Weinberg equilibrium (P < 10⁻⁶) were also eliminated. After these quality control steps, 41,673 SNPs from 585 sheep were retained for further analysis. To identify signatures of selection, two statistical methods, FST and hapFLK, were applied using the FST and hapFLK software packages, respectively. Candidate genes were identified from SNPs in the top 0.01 percentile of FST and hapFLK values, using PLINK v1.9 and the Illumina gene list in R. In addition, the latest published version of the animal genome database was used to identify QTLs associated with economically important traits at the detected loci. The GeneCards (http://www.genecards.org) and UniProtKB (http://www.uniprot.org) databases were also used to interpret the function of the identified genes.
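The filtering and outlier-selection logic described above can be illustrated with a minimal Python sketch. It is not the pipeline used in the study, which relied on PLINK v1.9 and dedicated FST and hapFLK packages. Several things here are assumptions made for illustration: genotypes are coded as 0, 1, or 2 copies of one allele with `None` for a missing call; all function names are hypothetical; the Hardy–Weinberg filter is omitted; and per-SNP FST is computed with Hudson's estimator, chosen only for brevity, since the abstract does not state which unbiased estimator was used. The cutoff fraction is left as a parameter rather than fixed.

```python
from math import ceil

MISSING = None  # assumed code for a missing genotype call


def minor_allele_frequency(genotypes):
    """MAF for one SNP from genotypes coded as 0/1/2 allele copies."""
    called = [g for g in genotypes if g is not MISSING]
    if not called:
        return 0.0
    p = sum(called) / (2 * len(called))
    return min(p, 1 - p)


def call_rate(genotypes):
    """Fraction of non-missing calls in a list of genotypes."""
    return sum(g is not MISSING for g in genotypes) / len(genotypes)


def passes_snp_qc(genotypes, min_maf=0.02, min_call_rate=0.97):
    """Keep a SNP only if MAF >= 0.02 and call rate >= 0.97."""
    return (call_rate(genotypes) >= min_call_rate
            and minor_allele_frequency(genotypes) >= min_maf)


def keep_individual(genotypes, max_missing=0.10):
    """Drop an animal when more than 10% of its genotypes are missing."""
    return (1 - call_rate(genotypes)) <= max_missing


def hudson_fst(p1, n1, p2, n2):
    """Per-SNP FST between two populations (Hudson's estimator).

    p1, p2: sample allele frequencies; n1, n2: number of sampled
    allele copies (twice the number of called animals). Values can be
    slightly negative when the populations do not differ.
    """
    num = ((p1 - p2) ** 2
           - p1 * (1 - p1) / (n1 - 1)
           - p2 * (1 - p2) / (n2 - 1))
    den = p1 * (1 - p2) + p2 * (1 - p1)
    # The denominator is zero when both populations are fixed for the same allele.
    return num / den if den > 0 else 0.0


def top_fraction(scores, fraction):
    """Indices of the highest-scoring SNPs, keeping ceil(fraction * n)."""
    n_keep = max(1, ceil(fraction * len(scores)))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:n_keep])
```

In use, each SNP column would be passed through `passes_snp_qc`, each animal's row through `keep_individual`, and the surviving per-SNP FST values through `top_fraction` to obtain candidate outlier SNPs for gene annotation.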
Using the FST approach, nine genomic regions were identified on chromosomes 2, 5, 7, 10, 11, 13 (two regions), 17, and 22. The candidate genes associated with somatic cell count in these regions included IL11RA, CDC16, CARD14, BTRC, OTUD4, COL23A1, LACTB, and PRELID3B. Some of the genes located in the identified selection regions were associated with the immune system, the innate immune response, the inflammatory response, cancer, and milk production, and some were consistent with previous studies. Examination of reported QTLs showed that these regions overlap QTLs for economically important traits, including milk somatic cell count, udder height and depth, clinical mastitis, bovine tuberculosis susceptibility, and heat tolerance. The hapFLK analysis identified six genomic regions on chromosomes 3, 4, 5, 7, 10, and 13. The candidate genes associated with somatic cell count in these regions included FAM49A, CDK6, and DLGAP5. Bioinformatics analysis showed that some of these genomic regions overlapped with known genes related to innate immunity and various cancers.
Several genes located in the aforementioned regions can be considered candidates for selection based on their function. Most of the identified genes were consistent with previous studies and were involved in production traits. However, further investigation is recommended to determine the exact function of the identified genes and QTLs. These regions should also be confirmed by independent studies using larger samples. In general, the data from this study may be used in research on genomic selection and on genomic regions associated with mastitis in dairy sheep, and in further evaluations aimed at improving dairy sheep production.