Predictive Modeling and Spatial Analysis of Cervix Uteri and Breast Cancer in India using Machine Learning and Big Data Frameworks

Article Type:
Research/Original Article (accredited journal ranking)
Abstract:
Background

Cancer remains a critical public health issue in India, with rising cases of breast cancer and cervical cancer. Accurate predictions and spatial analysis of cancer incidence are essential for shaping prevention strategies and targeting interventions in high-risk regions.

Methods

This study used a big data framework with machine learning techniques from the SparkML library to predict cancer cases and analyze their spatial distribution across Indian states from 2016 to 2021. Three models were applied to the dataset: Random Forest Regressor, Gradient Boosting Regressor, and Geographically Weighted Regression (GWR). Spatial autocorrelation was assessed with Moran’s I statistic to identify clustering patterns.
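For readers unfamiliar with the statistic, the following is a minimal sketch of how a global Moran’s I can be computed from regional values and a spatial weight matrix. It is not the study's implementation: the function name `morans_i`, the toy values, and the simple adjacency weights are all assumptions made for illustration, and the abstract does not say how the weights between Indian states were defined.

```python
def morans_i(values, weights):
    """Global Moran's I for a list of values and a square weight matrix.

    I = (n / S0) * sum_ij(w_ij * d_i * d_j) / sum_i(d_i^2),
    where d_i is the deviation of value i from the mean and
    S0 is the sum of all weights.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)
    cross = sum(
        weights[i][j] * dev[i] * dev[j]
        for i in range(n)
        for j in range(n)
    )
    denom = sum(d * d for d in dev)
    return (n / s0) * (cross / denom)


# Hypothetical toy data (not from the study): four regions in a row,
# each one adjacent only to its immediate neighbours.
values = [1.0, 2.0, 3.0, 4.0]
weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
toy_i = morans_i(values, weights)
```

A positive value indicates that similar values tend to sit next to each other (clustering), a value near zero indicates no spatial pattern, and a negative value indicates dispersion.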

Results

The spatial analysis revealed significant clustering of cancer cases, particularly in 2020, with a z-score of 2.23, a p-value of 0.02, and a Moran’s index of 0.15. Among the machine learning models, GWR achieved a predictive accuracy of 98% for both breast cancer and cervical cancer, while the Random Forest Regressor and Gradient Boosting Regressor achieved 95% and 97% accuracy, respectively, over the six-year period. Gradient Boosting outperformed other models in identifying key predictors and ensuring high predictive accuracy.
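As a rough check on how the reported z-score and p-value relate, the sketch below converts a z-score into a two-sided p-value under a normal approximation. This is an assumption: the abstract does not state whether the p-value was one-sided or two-sided, or whether it came from a normal approximation or a permutation test. The helper name `two_sided_p` is hypothetical.

```python
import math


def two_sided_p(z):
    """Two-sided p-value for a z-score under a standard normal null."""
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)) for the standard normal CDF Phi.
    return math.erfc(abs(z) / math.sqrt(2.0))


# z-score reported in the abstract for 2020.
p_2020 = two_sided_p(2.23)
```

Under these assumptions the result is roughly 0.026, which is close to the reported p-value of 0.02 and below the conventional 0.05 threshold.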

Conclusions

The findings highlight the efficacy of Gradient Boosting and GWR in predicting cancer incidence and analyzing spatial patterns. These models provide critical insights into cancer clustering and risk factors, supporting the development of targeted prevention strategies and policy interventions for high-risk regions in India. The results emphasize the utility of machine learning techniques in public health research and cancer control.

Language:
English
Published:
Iranian Journal of Blood and Cancer, Volume 16, Issue 4, Dec 2024
Pages:
20 to 29
https://www.magiran.com/p2831219