Comparison of the efficiency of data mining methods in predicting type 2 diabetes

Author(s):

Hossein Tireh, Mohammad Taghi Shakeri, Sadegh Rasoulinezhad, Habibollah Esmaily, Razieh Yousefi*

Article Type:

Research/Original Article (accredited ranking)

Abstract:

Background

Diabetes mellitus is a chronic disease, the most common disease caused by metabolic disorders, and one of the most important health issues worldwide. Data mining methods are now applied in many fields of science because of their capabilities. This study therefore compared the efficiency of data mining methods in predicting type 2 diabetes.

Methods

In this cross-sectional study, data from 7,000 participants in the Diabetes Screening Project in Samen, Mashhad City, Iran, were considered in 2016. Among them were 540 untreated diabetic patients. The Samen Project included routine examinations for diabetes patients, such as blood glucose, eye health, nephropathy, and leg health. To keep the two groups balanced, 600 healthy individuals were selected by proportional-to-size sampling, giving a total sample size of 1,140 people. People with diabetes aged over 30 years were enrolled. Participants with a previous history of type 2 diabetes whose blood glucose was normal at the time of the study, because of drug use or other reasons, were excluded.

Results

All three models (logistic regression, simple (naive) Bayesian, and support vector machine) had the same test accuracy (86%). In terms of the area under the receiver operating characteristic (ROC) curve (AUC), however, the logistic regression and simple Bayesian models performed better (AUC = 90% versus AUC = 88% for the support vector machine). In the simple Bayesian and logistic regression models, body mass index (BMI) and age were the most important variables, while BMI and blood pressure were the most important variables in the support vector machine model.
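The gap between equal accuracy and unequal AUC can be illustrated with a minimal sketch in plain Python. It is not the authors' code or data: the labels and scores below are made-up toy values, and the function names and the 0.5 decision threshold are assumptions chosen for illustration. Accuracy, sensitivity, and specificity depend on predictions at a single threshold, whereas AUC depends on how well the scores rank positives above negatives over all thresholds.

```python
# Toy illustration only: labels and scores are invented, not study data.

def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn

def threshold_metrics(y_true, scores, threshold=0.5):
    """Accuracy, sensitivity and specificity at one decision threshold."""
    y_pred = [1 if s >= threshold else 0 for s in scores]
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }

def auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random
    negative (ties count one half); equals the area under the ROC curve."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# 1 = diabetic, 0 = healthy (hypothetical cases)
y_true   = [1, 1, 1, 1, 0, 0, 0, 0]
scores_a = [0.9, 0.8, 0.7, 0.4, 0.60, 0.3, 0.2, 0.10]  # hypothetical model A
scores_b = [0.9, 0.8, 0.7, 0.1, 0.95, 0.3, 0.2, 0.15]  # hypothetical model B

metrics_a = threshold_metrics(y_true, scores_a)
metrics_b = threshold_metrics(y_true, scores_b)
auc_a = auc(y_true, scores_a)
auc_b = auc(y_true, scores_b)

print(metrics_a, auc_a)
print(metrics_b, auc_b)
```

In this toy case both score sets yield identical predictions at the 0.5 threshold, so accuracy, sensitivity, and specificity match, yet model A ranks the cases better and so has the higher AUC. This is the same kind of pattern the abstract reports, where AUC separates models that accuracy alone does not.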

Conclusion

According to the results, all three models had the same accuracy. In terms of the area under the curve (AUC), the logistic regression and simple Bayesian models performed better than the support vector machine model. Overall, the three models performed almost equally well. In all three models, BMI was the most important variable.

Keywords:

data mining, diabetes mellitus, metabolic diseases, sensitivity, specificity

Language:

Persian

Published:

Tehran University Medical Journal, Volume 77, Issue 5, 2019

Pages:

301 to 307

magiran.com/p2024739


Accredited scientific journal

Journal of the Faculty of Medicine, Tehran University of Medical Sciences

Tehran University Medical Journal

Monthly medical journal in Persian and English

ISSN: 1683-1764 eISSN: 1735-7322

License holder:

Tehran University of Medical Sciences

Managing director:

Dr. Seyed Hassan Emami Razavi

Editor-in-chief:

Dr. Nadereh Behtash

Journal phone: 021-42910702


Author (5)

Esmaily, Habibollah

Professor of Biostatistics, Mashhad University of Medical Sciences
