logistic regression
in journals of the medical sciences group
Background and aims
COVID-19 remains a global health challenge, and vaccination is crucial for reducing severe cases. This study evaluated the effectiveness of a two-dose COVID-19 vaccination regimen in lowering hospitalization rates, using advanced statistical techniques to control for confounding variables in observational data.
Methods
A retrospective cohort study was conducted among individuals tested for COVID-19 at Mashhad University of Medical Sciences from March 21, 2021, to March 20, 2022. The study population comprised all individuals who underwent polymerase chain reaction testing for COVID-19 during this period. A census sampling method was employed, resulting in a final sample size of 306,630 individuals. Participants were classified as “vaccinated” if they had received both doses and “unvaccinated” if they had received none. Hospitalization was defined as a COVID-19-related admission occurring at least two weeks post-vaccination. Data were collected from three databases: the Sina Health Information System, the Healthcare Services Monitoring System, and the Hospital Information System. To create comparable groups, propensity score (PS) matching and weighting were applied, and a logistic regression model was used to estimate the average treatment effect (ATE) of vaccination on hospitalization.
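As a rough illustration of PS weighting, the sketch below builds inverse-probability weights for the ATE, truncates extreme weights, and computes a weighted odds ratio from the resulting 2x2 table (which is what a weighted logistic regression with treatment as its only covariate would return). The toy data, the truncation bounds, and all function names are assumptions made for illustration; the abstract does not describe how the propensity scores were estimated or how extreme weights were handled.

```python
def ate_weights(treated, propensity):
    """Inverse-probability weights for the ATE: 1/e if treated, 1/(1-e) otherwise."""
    return [1.0 / e if t == 1 else 1.0 / (1.0 - e)
            for t, e in zip(treated, propensity)]


def truncate(weights, lower, upper):
    """Clip weights to [lower, upper] to limit the influence of extreme values."""
    return [min(max(w, lower), upper) for w in weights]


def weighted_odds_ratio(treated, outcome, weights):
    """Odds ratio of the outcome (treated vs. untreated) from weighted cell totals."""
    cells = {(t, y): 0.0 for t in (0, 1) for y in (0, 1)}
    for t, y, w in zip(treated, outcome, weights):
        cells[(t, y)] += w
    return (cells[(1, 1)] * cells[(0, 0)]) / (cells[(1, 0)] * cells[(0, 1)])


# Hypothetical toy data: 1 = vaccinated / hospitalized, 0 = not.
treated = [1, 1, 1, 1, 0, 0, 0, 0]
outcome = [0, 0, 0, 1, 1, 1, 0, 1]
propensity = [0.9, 0.8, 0.6, 0.5, 0.5, 0.4, 0.2, 0.05]  # assumed, not estimated here

weights = ate_weights(treated, propensity)
example_or = weighted_odds_ratio(treated, outcome, weights)
trimmed_or = weighted_odds_ratio(treated, outcome, truncate(weights, 1.0, 10.0))
print(example_or, trimmed_or)
```

In practice the propensity scores would come from a fitted model of vaccination status on the measured confounders, and confidence intervals would need a robust or bootstrap variance estimate.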
Results
Among the 306,630 patients included in the study, 104,115 (33.95%) were unvaccinated and 202,515 (66.05%) were vaccinated. Overall, 29,458 patients (9.61%) were hospitalized, comprising 28,244 unvaccinated and 1,214 vaccinated individuals. Vaccinated individuals had significantly lower odds of hospitalization. The adjusted odds ratio for hospitalization was 0.72 (95% confidence interval [CI]: 0.68–0.76) with PS weighting, 0.32 (95% CI: 0.30–0.34) with matching, and 0.34 (95% CI: 0.33–0.35) after adjusting for extreme weights.
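For comparison with the adjusted estimates, the crude (unadjusted) odds ratio can be computed directly from the counts reported above. The sketch uses the standard 2x2 odds ratio with a Woolf (log-scale) confidence interval; the function name is hypothetical and this calculation is not part of the original analysis.

```python
import math


def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """a, b: events / non-events among exposed; c, d: events / non-events among unexposed."""
    estimate = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # Woolf standard error of log(OR)
    lower = math.exp(math.log(estimate) - z * se_log)
    upper = math.exp(math.log(estimate) + z * se_log)
    return estimate, lower, upper


# Counts taken from the Results paragraph.
vaccinated_total, vaccinated_hospitalized = 202515, 1214
unvaccinated_total, unvaccinated_hospitalized = 104115, 28244

crude = odds_ratio_with_ci(
    vaccinated_hospitalized,
    vaccinated_total - vaccinated_hospitalized,
    unvaccinated_hospitalized,
    unvaccinated_total - unvaccinated_hospitalized,
)
print("crude OR and 95%% CI: %.4f (%.4f, %.4f)" % crude)
```

The crude odds ratio comes out at roughly 0.016, far below the adjusted values of 0.32 to 0.72, which gives a sense of how much of the raw contrast is removed once the groups are made comparable.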
Conclusion
The findings underscore the protective effects associated with a two-dose COVID-19 vaccination regimen and emphasize the importance of robust statistical methods in evaluating real-world data.
Keywords: Propensity Score Matching, Propensity Score Weighting, Causal Effect, Observational Study, Logistic Regression -
Objective
To estimate the effectiveness of two-dose COVID-19 vaccination in reducing hospitalization, accounting for complex confounding factors in observational studies.
Methods
Propensity score methods were applied to adjust for confounding variables, and their performance was compared with that of traditional covariate adjustment. Multiple logistic regression and propensity score matching were used to analyze the data, ensuring a balanced comparison between vaccinated and unvaccinated groups.
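One common way to implement propensity score matching is greedy 1:1 nearest-neighbour matching without replacement, within a caliper. The sketch below assumes the propensity scores are already available; the caliper of 0.05 and the function name are illustrative choices, and the abstract does not state which matching algorithm the authors used.

```python
def greedy_match(treated_ps, control_ps, caliper=0.05):
    """Pair each treated unit with the closest unused control within the caliper.

    Returns a list of (treated_index, control_index) pairs.
    """
    available = dict(enumerate(control_ps))  # controls not yet matched
    pairs = []
    for i, score in enumerate(treated_ps):
        if not available:
            break
        j = min(available, key=lambda k: abs(available[k] - score))
        if abs(available[j] - score) <= caliper:
            pairs.append((i, j))
            del available[j]  # matching without replacement
    return pairs


# Hypothetical propensity scores for three vaccinated and three unvaccinated people.
pairs = greedy_match([0.30, 0.70, 0.95], [0.32, 0.68, 0.10])
print(pairs)
```

Here the third treated unit stays unmatched because no control lies within the caliper; the outcome model is then fitted on the matched pairs only.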
Results
Both analytical methods showed a significant reduction in the likelihood of hospitalization among vaccinated individuals. The adjusted odds ratios (OR) were 0.29 (95% CI: 0.26, 0.31) with logistic regression and 0.32 (95% CI: 0.30, 0.34) with propensity score matching.
Conclusions
The study confirms the effectiveness of two-dose COVID-19 vaccination in decreasing hospitalization. It highlights the importance of careful approaches such as propensity score methods for assessing real-world impact in complex observational data.
Keywords: Propensity Score Matching, Causal Effect, Observational Study, Logistic Regression -
Introduction
Heart disease is one of the leading causes of death, and deaths from cardiovascular diseases are projected to rise to 23.3 million by 2030. Heart failure and cardiogenic shock account for a large share of these deaths and, as medical emergencies, require timely treatment. The aim of this study was the rapid prediction of death in heart failure patients with cardiogenic shock using a smaller number of features.
Methods
This analytical cross-sectional study used census sampling. Data from 201 cardiac patients over 18 years of age who developed cardiogenic shock in 2020 at Rouhani Hospital in Babol were examined. Thirty-four features, such as age, history of open-heart surgery, pH, lactate, diabetes, and blood pressure, were used, and one-month mortality was ascertained by telephone contact. Logistic regression and the GBM algorithm were used to predict death.
Results
The mean age of the patients was 69.44 ± 15.71 years. Of these, 47.7% died. Four features (age, lactate, diabetes, and confusion) were selected as the most important. Each one-year increase in age raises the probability of death by 7%. The probability of death in diabetic individuals is more than double. Confusion increases the risk of death 4-fold, and an increase in lactate increases it 1.5-fold.
Conclusion
The results showed that selecting effective features for predicting death in heart failure patients with cardiogenic shock is feasible with logistic regression and the GBM algorithm, and that it can help improve referral planning and reduce medical costs.
Keywords: Heart Failure, Cardiogenic Shock, Death Prediction, Logistic Regression, Feature Selection
Introduction
Cardiovascular diseases remain a leading global cause of mortality, with ischemic heart disease projected to account for 23.3 million deaths by 2030. Heart failure and cardiogenic shock account for a significant proportion of these deaths and require timely treatment as medical emergencies. This study aims to predict mortality within one month in patients experiencing cardiogenic shock secondary to heart failure using a concise set of predictive features.
Method
An analytical cross-sectional study was conducted at Babol Razi Hospital, involving 201 adult patients (≥18 years) treated for cardiogenic shock in 2020. Data on 34 clinical variables, including age, history of cardiac surgery, pH, lactate concentration, diabetes status, and blood pressure, were analyzed. Mortality within one month was assessed via structured telephone follow-up. Logistic regression and Gradient Boosting Machine (GBM) algorithms were used for predictive modeling.
Results
The average age of patients was 69.44 ± 15.71 years. Among them, 47.7% died. The study identified age, lactate levels, diabetes, and initial confusion as significant predictors of mortality risk. Each additional year of age was associated with a 7% higher probability of mortality. Diabetic patients faced more than double the mortality risk of non-diabetics. Confusion at presentation increased the mortality risk fourfold, while elevated lactate levels raised it by 1.5 times.
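Effect sizes from a logistic regression are usually odds ratios, which compound multiplicatively. If the reported 7% per year is read as an odds ratio of 1.07 (an assumption, since the abstract words it as a probability), the sketch below shows what that implies over a ten-year age difference and how an odds ratio maps onto a probability for a given baseline. The baseline risk of 0.30 is purely hypothetical.

```python
def scale_odds_ratio(or_per_unit, units):
    """Odds ratio for a difference of `units`, given the per-unit odds ratio."""
    return or_per_unit ** units


def probability_after(baseline_prob, odds_ratio):
    """Apply an odds ratio to a baseline probability and return the new probability."""
    odds = baseline_prob / (1.0 - baseline_prob) * odds_ratio
    return odds / (1.0 + odds)


ten_year_or = scale_odds_ratio(1.07, 10)  # assumed OR of 1.07 per year of age
print(round(ten_year_or, 3))

# Hypothetical baseline risk of 30%: the risk implied for a patient ten years older.
print(round(probability_after(0.30, ten_year_or), 3))
```

Under that reading, ten additional years roughly double the odds of death, but the change in probability depends on the baseline risk.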
Conclusion
Logistic regression and GBM algorithms were effective in predicting one-month mortality among cardiogenic shock patients with heart failure based on selected features. This approach holds promise for improving referral processes and reducing costs in healthcare settings.
Keywords: Heart Failure, Cardiogenic Shock, Death Prediction, Logistic Regression, Feature Selection -
Background
Metabolic dysfunction-associated steatotic liver disease (MASLD) represents a significant global health burden without established curative therapies. Early detection and preventive strategies are crucial for effective MASLD management. This study aimed to develop and validate machine-learning (ML) algorithms for accurate MASLD screening in a geographically diverse, large-scale population.
Methods
Data from the prospective Fasa Cohort Study, initiated in rural Fars province, Iran (March 2014), were used. The data were collected through blood tests, questionnaires, liver ultrasonography, and physical examinations. A two-step approach identified key predictors from over 100 variables: (1) statistical selection using mean decrease Gini in random forest and (2) incorporation of clinical expertise for alignment with known MASLD risk factors. Hold-out validation (with a 70/30 train/validation split) was used, along with 5-fold cross-validation on the validation set. Logistic regression, Naïve Bayes, support vector machine, and light gradient-boosting machine (LightGBM) algorithms were compared, using the same input variables, on area under the receiver operating characteristic curve (AUC), sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and accuracy.
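Apart from AUC, the comparison metrics listed above all derive from the four cells of a confusion matrix. The sketch below shows those definitions; the counts are hypothetical and are not taken from the study.

```python
def screening_metrics(tp, fp, tn, fn):
    """Standard screening metrics from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # share of true cases that are detected
        "specificity": tn / (tn + fp),  # share of non-cases correctly ruled out
        "ppv": tp / (tp + fp),          # probability of disease given a positive result
        "npv": tn / (tn + fn),          # probability of no disease given a negative result
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Hypothetical validation-set counts.
metrics = screening_metrics(tp=80, fp=20, tn=180, fn=20)
for name, value in metrics.items():
    print(name, round(value, 3))
```

Unlike sensitivity and specificity, PPV and NPV depend on disease prevalence in the validation sample, so they do not transfer directly to populations with a different MASLD prevalence.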
Results
A total of 6,180 adults (52.7% female) were included in the study, categorized into 4,816 non-MASLD and 1,364 MASLD cases with a mean age (±standard deviation [SD]) of 48.12 (±9.61) and 49.47 (±9.15) years, respectively. Logistic regression outperformed the other ML algorithms, achieving an accuracy of 0.88 (95% confidence interval [CI]: 0.86-0.89) and an AUC of 0.92 (95% CI: 0.90-0.93). Among more than 100 variables, the key predictors included waist circumference, body mass index (BMI), hip circumference, wrist circumference, alanine aminotransferase levels, cholesterol, glucose, high-density lipoprotein, and blood pressure.
Conclusion
Integration of ML in MASLD management holds significant promise, particularly in resource-limited rural settings. Additionally, the relative importance assigned to each predictor, particularly prominent contributors such as waist circumference and BMI, offers valuable insights into MASLD prevention, diagnosis, and treatment strategies.
Keywords: Logistic Regression, Machine Learning, Non-Alcoholic Fatty Liver Disease, Predictive Models, Rural Area -
Introduction
Metabolic syndrome (MetS) is a complex condition that manifests as a cluster of metabolic disorders and is associated with the prevalence of certain diseases. Early prediction of MetS risk in the middle-aged population can help control and prevent cardiovascular disease. The aim of this study was to use logistic regression to predict metabolic syndrome and to find the risk factors associated with it.
Methods
In this cohort study, factors associated with metabolic syndrome were examined in the Mashhad study, which includes a total of 11,570 participants. Logistic regression was used to evaluate the factors that increase the odds ratio of metabolic syndrome, and predictive modeling was also performed with logistic regression.
Results
The logistic regression analysis showed that factors such as body mass index, history of high blood lipids, history of high blood pressure, and diabetes increase the risk ratio of metabolic syndrome. In addition, indicators such as physical inactivity, high blood urea, red blood cell hemoglobin content, older age, female sex, and high levels of hepatic gamma-glutamyl transferase and blood uric acid increase the risk of metabolic syndrome.
Conclusion
It appears that body mass index, history of diabetes, and heart disease, compared with other indicators including history of blood lipids, blood pressure, inactivity, blood urea, uric acid, and red blood cell hemoglobin content, are related to an increased odds ratio of the syndrome.
Keywords: Metabolic Syndrome, Risk Factors, Logistic Regression, Predictive Modeling
Introduction
Metabolic syndrome (MetS) is a complex condition manifested as a group of metabolic disorders and is associated with the prevalence of certain diseases. Early prediction of MetS risk in the middle-aged population can be effective in controlling and preventing cardiovascular diseases. This study aimed to use logistic regression to predict metabolic syndrome and identify risk factors related to this condition.
Method
This cohort study investigated factors associated with metabolic syndrome in the Mashhad study, which included a total of 11,570 participants. Factors that increase the odds of metabolic syndrome were evaluated using logistic regression, and predictive modeling was performed with the same method.
Results
The logistic regression analysis showed that factors such as body mass index, history of high blood lipids, history of high blood pressure, and diabetes increased the risk of metabolic syndrome. Other indicators, such as inactivity, high blood urea level, red blood cell hemoglobin content, older age, female gender, and high levels of liver gamma-glutamyl transferase and blood uric acid, also increased the risk of developing metabolic syndrome.
Conclusion
Body mass index, history of diabetes, and heart disease appear to be related to increased odds of developing MetS when compared with the other indicators, such as history of blood lipids, blood pressure, inactivity, blood urea, uric acid, and hemoglobin content of red blood cells. These findings were obtained using the logistic regression model.
Keywords: Logistic Regression, Metabolic Syndrome, Predictive Modeling, Risk Factors -
Background
Thalassemia is an inherited blood disorder characterized by insufficient production of the protein hemoglobin. We aimed to investigate simple blood indices of patients with Beta Thalassemia Trait (BTT) and Iron Deficiency Anemia (IDA) and to propose a new formula, based on logistic regression, to differentiate the two conditions from each other.
Methods
Among the 702 records of the BTT Counseling Center (Khoy, Iran, 2022), 292 cases (219 with IDA and 73 with BTT) were eligible for the study. Blood indices such as RBC, HGB, and HbA2 were described and used to distinguish the two groups of participants. The blood indices showed high multicollinearity, which was corrected. A logistic regression model was fitted to the blood indices, and goodness-of-fit indices and the area under the ROC curve (AUC) were estimated.
Results
The average age of the participants was 24.56 years. Multicollinearity between the independent variables was corrected. The HGB, MCV, HbA2, and HbA variables were included in the model, and only HbA2 was significant (P<0.001). According to the model output, for each unit increase in HbA2, the odds of having BTT rather than IDA were about 8.5 times higher. The sensitivity, specificity, AUC, and accuracy of the final model were estimated to be 97%, 72%, 84%, and 93%, respectively. A regression formula to differentiate BTT from IDA was proposed.
Conclusion
In studies on differentiating BTT from IDA, including the HbA2 index in the model and in prediction is essential.
Keywords: Beta thalassemia, Iron deficiency anemia, Logistic regression, Blood index, Differentiation formula -
Introduction
This study aims to identify the risk factors affecting mortality among COVID-19 patients in the southeast of Iran.
Methods
This cross-sectional study used data from COVID-19 patients admitted to Afzalipur Teaching Hospital in Kerman, Iran, from February 2020 to September 2021. The demographic and clinical data of 6,057 patients were analyzed using Bayesian network and logistic regression models.
Findings
Out of 6,057 patients, 333 (5.5%) died. The most important risk factors for COVID-19 mortality were age, gender, fever, headache, decreased level of consciousness (LoC), chronic liver disease, blood oxygen level (BOL), admission season, and length of stay (LoS). Fever, headache, and longer LoS were protective, mortality-reducing variables.
Conclusion
Based on the model estimates, it is recommended that older male patients with low oxygen levels and a lower LoC, as well as patients with chronic liver disease, receive additional medical care and not be discharged prematurely. Early medical interventions for high-risk patients may reduce the COVID-19 mortality risk.
Keywords: Bayesian Network, Logistic Regression, Mortality, COVID-19
-
Introduction
Although several studies have been published about COVID-19, ischemic stroke remains a complicated problem for COVID-19 patients. Scientific reports have indicated that, in many cases, stroke in patients with COVID-19 leads to death.
Objectives
The mathematical equation obtained in this study can support physicians' treatment decisions and help identify influential clinical factors for early diagnosis.
Methods
In this retrospective study, data from 128 patients seen between March and September 2020, including demographic information, clinical characteristics, and laboratory parameters, were collected and analyzed statistically. A logistic regression model was developed to identify the significant variables in predicting stroke incidence in patients with COVID-19.
Results
Clinical characteristics and laboratory parameters of 128 patients (76 males and 52 females; mean age 57.109±15.97 years) were used as inputs. These included ventilator dependence, comorbidities, and laboratory tests: WBC, neutrophil, lymphocyte, and platelet counts, C-reactive protein, blood urea nitrogen, alanine transaminase (ALT), aspartate transaminase (AST), and lactate dehydrogenase (LDH). Receiver operating characteristic–area under the curve (ROC-AUC), accuracy, sensitivity, and specificity were used as indices of model capability. The classification accuracy of the model was 93.8%. The area under the curve was 97.5% (reported with a 95% CI).
Conclusion
The findings showed that ventilator dependence, cardiac ejection fraction, and LDH are associated with the occurrence of stroke and that the proposed model can predict stroke effectively.
Keywords: Logistic regression, Stroke, COVID-19, Prediction, SARS-CoV-2 -
Background and Purpose
Identifying effective symptoms, demographic information, and underlying diseases to predict COVID-19 mortality is essential. We aimed to study the effective clinical and symptomatic characteristics of COVID-19 mortality in hospitalized patients with positive polymerase chain reaction (PCR) test results.
Materials and Methods
For this study, we prospectively collected complete data on 26,867 hospitalized individuals who tested PCR positive for COVID-19 from February 20, 2020, to September 12, 2021, in Khorasan Razavi Province, Iran. We analyzed the data using artificial neural network (ANN) and logistic regression (LR) models.
Results
The accuracy of the ANN model was slightly higher than that of the LR model (90.27% versus 90.15%). According to the ANN model, the ten most important predictors of death were decreased level of consciousness, cough, PO2 level, age, chronic kidney disease, fever, headache, smoking status, chronic blood diseases, and diarrhea.
Conclusion
Individuals suffering from chronic diseases such as cancer, kidney and blood diseases, and immunodeficiency are at a higher risk of mortality. This finding can help decision-makers and medical professionals take these conditions into account and provide effective preventive measures to reduce the risk of death.
Keywords: Machine learning, SARS-CoV-2, COVID-19 diagnostic testing, Logistic regression, Neural network -
Introduction
This study presents an accurate and simplified approach to epilepsy detection from EEG signals.
Material and Methods
The method comprises three stages. First, the length of the signal is reduced to a suitable size by windowing. Second, features are extracted from the shortened signal: the Fractal Dimension, the Hurst Exponent, and the Ratio of Determinism to Recurrence Rate. These features were chosen because the signal is inherently nonlinear and they capture the complex patterns and structures present within it. Finally, the extracted features are classified with Logistic Regression, a widely used classification algorithm.
Results
The proposed method achieves an accuracy of 99.66% and has linear time complexity, which ensures efficient processing. It also substantially reduces the length of the EEG signals, which is important in practical applications.
Conclusion
The primary objective of the proposed method is to introduce an online approach that can be integrated into healthcare systems. This objective follows from the analysis and evaluation of the obtained results, which were used to assess the method's practicality and effectiveness.
Keywords: Nonlinear Features, Logistic Regression, Epilepsy Detection, Linear Complexity -
Background
The development of data mining techniques and the adaptive neuro-fuzzy inference system (ANFIS) in the last few decades has made it possible to achieve accurate predictions in medical fields.
Objectives
The present study aimed to use the ANFIS model, an artificial neural network (ANN), and logistic regression to predict thyroid disease.
Methods
Thyroid disease was predicted using the UCI database with ANFIS, ANN, and logistic regression models. Only four of the database's features were used as model inputs, and thyroid disease was treated as a binary response (occurrence=1, non-occurrence=0) as the model output. Finally, the three models were compared on accuracy and the area under the curve (AUC).
Results
Out of the extensive UCI database, which includes 3,772 samples and over 20 features, only five specific features were used. The data include 1,144 males and 2,485 females. Multiple logistic regression analysis showed that free T4 index (FTI) and thyroid stimulating hormone (TSH) had a significant effect on thyroid disease. The ANFIS model had higher accuracy (99%) than the ANN (96%) and the logistic regression model (94%) in predicting thyroid disease.
Conclusion
As the results show, ANFIS predicts more accurately than the other models. Moreover, the use of combined methods such as ANFIS to diagnose and predict diseases increases model accuracy. The results of this study can therefore be used in screening programs to identify people at risk of thyroid disease.
Keywords: Adaptive neuro-fuzzy inference systems, Artificial neural network, Logistic regression, Thyroid -
Background
The objective of the present study was to identify prognostic factors associated with mortality and transfer to intensive care units (ICUs) in hospitalized COVID-19 patients using random forest (RF). Also, its performance was compared with logistic regression (LR).
Methods
In this retrospective cohort study, data from 329 COVID-19 patients were analyzed. These patients were hospitalized in Besat hospital in Hamadan province, in the west of Iran. RF and LR models were used to predict mortality and transfer to ICUs. Model performance was assessed using the area under the receiver operating characteristic curve (AUC) and accuracy.
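The AUC used for model comparison can be read as the probability that a randomly chosen patient with the event receives a higher predicted score than a randomly chosen patient without it. The sketch below computes it from that definition, counting ties as one half; the labels and scores are made-up values, and the quadratic pairwise loop is meant for clarity rather than speed.

```python
def auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Hypothetical outcomes (1 = died) and predicted probabilities from some model.
example_auc = auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(example_auc)  # 0.75
```

Because it depends only on the ranking of scores, the AUC allows RF and LR to be compared even when their predicted probabilities are calibrated differently.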
Results
Of the 329 COVID-19 patients, 57 (15.5%) died and 106 (32.2%) were transferred to ICUs. In the multiple LR model, age (OR=1.02; 95% CI=1.00-1.05), cough (OR=0.24; 95% CI=0.10-0.56), and ICU admission (OR=7.20; 95% CI=3.30-15.69) were significantly associated with death. Kidney disease (OR=3.90; 95% CI=1.04-14.63), decreased sense of smell (OR=0.28; 95% CI=0.10-0.73), Kaletra (OR=2.53; 95% CI=1.39-4.59), and intubation (OR=8.32; 95% CI=3.80-18.24) were significantly associated with transfer to ICUs. In the RF model, the most important variables were age, ICU admission, and cough for predicting mortality, and age, intubation, and Kaletra for predicting transfer to ICUs.
Conclusion
This study showed that RF performed better than LR in predicting mortality and ICU transfer in hospitalized COVID-19 patients.
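Confidence intervals for odds ratios are symmetric on the log scale, so the standard error of a coefficient and its Wald z statistic can be recovered from a reported interval. The sketch below does this for the ICU odds ratio above, assuming the interval is a standard Wald interval with z = 1.96 (the abstract does not say how it was computed); the function names are hypothetical.

```python
import math


def se_from_ci(lower, upper, z=1.96):
    """Standard error of log(OR), recovered from a Wald confidence interval."""
    return (math.log(upper) - math.log(lower)) / (2.0 * z)


def wald_z(odds_ratio, lower, upper):
    """Wald z statistic for the log odds ratio."""
    return math.log(odds_ratio) / se_from_ci(lower, upper)


# ICU admission and death: OR = 7.20, 95% CI 3.30 to 15.69 (from the Results).
icu_se = se_from_ci(3.30, 15.69)
icu_z = wald_z(7.20, 3.30, 15.69)
print(round(icu_se, 3), round(icu_z, 2))

# Consistency check: the geometric mean of the bounds should be close to the OR.
print(round(math.sqrt(3.30 * 15.69), 2))
```

A geometric mean of the bounds that sits close to the point estimate is a quick check that a reported odds ratio and its interval are mutually consistent.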
Keywords: COVID-19, Mortality, Intensive care units, Random forest, Logistic regression -
Background
We aimed to establish and validate diagnostic models for distinguishing bacterial from viral infections among neonates with sepsis, as well as a model for prognostic evaluation.
Methods
Training data sets (cohorts) of neonatal sepsis patients were derived retrospectively from 2017 to 2019, and the verifying sets were followed up from 2019 to 2021. Backward elimination in logistic regression was used to identify the optimum feature combination, starting with all potential factors in the regression equation.
Results
The current study established 3 models. For distinguishing bacterial sepsis patients and bacterial culture-negative patients, we found Y=1.930+0.105X1+0.891X2-1.389X3-0.774X4 (Y symbolizes the status of bacterial infectious sepsis, X1 is age increase, X2 is intra-amniotic infection (mother), X3 is vomiting sign, and X4 is cough sign). Similarly, for distinguishing bacterial infectious sepsis patients and bacterial/viral double-positive patients, we found Y=2.918+1.568X1+1.882X2-0.113X3-2.214X4-2.255X5-2.312X6 (Y means the bacterial/viral double-positive status, X1 is IL-6 increase, X2 means CRP increase, X3 means age increase, X4 means high fever sign, X5 is cyanotic sign, and X6 is HGB increase). For predicting hospital days as one of the prognoses, we found Y=-1.993+0.073X1+1.963X2+0.466X3-0.791X4-0.633X5 (Y means worse prognosis, which is hospital days longer than 7 days, X1 means age increase, X2 means intra-amniotic infection (mother), X3 is IL-6 increase, X4 is convulsion with unconsciousness, and X5 is cough sign). The ROC curves of the models in the verifying cohort indicated that all 3 models performed well among children with sepsis.
Conclusions
Two diagnostic models and one prognostic model were established for clinical reference in this first-step analysis. Given their excellent performance, they could serve as new diagnostic tools and as a marker for guiding therapeutic strategy in neonatal sepsis.
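If Y in the first equation is read as the linear predictor (log-odds) of a logistic regression, which is an assumption because the abstract only says Y "symbolizes" the status, a predicted probability follows from the logistic function. The sketch below applies it to the first model; the coding of each X (for example, the units of "age increase" and 0/1 coding for signs) is not given in the abstract, so the inputs are purely hypothetical.

```python
import math

# Coefficients of the first model as reported: intercept, then X1..X4.
INTERCEPT = 1.930
COEFFICIENTS = (0.105, 0.891, -1.389, -0.774)  # age, intra-amniotic infection, vomiting, cough


def predicted_probability(x):
    """Logistic transform of the linear predictor, assuming Y is the log-odds."""
    linear = INTERCEPT + sum(b * v for b, v in zip(COEFFICIENTS, x))
    return 1.0 / (1.0 + math.exp(-linear))


# Hypothetical inputs: all predictors at zero, then the same case with vomiting present.
baseline = predicted_probability((0, 0, 0, 0))
with_vomiting = predicted_probability((0, 0, 1, 0))
print(round(baseline, 3), round(with_vomiting, 3))
```

Negative coefficients, such as those for vomiting and cough, lower the predicted probability of bacterial sepsis relative to the baseline.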
Keywords: Neonatal Sepsis, ROC, Prognosis, Logistic Regression, Diagnosis Model -
Introduction
We aimed to use polymerase chain reaction (PCR) on genomic deoxyribonucleic acid (DNA) to detect the Db allele and the rs2923234 and rs1049112 single nucleotide polymorphisms (SNPs) of the salivary acidic proline-rich proteins (PRPs) to determine their relationship with dental caries in young children.
Methods
DNA was extracted from saliva samples of preschool children aged 3 to 5 years. PCR primers designed around exon 3 of the PRH1 locus yielded a 416-base product representing Db for gel electrophoresis and a 519-base product representing the rs2923234 and rs1049112 SNPs for Sanger sequencing. The data were analyzed using a logistic regression model and a multilayer perceptron artificial neural network.
Results
Forty children with severe caries and 40 caries-free children completed the study. The frequency of the Db gene was 16.3% in the entire study group. The rs2923234 SNP was a marginally significant (P=0.053) predictor of the dependent variable (caries-free or severe caries). However, rs1049112 (P=0.407) and the Db allele (P=0.442) were not significant predictors.
Conclusion
The rs2923234 SNP could be considered a potential genetic predictor of caries susceptibility.
Keywords: Acidic proline-rich protein, Polymerase chain reaction, Caries, Children, Salivary biomarker, Genetics, Db allele, PRH1 gene, Single nucleotide polymorphisms, Multilayer perceptron, Artificial neural network, Logistic regression -
Background and Objective:
Angiography is a common method for diagnosing coronary artery involvement. Besides the invasiveness of this diagnostic method, some patients decline the test for reasons such as fear, its high cost, and lack of confidence in the judgement that angiography is necessary. The aim of this study was to identify and model the factors associated with coronary artery occlusion in order to anticipate angiography results.
Methods:
In this analytical cross-sectional study, 1187 patients who were candidates for angiography according to their treating physician and who presented to Ghaem Hospital in Mashhad for angiography during 1390-1391 (Solar Hijri calendar) were enrolled. Demographic information, lipid levels, blood glucose, and history of underlying diseases were examined for inclusion in the statistical model. Using R 3.6.1, logistic regression and zero-inflated negative binomial models were fitted to the data and compared in terms of predictive accuracy.
Results:
Angiography showed that 404 patients (34%) had no occluded vessels. In both models, the odds of vessel occlusion were significantly higher in men and in diabetic individuals. The probability of a positive angiography result also increased with age (P<0.05). The area under the ROC curve, sensitivity, and specificity were 78.4, 70.4, and 70.5 for logistic regression and 78.2, 71.4, and 71.5 for the zero-inflated negative binomial model, respectively.
Conclusion:
Age, sex, and diabetes emerged in both models as factors influencing angiography results. Given the study results and the predictive power of the models, no significant difference was observed between the two models. Because the logistic regression model is simpler, it can be used as a predictive model for determining the need for angiography.
Keywords: Angiography, Coronary Arteries, Prediction, Logistic Regression, ROC Curve
Prediction of Angiography Results Using Logistic Regression and Zero-inflated Negative Binomial Models
Background and Objectives
Angiography is a common and invasive method for diagnosing cardiovascular diseases. Some patients refuse angiography for reasons such as fear, high cost, and lack of confidence in the physician's decision to order it. This study aims to determine the factors predicting coronary artery occlusion in order to predict the outcome of angiography.
Subjects and Methods
In this cross-sectional study, participants were 1187 patients who underwent angiography in Ghaem Hospital in Mashhad, Iran. Demographic data, lipid profile, blood sugar level, and history of underlying disorders were used in two prediction models, logistic regression and zero-inflated negative binomial (NB), fitted using R 3.6.1. Their sensitivity and specificity were then compared.
Results
Of 1187 patients, 404 (34%) had negative angiography. Both models showed that the risk of positive angiography was significantly higher in male and diabetic patients. The risk increased with age. The area under the ROC curve (sensitivity, specificity) was 78.4 (70.4%, 70.5%) for the logistic regression model and 78.2 (71.4%, 71.5%) for the zero-inflated NB model.
Conclusion
Age, gender, smoking, and history of diabetes are significant predictors of the angiography outcome. There is no significant difference between the logistic regression and zero-inflated NB models in predicting the outcome of angiography. Because the logistic regression model is easier to use, it can be used to predict the results of angiography.
Keywords: Angiography, Coronary Artery Disease, prediction, logistic regression, ROC curve -
Background
The accurate diagnosis of cardiac disease is vital in managing patients’ health. Data mining and machine learning techniques play an important role in the diagnosis of heart disease. We aimed to examine the diagnostic performance of an adaptive neuro-fuzzy inference system (ANFIS) for predicting coronary artery disease and to compare it with two statistical methods, flexible discriminant analysis (FDA) and logistic regression (LR).
Methods
The data come from a descriptive-analytical study based on the Mashhad study. We used ANFIS, LR, and FDA to predict coronary artery disease. A total of 7385 subjects were recruited as part of the Mashhad Stroke and Heart Atherosclerotic Disorders (MASHAD) cohort study. The data set contained demographic, serum biochemical, anthropometric, and many other variables. The hold-out method was used to evaluate the ability of the trained ANFIS, LR, and FDA models to diagnose coronary artery disease. Data were analyzed with SPSS v25, R 4.0.4, and MATLAB 2018.
Results
The accuracy, sensitivity, specificity, mean squared error (MSE), and area under the ROC curve (AUC) for ANFIS were 83.4%, 80%, 86%, 0.166, and 83.4%, respectively. The corresponding values were 72.4%, 74%, 70%, 0.175, and 81.5% for the LR method and 77.7%, 74%, 81%, 0.223, and 77.6% for the FDA method.
Conclusion
There was a significant difference in accuracy among the three methods. The findings showed that ANFIS was more accurate than the LR and FDA methods for diagnosing coronary artery disease. Thus, it could be a helpful tool to aid medical decision-making in the diagnosis of coronary artery disease.
Keywords: Adaptive Neuro-Fuzzy Inference System, Logistic Regression, Flexible Discriminant Analysis, Coronary Artery Disease -
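The Methods above mention a hold-out evaluation reported through accuracy, sensitivity, specificity, and MSE. The following is a minimal sketch of that kind of evaluation, assuming a simple random split and hard 0/1 predictions. The 30% test fraction, the seed, and the toy labels are hypothetical; the abstract does not state the split ratio or how MSE was computed.

```python
import random


def hold_out_split(rows, test_fraction=0.3, seed=0):
    """Shuffle a copy of rows and split it into (train, test)."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def evaluate(y_true, y_pred):
    """Accuracy, sensitivity, specificity, and MSE for 0/1 labels and 0/1 predictions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    n = len(pairs)
    return {
        "accuracy": (tp + tn) / n,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "mse": sum((y - p) ** 2 for y, p in pairs) / n,
    }


# Hypothetical record identifiers; a model would be fitted on train_rows only.
rows = list(range(10))
train_rows, test_rows = hold_out_split(rows, test_fraction=0.3, seed=0)

# Hypothetical true labels and predictions for a held-out set.
y_true = [1, 1, 1, 0, 0, 0, 0, 1, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1, 0, 1]
metrics = evaluate(y_true, y_pred)
print(len(train_rows), len(test_rows), metrics)
```

With hard 0/1 predictions the MSE equals one minus the accuracy. The ANFIS (83.4%, 0.166) and FDA (77.7%, 0.223) figures above fit that relation, while the LR figures do not, which would be expected if the LR error was computed on predicted probabilities. This is only an observation, not something the abstract states.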
Introduction
Breast cancer is one of the most common cancers among women. Machine learning (ML) techniques can contribute substantially to the prediction and early diagnosis of breast cancer; they have become a research hotspot and have proved to be a strong technique. Using ML models applied to a multidimensional dataset, this article aims to find the most efficient and accurate ML models for tumor classification.
Material and Methods
Several supervised ML algorithms, such as Logistic Regression, Decision Tree, Random Forest, and KNN, were used to diagnose and predict cancer tumors. The algorithms were applied to a dataset of 699 samples taken from the UCI repository. The dataset includes breast cancer features. To enhance the algorithms’ performance, these features were analyzed, and feature importance scores and cross-validation were considered. In this research, the ML algorithms were coupled with a limited set of selected features to produce high accuracy in tumor classification.
Results
In the evaluation, the Logistic Regression algorithm (accuracy 99.14%, AUC ROC 99.6%) and the Extra Tree algorithm (accuracy 99.14%, AUC ROC 99.1%) performed better than the other algorithms. Therefore, these techniques can be useful for the diagnosis and prediction of cancer tumors and for prescribing appropriate treatment.
Conclusion
ML techniques can be used in medicine to analyze disease-related datasets and to predict disease. The area under the ROC curve and other evaluation criteria were compared across several ML classification algorithms for the diagnosis and prediction of breast cancer in order to determine the most appropriate classifier.
Keywords: Machine Learning, Dataset, Importance Score, Accuracy, Breast Cancer, Logistic Regression -
Iranian Journal of Diabetes and Metabolism, Volume 23, Issue 1 (Serial No. 110, Farvardin and Ordibehesht 1402), pp. 53-67
Background
Diabetes causes a large number of deaths each year, and many people living with the disease do not learn their health status early enough. This study proposes a data mining-based model for the early diagnosis and prediction of diabetes.
Methods
Although k-means is simple and can be used for a wide variety of data types, it is highly sensitive to the initial positions of the cluster centers, which determine the final clustering result. The clustering therefore either provides a suitable and efficient clustered dataset for the logistic regression model or, when the original dataset is clustered incorrectly, provides less data, thereby limiting the performance of the logistic regression model. The main purpose of this study was to determine ways of improving k-means clustering and the accuracy of logistic regression. The proposed algorithm therefore comprises principal component analysis, k-means, and a logistic regression model.
Results
The results show that the k-means clustering accuracy obtained is much higher than what other researchers have obtained in similar studies. In addition, compared with the results obtained from other algorithms, the logistic regression model performed at an improved level in predicting the onset of diabetes. Another real advantage is that the proposed algorithm was able to successfully model a new dataset.
Conclusion
In general, the proposed approach can be used effectively for the prediction and early diagnosis of diabetes.
Keywords: Diabetes, Prediction, Principal component analysis, K-means, Logistic regression -
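The Methods above note that k-means is sensitive to the initial locations of the cluster centers. The sketch below illustrates only that sensitivity, using Lloyd's algorithm on hypothetical one-dimensional data with two different initializations. It is not the authors' algorithm: their principal component analysis and logistic regression steps are not shown, and the function name `kmeans_1d` and the data are assumptions made for illustration.

```python
def kmeans_1d(points, centers, max_iter=100):
    """Lloyd's algorithm on 1-D data; returns (final centers, within-cluster sum of squares)."""
    centers = list(centers)
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest center.
        clusters = [[] for _ in centers]
        for x in points:
            nearest = min(range(len(centers)), key=lambda i: abs(x - centers[i]))
            clusters[nearest].append(x)
        # Update step: each center moves to the mean of its cluster
        # (an empty cluster keeps its previous center).
        new_centers = [sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)]
        if new_centers == centers:
            break
        centers = new_centers
    sse = sum(min((x - c) ** 2 for c in centers) for x in points)
    return centers, sse


# Hypothetical data with three well-separated groups.
points = [0, 1, 2, 10, 11, 12, 20, 21, 22]

# Spread-out start: one initial center per group.
good_centers, good_sse = kmeans_1d(points, [0, 11, 22])
# Poor start: all initial centers inside the first group.
bad_centers, bad_sse = kmeans_1d(points, [0, 1, 2])

print(good_centers, good_sse)
print(bad_centers, bad_sse)
```

The poor start converges to a much worse partition even though the data are identical. A common mitigation is to run k-means from several initializations and keep the run with the lowest within-cluster sum of squares.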
Introduction
Estimating the probability of obstructive coronary artery disease in patients undergoing noncoronary cardiac surgery should be considered compulsory. Our study sought to evaluate the prevalence of obstructive coronary artery disease in patients undergoing valvular heart surgery and to apply predictive models for concomitant obstructive coronary artery disease in these patients.
Methods
The retrospective study cohort was derived from a tertiary care hospital registry of patients undergoing coronary angiography prior to valvular heart operations. Decision tree, logistic regression, and support vector machine models were built to predict the probability of obstructive coronary artery disease. A total of 367 patients from 2016 to 2019 were analyzed.
Results
The mean age of the study population was 57.3±9.3 years, and 45.2% of the patients were male. Of 367 patients, 76 (21%) had obstructive coronary artery disease. The decision tree, logistic regression, and support vector machine models had an area under the curve of 72% (95% CI: 62% - 81%), 67% (95% CI: 56% - 77%), and 78% (95% CI: 68% - 87%), respectively. Multivariate analysis indicated that hypertension (OR 1.98; P=0.032), diabetes (OR 2.32; P=0.040), age (OR 1.05; P=0.006), and typical angina (OR 5.46; P<0.001) were significant predictors of the presence of obstructive coronary artery disease.
Conclusion
Our study revealed that approximately one-fifth of patients who underwent valvular heart surgery had concomitant obstructive coronary artery disease. The support vector machine model showed the highest accuracy among the three models.
Keywords: Obstructive Coronary Artery Disease, Valvular Heart Surgery, Support Vector Machine, Logistic Regression, Decision Tree -
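The odds ratios reported above can be hard to read at a glance, so here is a small worked sketch of two standard conversions: scaling a per-unit odds ratio over several units, and applying an odds ratio to a baseline probability. It assumes that the age odds ratio is per year, and it uses the overall 21% prevalence as a stand-in baseline. Both are illustrative assumptions; the abstract states neither the unit nor the reference-group risk.

```python
def scale_odds_ratio(or_per_unit, units):
    """Odds ratio for a change of `units` when the model is linear on the log-odds scale."""
    return or_per_unit ** units


def apply_odds_ratio(baseline_probability, odds_ratio):
    """Probability obtained by multiplying the baseline odds by an odds ratio."""
    odds = baseline_probability / (1 - baseline_probability)
    new_odds = odds * odds_ratio
    return new_odds / (1 + new_odds)


# Assumption: OR 1.05 is per year of age, so ten years multiply the odds by 1.05 ** 10.
or_age_10_years = scale_odds_ratio(1.05, 10)

# Assumption: 0.21 (overall prevalence) stands in for the risk without typical angina.
p_with_angina = apply_odds_ratio(0.21, 5.46)

print(f"OR for 10 years of age: {or_age_10_years:.2f}")
print(f"Illustrative probability with typical angina: {p_with_angina:.3f}")
```

Under these assumptions a ten-year age difference corresponds to roughly 1.6 times the odds, and an odds ratio of 5.46 moves a 21% baseline to roughly 59%. Note that an odds ratio is not a risk ratio: the probability rises by less than a factor of 5.46.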
Introduction
We aimed to evaluate the possible role of age, occlusion type, type of dentition (full dentition or free-end extensions), and type of temporomandibular disorder (TMD) in predicting the presence of pain.
Methods
Subjects were selected from volunteer male TMD patients with one partially edentulous jaw in Baghdad city in 2022. Pain was assessed via the Visual Analogue Scale (VAS). Angle’s and Kennedy’s classifications were employed to assess occlusion and partially edentulous jaw conditions. TMD was assessed using the Diagnostic Criteria for Temporomandibular Disorders (DC/TMD) for Clinical and Research Applications. The relationship between pain as an ordinal dependent variable and the predictor variables was assessed via ordinal logistic regression using SPSS 26.
Results
A total of 240 subjects were assessed for eligibility, and 180 TMD patients (mean age 41.1 ± 0.46) were included in the study. The omnibus test showed that the model outperformed the null model (p < 0.001). Disc displacement with reduction (odds ratio: 0.09) and Kennedy’s Class I (odds ratio: 0.42) were statistically significant inverse predictors of pain (p < 0.05). Age (p = 0.66) and Angle’s occlusion type (p = 0.91) were not significant predictors of pain.
Conclusion
The probability of pain was lower in patients with disc displacement with reduction and in those with Kennedy’s Class I.
Keywords: logistic regression, occlusion, orofacial pain, pain, temporomandibular disorders, temporomandibular joint