Using Latent Variables to Eliminate Multicollinearity Effect in A Logistic Regression on Risk Factors for Breast Cancer

Abstract:
Background and Objectives
Logistic regression is one of the most widely used generalized linear models for analyzing the relationship between one or more explanatory variables and a categorical response. Strong correlations among explanatory variables (multicollinearity) considerably reduce the efficiency of the model. In this study we used latent variables to reduce the effects of multicollinearity in the analysis of a case-control study.
Methods
Our data came from a case-control study in which 300 women with breast cancer were compared with 300 controls. Five highly correlated quantitative variables were selected to assess the effect of multicollinearity. First, an ordinary logistic regression model was fitted to the data. Then, to remove the effect of multicollinearity, two latent variables were generated using factor analysis and principal components analysis. The parameters of the logistic regression were then estimated using these latent variables as explanatory variables. We used the estimated standard errors of the parameters to compare the efficiency of the models.
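The comparison described above can be illustrated with a minimal sketch. It is not the authors' analysis: the data are simulated (two hypothetical latent factors driving five correlated variables), only the principal components variant is shown, and the helper name `fit_logistic` and all effect sizes are assumptions made for illustration. Both models use unit-variance predictors so that the standard errors are on a comparable scale.

```python
import numpy as np

def fit_logistic(X, y, n_iter=50):
    """Newton-Raphson logistic fit; returns coefficients and standard errors."""
    Xd = np.column_stack([np.ones(len(X)), X])  # add an intercept column
    beta = np.zeros(Xd.shape[1])
    for _ in range(n_iter):
        eta = np.clip(Xd @ beta, -30, 30)       # guard against overflow in exp
        p = 1.0 / (1.0 + np.exp(-eta))
        w = p * (1.0 - p)
        info = Xd.T @ (Xd * w[:, None])         # observed information matrix
        step = np.linalg.solve(info, Xd.T @ (y - p))
        beta += step
        if np.max(np.abs(step)) < 1e-10:
            break
    # Recompute the information matrix at the final estimate
    p = 1.0 / (1.0 + np.exp(-np.clip(Xd @ beta, -30, 30)))
    w = p * (1.0 - p)
    cov = np.linalg.inv(Xd.T @ (Xd * w[:, None]))
    return beta, np.sqrt(np.diag(cov))

# Simulated data (assumption): 600 subjects, five variables driven by two factors
rng = np.random.default_rng(0)
n = 600
z1, z2 = rng.normal(size=n), rng.normal(size=n)
X = np.column_stack([z1, z1, z1, z2, z2]) + 0.2 * rng.normal(size=(n, 5))
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-(0.8 * z1 - 0.8 * z2)))).astype(float)

# Standardize the original variables
Z = (X - X.mean(axis=0)) / X.std(axis=0)

# Model 1: ordinary logistic regression on the five correlated variables
beta_raw, se_raw = fit_logistic(Z, y)

# Model 2: logistic regression on the first two principal components
vals, vecs = np.linalg.eigh(np.corrcoef(Z, rowvar=False))
order = np.argsort(vals)[::-1][:2]                  # two largest eigenvalues
scores = Z @ vecs[:, order] / np.sqrt(vals[order])  # unit-variance component scores
explained = vals[order].sum() / vals.sum()          # share of total variance
beta_pca, se_pca = fit_logistic(scores, y)

print("SE, original variables:", np.round(se_raw[1:], 3))
print("SE, two components:    ", np.round(se_pca[1:], 3))
print("Variance explained by two components:", round(float(explained), 3))
```

Because the component scores are uncorrelated, their coefficients are not subject to the variance inflation that affects the correlated original variables. The trade-off is that each coefficient now describes a weighted combination of the original risk factors rather than a single one.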
Results
The logistic regression on the five original variables produced implausible odds ratio estimates for age at first pregnancy (OR = 67960, 95% CI: 10184-453503) and for total duration of breastfeeding (OR = 0). In contrast, the parameters estimated for the logistic regressions on latent variables generated by both factor analysis and principal components analysis were statistically significant (P < 0.003), and their standard errors were smaller than those from the ordinary logistic regression on the original variables. The factors and components generated by the two methods explained at least 85% of the total variance.
Conclusions
This research showed that the standard errors of the estimated parameters in logistic regression based on latent variables were considerably smaller than those of the model based on the original variables. Models that include latent variables may therefore be more efficient when there is multicollinearity among the risk factors for breast cancer.
Language:
Persian
Published:
Iranian Journal of Epidemiology, Volume:1 Issue: 2, 2006
Page:
41
magiran.com/p409522  