Coronary artery disease (CAD) is caused by atherosclerosis in the coronary arteries and can result in cardiac arrest and heart attack. Although angiography is one of the most accurate methods for diagnosing heart disease, it is expensive and carries side effects. Data mining extracts hidden predictive information, including previously unknown patterns, relationships, and knowledge, from large data sets, where such information is difficult to detect with traditional statistical methods. One of the biggest obstacles to fast and effective pattern recognition is noisy and inconsistent data in databases. The present study proposes a data preparation method based on clustering algorithms for diagnosing coronary artery disease with higher efficiency and fewer errors.
Materials: The data were collected from 303 persons referred to the heart unit of a Tehran-based hospital between 2011 and 2013, and included 54 features. The K-means algorithm was used in a clustering-based preprocessing step to eliminate noisy and inconsistent data, and Naïve Bayes, K-nearest neighbor, and decision tree classifiers were used for classification. Two feature subset selection methods were also applied to clean the data, so that the clustering-based method could be compared with attribute selection. RapidMiner software was used to conduct the study.
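The abstract does not say how the clusters were used to flag noisy records. A minimal sketch of one common approach, assumed here rather than taken from the study, is to cluster the records with K-means and drop those whose class label disagrees with the majority label of their cluster. The function names `kmeans` and `filter_noisy`, the toy two-feature records, and the deterministic initialization from the first k points are all illustrative choices; the study itself used RapidMiner on 54 features.

```python
import math
from collections import Counter


def kmeans(points, k, max_iters=50):
    """Plain Lloyd's K-means. Returns a cluster index for each point.

    Assumption for reproducibility: the first k points seed the centroids.
    """
    centroids = [list(p) for p in points[:k]]
    assign = None
    for _ in range(max_iters):
        # Assignment step: nearest centroid by Euclidean distance.
        new_assign = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_assign == assign:
            break  # assignments are stable, so the algorithm has converged
        assign = new_assign
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign


def filter_noisy(points, labels, k=2):
    """Return indices of records whose label matches their cluster's majority label."""
    assign = kmeans(points, k)
    majority = {
        c: Counter(l for l, a in zip(labels, assign) if a == c).most_common(1)[0][0]
        for c in set(assign)
    }
    return [i for i, (l, a) in enumerate(zip(labels, assign)) if l == majority[a]]


# Hypothetical toy records (two features each); the last label is deliberately wrong.
points = [(1.0, 1.0), (9.0, 9.0), (1.2, 0.8), (0.9, 1.1),
          (8.8, 9.2), (9.1, 8.9), (1.1, 1.0)]
labels = ["normal", "cad", "normal", "normal", "cad", "cad", "cad"]

kept = filter_noisy(points, labels, k=2)
print(kept)  # indices of the records retained for classifier training
```

The retained records would then be passed to a classifier such as Naïve Bayes, K-nearest neighbor, or a decision tree. In practice the features should be scaled before clustering, since K-means relies on Euclidean distance.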
The findings indicate that the proposed model achieved the highest efficiency, 90.91. According to the results, the proposed method performs better than the other methods tested and appears effective for pattern recognition applications.
Based on these results, the proposed method can be used in the diagnosis of coronary artery disease.