data clustering
in journals of the Technical and Engineering group -
Clustering methods have received considerable attention as the volume and variety of data grow. The main problem with classical clustering methods is that they easily become trapped in local optima. Meta-heuristic algorithms, which can escape local optima while searching the problem space for suitable cluster centers, have shown good results in data clustering. One such algorithm is the gray wolf optimization (GWO) algorithm. GWO exploits well and finds good solutions on some problems, but its exploration is weak, so on other problems it converges to a local optimum. This study proposes an improved version of GWO for data clustering, called the 4-gray wolf optimization (4GWO) algorithm. 4GWO improves the exploration capability of GWO by using the best position of a fourth group of wolves, called scout omega wolves, when updating the position of each wolf. How each wolf moves is determined by its score, which is computed relative to the best solution: the better the score, the closer the wolf is to the best solution. The 4GWO algorithm for data clustering (4GWO-C) is compared with GWO, particle swarm optimization (PSO), artificial bee colony (ABC), symbiotic organisms search (SOS) and the salp swarm algorithm (SSA) on fourteen datasets, and also with several improved GWO variants on the same datasets. The results show a notable improvement over the other meta-heuristic algorithms compared. The average F-measure over all datasets is 82.172% for 4GWO-C, against 78.284% for PSO-C, the second-best of these algorithms. Among the improved GWO variants, EGWO ranks next after 4GWO-C, with an average F-measure of 80.656%.
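For reference, the position update of the standard GWO algorithm, which 4GWO builds on, can be sketched as below. This is a minimal sketch of plain GWO only: the abstract does not give the scout omega rule or the scoring formula, so neither is reproduced here. The function names `sse` and `gwo_step`, and the use of the sum of squared errors as the clustering fitness, are assumptions made for illustration.

```python
import math
import random


def sse(centers, points):
    """Assumed clustering fitness: sum of squared distances to the nearest center."""
    return sum(min(math.dist(p, c) ** 2 for c in centers) for p in points)


def gwo_step(wolf, alpha, beta, delta, a, rng=random):
    """One standard GWO position update for a single wolf.

    wolf, alpha, beta, delta: equal-length lists of floats.
    a: control parameter, usually decreased linearly from 2 to 0 over the run.
    """
    new_position = []
    for j in range(len(wolf)):
        pulls = []
        for leader in (alpha, beta, delta):
            A = 2.0 * a * rng.random() - a   # in [-a, a); |A| > 1 favours exploration
            C = 2.0 * rng.random()           # in [0, 2)
            D = abs(C * leader[j] - wolf[j])
            pulls.append(leader[j] - A * D)
        # The new coordinate is the average of the three leader-guided moves.
        new_position.append(sum(pulls) / 3.0)
    return new_position
```

With `a = 0` the coefficient `A` vanishes and the wolf moves to the mean of the three leaders, which is the pure exploitation behaviour the abstract credits GWO with.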
Keywords: Data mining, data clustering, meta-heuristic algorithm, gray wolf optimization (GWO) algorithm, 4-gray wolf optimization (4GWO) algorithm, F-measure -
Among thoracic tumors, lung tumors move mainly because of respiration. One way to improve the precision of radiotherapy is to estimate tumor motion from the external motion of the chest wall and abdomen. For this purpose, consistent prediction models are built and used for real-time tumor tracking. In these models, the clustering of the data extracted from tumor and chest wall motion has a non-negligible effect on model performance, which is the subject of this work. The motion data of fifteen lung cancer patients treated with the Cyberknife Synchrony system at Georgetown University hospital were used. Two common clustering strategies, subtractive clustering and fuzzy C-means, were applied to the motion data to compare their quantitative effects. The final analysis shows that the average targeting error of the prediction model (the distance between the tumor position estimated by the model and the actual tumor position) over all patients is 6.5 and 7.5 mm with subtractive and fuzzy C-means clustering, respectively. Moreover, tumor tracking is more stable with fuzzy C-means. Because breathing is highly variable, motion data clustering plays an important role in the accuracy of the prediction model: it determines the model parameters when the model is built before treatment and when it is updated during treatment.
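The two alternating updates of standard fuzzy C-means, one of the two strategies compared above, can be sketched as follows. This is a generic textbook sketch, not the configuration used in the study: the fuzzifier `m = 2` and the function names are assumptions.

```python
import math


def memberships(point, centers, m=2.0):
    """Membership of one point in every cluster; the values sum to 1."""
    dists = [math.dist(point, c) for c in centers]
    if any(d == 0.0 for d in dists):
        # The point sits exactly on one or more centers: share membership among them.
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_j) ** power for d_j in dists) for d_i in dists]


def update_centers(points, U, m=2.0):
    """Recompute each center as the mean of all points weighted by membership**m."""
    k, dim = len(U[0]), len(points[0])
    centers = []
    for i in range(k):
        w = [U[n][i] ** m for n in range(len(points))]
        total = sum(w)  # assumed non-zero: every cluster has some membership
        centers.append(tuple(
            sum(w[n] * points[n][d] for n in range(len(points))) / total
            for d in range(dim)
        ))
    return centers
```

Iterating `memberships` for every point and then `update_centers` until the centers stop moving gives the usual algorithm. Unlike a hard assignment, each motion sample contributes to every cluster in proportion to its membership.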
Keywords: data clustering, Radiotherapy, lung tumor, Prediction model, Fuzzy logic -
Journal of Advances in Computer Engineering and Technology, Volume:6 Issue: 4, Autumn 2020, PP 227 -238
Clustering is a method of data analysis and one of the important methods in data mining, and it has been studied by researchers in many fields and disciplines. In this paper, we propose combining the Whale Optimization Algorithm (WOA) with the Bat Algorithm (BA) for data clustering. In the proposed method, BA is first examined thoroughly to identify its weaknesses in exploitation and exploration. The proposed method focuses on improving the exploitation of BA. Instead of the random selection step, one solution is selected from the best solutions, and some of the dimensions of the position vector in BA are replaced. Some of the best solutions are changed using the shrinking encircling mechanism and the spiral update of WOA. Finally, after the better of the two exploitation stages (WOA exploitation and BA exploitation) is selected, the resulting changes are applied to the solutions. We evaluate the performance of the proposed method against other meta-heuristic algorithms for data clustering on six datasets. The results of these experiments show that the proposed method is statistically much better than the standard BA and also better than WOA. Overall, the proposed method was more robust and performed better than the Harmony Search Algorithm (HSA), Artificial Bee Colony (ABC), WOA and BA.
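The spiral update mentioned above is one operator of standard WOA, and a minimal sketch of it is given here. How the paper interleaves it with BA is not described in enough detail to reproduce, so only the operator itself is shown; the function name and the default spiral constant `b = 1` are assumptions.

```python
import math
import random


def woa_spiral_update(position, best, b=1.0, l=None, rng=random):
    """Move a solution along a logarithmic spiral around the best solution.

    l is drawn uniformly from [-1, 1] unless supplied by the caller.
    """
    if l is None:
        l = rng.uniform(-1.0, 1.0)
    factor = math.exp(b * l) * math.cos(2.0 * math.pi * l)
    # Per dimension: distance to the best solution, scaled by the spiral factor.
    return [abs(bj - xj) * factor + bj for xj, bj in zip(position, best)]
```

Because the new position is built from the distance to the current best solution, the operator searches the neighbourhood of that solution, which is why it serves as an exploitation step.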
Keywords: Bat algorithm, Whale Optimization Algorithm, Data clustering, Optimization -
Data clustering is an ideal way of working with a huge amount of data and looking for structure in a dataset. In other words, clustering groups similar data: the similarity among the data within a cluster is maximal and the similarity among the data in different clusters is minimal. The innovation of this paper is a clustering method based on the Crow Search Algorithm (CSA) and Opposition-based Learning (OBL). CSA is a meta-heuristic algorithm that has difficulty in the exploration and exploitation stages, and as a result the clustering outcome is sensitive to the initialization of the cluster centers. In the proposed model, the position of the crows is updated using OBL to find the best position for the cluster centers. To evaluate the performance of the proposed model, experiments were performed on 8 datasets from the UCI repository and the results were compared with seven different clustering algorithms. The results show that the proposed model is more accurate, more efficient, and more robust than the other clustering algorithms. The convergence of the proposed model is also better than that of the other algorithms.
Keywords: Data clustering, Crow Search Algorithm, Opposition-based learning, centrality
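The core idea of opposition-based learning is to also evaluate the mirror image of a candidate within the search bounds. A minimal sketch, assuming box bounds per dimension and a fitness that is minimized (the function names are illustrative and not taken from the paper):

```python
def opposite(position, lower, upper):
    """Opposite point of a candidate inside the box [lower, upper]."""
    return [lo + hi - x for x, lo, hi in zip(position, lower, upper)]


def obl_update(position, lower, upper, fitness):
    """Keep whichever of the candidate and its opposite has the lower fitness."""
    candidate = opposite(position, lower, upper)
    return candidate if fitness(candidate) < fitness(position) else position
```

For example, in the box from 0 to 10 the opposite of `[1.0, 2.0]` is `[9.0, 8.0]`. A badly initialized cluster center therefore gets a second chance on the far side of the search space at the cost of one extra fitness evaluation.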
-
In this paper, a new method is proposed for solving the data clustering problem using the Cat Swarm Optimization (CSO) algorithm based on chaotic behavior. Data clustering is an important area of data mining that has long attracted the attention of researchers and practitioners because of its numerous applications to real-world problems. CSO is one of the more recent meta-heuristic algorithms; it has a simple structure and is easy to implement. The purpose of the Chaos embedded Cat Swarm Optimization (CCSO) algorithm is to replace random values with chaotic ones, giving a stable algorithm that is more likely to reach the global optimum and converges faster. The proposed algorithm has been compared with other heuristic algorithms on standard datasets from the UCI repository, and the experimental results demonstrate that it performs well on the data clustering problem.
Keywords: Data clustering, K-means, Cat Swarm Optimization, Chaos theory
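Replacing random values with chaotic ones can be illustrated with the logistic map, a common choice in chaos-embedded optimizers. The abstract does not say which chaotic map CCSO uses, so the map, the seed value and the function name below are assumptions.

```python
from itertools import islice


def logistic_map(x0=0.7, r=4.0):
    """Yield a chaotic sequence in [0, 1] as a stand-in for random.random().

    x0 must avoid 0, 0.25, 0.5, 0.75 and 1, which lead to fixed points when r = 4.
    """
    x = x0
    while True:
        x = r * x * (1.0 - x)
        yield x


# Usage: take five chaotic values where an optimizer would draw random numbers.
values = list(islice(logistic_map(0.3), 5))
```

The sequence is deterministic for a given `x0`, so runs are repeatable, while nearby seeds diverge quickly and the values still spread over the whole interval.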
-
Journal of Advances in Computer Engineering and Technology, Volume:5 Issue: 2, Spring 2019, PP 93 -106
Data clustering is the process of partitioning a set of data objects into meaningful clusters or groups. Because clustering algorithms are used in so many fields, a great deal of research is still under way to find the best and most efficient clustering algorithm. K-means is simple and easy to implement, but it is sensitive to the initialization of the cluster centers and hence gets trapped in local optima. In this paper, a new hybrid data clustering approach named K-MKH, which combines the modified krill herd and K-means algorithms, is proposed. K-MKH exploits the quick convergence of K-means, the efficient global exploration of Krill Herd, and the randomness of the Levy flight method. The Krill Herd algorithm is modified by incorporating Levy flight into it to improve global exploration. The proposed algorithm is tested on artificial and real-life datasets. The simulation results are compared with other methods such as K-means, Particle Swarm Optimization (PSO), the original Krill Herd (KH), and hybrid K-means and KH. The proposed algorithm is also compared with other evolutionary algorithms such as hybrid modified cohort intelligence and K-means (K-MCI), Simulated Annealing (SA), Ant Colony Optimization (ACO), Genetic Algorithm (GA), Tabu Search (TS), Honey Bee Mating Optimization (HBMO) and K-means++. The comparison shows that the proposed algorithm improves the clustering results and has a high convergence speed.
Keywords: Data clustering, Krill Herd, Levy-flight distribution, K-means, Convergence rate
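A Levy flight step is often generated with Mantegna's algorithm, sketched below. The abstract does not state how K-MKH draws its Levy steps, so the method, the stability index `beta = 1.5` and the function names are assumptions.

```python
import math
import random


def mantegna_sigma(beta=1.5):
    """Standard deviation of the numerator Gaussian in Mantegna's algorithm."""
    num = math.gamma(1.0 + beta) * math.sin(math.pi * beta / 2.0)
    den = math.gamma((1.0 + beta) / 2.0) * beta * 2.0 ** ((beta - 1.0) / 2.0)
    return (num / den) ** (1.0 / beta)


def levy_step(beta=1.5, rng=random):
    """Draw one heavy-tailed step: mostly small moves with occasional long jumps."""
    u = rng.gauss(0.0, mantegna_sigma(beta))
    v = rng.gauss(0.0, 1.0)
    return u / abs(v) ** (1.0 / beta)
```

The occasional long jumps are what help a population leave a local optimum, while the fast local refinement comes from K-means.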
-
In recent years, increased competition among banks has driven many developments in banking services and technology, while also leading to more customer churn as customers seek the best services. It is therefore highly important for banks to identify churning customers and win them back. To tackle this issue, this paper proposes a novel personalized collaborative filtering recommendation approach combined with user clustering. In the proposed approach, the loyal customers are first clustered by a hybrid algorithm based on Particle Swarm Optimization (PSO) and K-means. The clusters of loyal customers are then used to identify the features of the churning customers. Finally, a list of appropriate banking services is recommended to the churning customers by a collaborative filtering recommendation system, which uses the information of loyal customers to offer appropriate services. The proposed approach was successfully applied to win back the churning customers of an Iranian bank.
Keywords: Customer churn, data clustering, recommender system, collaborative filtering, particle swarm optimization
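The recommendation step can be illustrated with a small user-based collaborative filtering sketch. It assumes binary service-usage vectors and cosine similarity, and it takes the loyal customers of the matching cluster as the neighbours; none of these details are given in the abstract, and the function names are illustrative.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length usage vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def recommend(target, neighbours, top_n=2):
    """Rank the services the target does not use by similarity-weighted neighbour usage."""
    scores = [0.0] * len(target)
    for other in neighbours:
        weight = cosine(target, other)
        for j, used in enumerate(other):
            if used and not target[j]:
                scores[j] += weight
    ranked = sorted((j for j, s in enumerate(scores) if s > 0.0),
                    key=lambda j: -scores[j])
    return ranked[:top_n]
```

For a churning customer with usage `[1, 1, 0, 0]` and loyal neighbours `[1, 1, 1, 0]` and `[1, 0, 0, 1]`, service index 2 is ranked ahead of index 3 because the first neighbour is more similar to the target.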
-
In this paper, the problem of de-noising an image contaminated with Additive White Gaussian Noise (AWGN) is studied. This has been an open problem in signal processing for more than 50 years. Local methods suggested in recent years have obtained better results than global methods. However, with more intelligent training, a dictionary suitable for image de-noising can be designed that obtains results close to those of state-of-the-art local methods. This requires, first, that important data have more influence on training and, second, that clustering be done in such a way that the training blocks lie in low-rank subspaces. In the present paper, we suggest a method based on global clustering of the blocks that make up the image. As the type of clustering plays an important role in clustering-based de-noising methods, we address two questions about the clustering. First, which parts of the data should be considered for clustering? Second, which data clustering method is suitable for de-noising? Clustering is then exploited to learn an overcomplete dictionary. The de-noised image is obtained from a sparse decomposition of the noisy image blocks in terms of the dictionary atoms. In addition to our framework, 7 popular dictionary learning methods are simulated and compared. The results are compared on two major factors: (1) de-noising performance and (2) execution time. Experimental results show that our dictionary learning framework outperforms its competitors on both factors.
Keywords: Image De-Noising, Data Clustering, Dictionary Learning, Histogram Equalization, Sparse Representation