Presenting a Method based on Genetic Algorithm for finding the most Stable Clusters in Ensemble Clustering

Author(s):

Navid Samimi, Samad Nejatian, Hamid Parvin*, Karamolah Bagheri Fard, Vahideh Rezaei

Article Type:

Research/Original Article (holds an accredited ranking)

Abstract:

Clustering is a fundamental tool in data analysis and data mining: by grouping data according to intrinsic similarity, it extracts hidden, meaningful structure from large datasets. Selecting good clusters with conventional clustering algorithms is difficult, however, especially when clusters are dense or heterogeneous. This study proposes a genetic algorithm-based method for identifying the most stable clusters in ensemble clustering. By combining cluster stability criteria with a correlation matrix, the approach improves the accuracy and stability of the final clustering. The method first generates initial partitions of the data using six different clustering algorithms. The Fisher criterion is then applied to identify the more stable clusters. A genetic algorithm evaluates and optimizes these selected clusters to construct an optimized correlation matrix, which is fed into a hierarchical clustering algorithm to produce the final consensus clustering. The method was tested on standard datasets, where NMI and ARI improved by 12% and 5%, respectively, over previous methods. The genetic algorithm identified clusters with higher stability and diversity, reducing the impact of noise and increasing the accuracy of the final clustering, and the method also outperformed the individual base clustering algorithms. Because it improves both accuracy and stability, the method has potential applications in domains such as big data analysis, machine learning, and information retrieval. Its main strengths are the use of the Fisher criterion to select stable clusters and of a genetic algorithm for optimization, which together preserve diversity among clusters while significantly improving clustering accuracy. Future studies could combine this approach with more advanced algorithms to assess its applicability to more complex datasets.
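The final stage described in the abstract (a pairwise matrix fed into hierarchical clustering) can be illustrated with a minimal sketch. It assumes the "correlation matrix" is the usual co-association matrix of ensemble clustering, that is, the fraction of base partitions in which two points share a cluster. The Fisher-criterion selection and the genetic algorithm are omitted, the function names `co_association` and `average_linkage` are hypothetical, and the three toy partitions are invented for illustration rather than taken from the paper.

```python
from itertools import combinations


def co_association(partitions):
    """Fraction of base partitions in which each pair of points shares a cluster."""
    n = len(partitions[0])
    counts = [[0.0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1.0
    k = len(partitions)
    return [[v / k for v in row] for row in counts]


def average_linkage(sim, n_clusters):
    """Naive agglomerative clustering on a similarity matrix (average linkage)."""
    clusters = [[i] for i in range(len(sim))]
    while len(clusters) > n_clusters:
        best, best_pair = -1.0, None
        # Find the pair of clusters with the highest mean pairwise similarity.
        for a, b in combinations(range(len(clusters)), 2):
            total = sum(sim[i][j] for i in clusters[a] for j in clusters[b])
            mean = total / (len(clusters[a]) * len(clusters[b]))
            if mean > best:
                best, best_pair = mean, (a, b)
        a, b = best_pair
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    # Convert the list of member lists into one label per point.
    labels = [0] * len(sim)
    for c, members in enumerate(clusters):
        for i in members:
            labels[i] = c
    return labels


# Three hypothetical base partitions of six points (toy data, not from the paper).
partitions = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],
]
sim = co_association(partitions)
consensus = average_linkage(sim, n_clusters=2)
print(consensus)
```

Although the second base partition places point 2 with points 3 to 5, the consensus groups it with points 0 and 1, since two of the three partitions agree on that grouping. In the method the abstract describes, the matrix would instead be built from the clusters retained by the Fisher criterion and optimized by the genetic algorithm before this step.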

Keywords:

Ensemble Clustering, Cluster Stability, Fisher Criterion, Correlation Matrix, Genetic Algorithm

Language:

Persian

Published:

Signal and Data Processing, Volume 21, Issue 3, 2024

Pages:

111 to 136

https://www.magiran.com/p2820534

Accredited scientific journal

Signal and Data Processing

Engineering and technology quarterly

ISSN: 2538-4201 eISSN: 2538-421X

License holder:

Khajeh Nasir al-Din Toosi Research Institute for Advanced Technologies Development

Managing director:

Dr. Javad Sheikhzadegan

Editor-in-chief:

Dr. Mohammad Hassan Ghassemian

Journal phone: 021-83857605
