Articles related to the keyword

ensemble clustering

in electrical engineering journals


A Genetic-Algorithm-Based Method for Finding the Most Stable Clusters in Ensemble Clustering

Navid Samimi, Samad Nejatian, Hamid Parvin*, Karamolah Bagheri Fard, Vahideh Rezaei

Signal and Data Processing (quarterly), Year 21, No. 3 (Serial No. 61, Autumn 1403), pp. 111-136

Clustering plays a vital role in information retrieval methods for organizing large collections into a small number of meaningful clusters. One of the main motivations for using clustering is to determine and reveal the intrinsic, hidden structure of a dataset. Human users, because of differing tastes and ways of thinking, are unable to discover the intrinsic structure of a large text dataset. Ensemble clustering algorithms combine several clustering algorithms to arrive at a single overall clustering system. Ensemble clustering methods seek better solutions by extracting information from several initial partitions of the data. Because different clustering algorithms look at different aspects of the data, they can produce different partitions of the same data; by combining the partitions obtained from different algorithms, a high-quality partition can be produced even when the clusters are very dense. In this paper, a new method is introduced that, instead of using all the generated initial clusters, uses the most stable ones among those produced by six different methods. To select the more stable clusters, a consensus function based on the correlation matrix is used. The more stable clusters are selected according to a cluster-stability measure based on the Fisher criterion; the resulting clusters are then evaluated by a genetic algorithm, which selects the most stable clusters. Finally, the correlation matrix obtained from the consensus of the optimal clusters is treated as a similarity matrix. A hierarchical clustering algorithm serves as the final aggregator: it takes the resulting correlation matrix as input and returns the final consensus clustering. Experimental results on several datasets show that the proposed method produces diverse, highly stable clusters. Specifically, the method achieves notable improvements of 12% and 5% in NMI and ARI, respectively, over the best previous methods. This indicates the superiority of the proposed ensemble clustering method based on cluster stability and genetic algorithms.

Keywords: Ensemble Clustering, Cluster Stability, Fisher Criterion, Correlation Matrix, Genetic Algorithm

Article type: Research/Original Article. Language: Persian

Presenting a Method based on Genetic Algorithm for finding the most Stable Clusters in Ensemble Clustering

Navid Samimi, Samad Nejatian, Hamid Parvin*, Karamolah Bagheri Fard, Vahideh Rezaei

Signal and Data Processing, Volume:21 Issue: 3, 2024, PP 111 -136

Clustering is one of the fundamental tools in data analysis and data mining, enabling the extraction of hidden and meaningful structures from large datasets by grouping data based on intrinsic similarities. However, selecting optimal clusters in conventional clustering algorithms poses challenges, especially when clusters are dense or heterogeneous. In this study, a novel genetic algorithm-based method is proposed to identify the most stable clusters in ensemble clustering. By leveraging cluster stability criteria and a correlation matrix, the proposed approach improves the accuracy and stability of the final clustering results. The proposed method involves generating initial partitions of the data using six different clustering algorithms. Next, the Fisher criterion is applied to identify more stable clusters. These selected clusters are then evaluated and optimized using a genetic algorithm to construct an optimized correlation matrix. This matrix is subsequently fed into a hierarchical clustering algorithm, which produces the final consensus clustering. The proposed method was tested on standard datasets. Results demonstrated improvements of 12% and 5% in NMI and ARI metrics, respectively, compared to previous methods. The use of a genetic algorithm enabled the identification of clusters with higher stability and diversity, reducing the impact of noise and increasing the accuracy of the final clustering. Moreover, the method outperformed individual base clustering algorithms in providing more precise clustering results. Due to its ability to enhance the accuracy and stability of clustering, the proposed method holds potential for applications in domains such as big data analysis, machine learning, and information retrieval. The use of the Fisher criterion for selecting stable clusters and genetic algorithms for optimization are among the strengths of this research. 
This method not only preserves diversity among clusters but also significantly enhances clustering accuracy. Future studies could explore the combination of this approach with more advanced algorithms to assess its applicability to more complex datasets.
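
The final consensus step described in this abstract (a pairwise matrix built from the base partitions, then a hierarchical algorithm applied to it) can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the "correlation matrix" is the usual co-association matrix, it uses every base cluster (the Fisher-criterion and genetic-algorithm selection are omitted), and the function names and toy partitions are hypothetical.

```python
from itertools import combinations


def co_association(partitions):
    """Fraction of base partitions that place each pair of points together."""
    n = len(partitions[0])
    counts = [[0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return [[c / len(partitions) for c in row] for row in counts]


def hierarchical_consensus(similarity, k):
    """Average-linkage agglomeration on a similarity matrix until k clusters remain."""
    clusters = [[i] for i in range(len(similarity))]

    def average_similarity(pair):
        a, b = clusters[pair[0]], clusters[pair[1]]
        return sum(similarity[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > k:
        # Merge the two clusters with the highest average pairwise similarity.
        a, b = max(combinations(range(len(clusters)), 2), key=average_similarity)
        clusters[a].extend(clusters[b])
        del clusters[b]

    labels = [0] * len(similarity)
    for cluster_id, members in enumerate(clusters):
        for i in members:
            labels[i] = cluster_id
    return labels


# Three hypothetical base partitions of six data points.
base_partitions = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1, 1],
    [0, 0, 0, 0, 1, 1],
]
matrix = co_association(base_partitions)
consensus = hierarchical_consensus(matrix, k=2)
print(consensus)
```

In the paper's pipeline the matrix would be built only from the clusters that survive the stability-based selection; the sketch keeps all of them so that the aggregation step can be read in isolation.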

Keywords: Ensemble Clustering, Cluster Stability, Fisher Criterion, Correlation Matrix, Genetic Algorithm

Article type: Research/Original Article. Original language: Persian

A New Approach for Detecting Attacks on Voice over Internet Protocol Based on Ensemble Clustering

Farid Bavifard, Mohammad Kheyrandish*, Mohammad Mosleh

Journal of Intelligent Procedures in Electrical Technology, Serial No. 62 (Summer 1404), pp. 45-66

Because of its lower cost and greater flexibility, Voice over Internet Protocol (VoIP) is widely used in telecommunications. The variety of VoIP terminals makes them vulnerable. A common way to secure VoIP is machine-learning-based intrusion detection. Given the diversity of traffic and the frequent lack of class labels for training intrusion detection systems (IDSs), attention has focused on clustering (unsupervised learning) approaches. However, individual clustering systems cannot cover the diversity of feature values well, and some traffic samples may be identified as outliers. The proposed model, as an ensemble approach to these problems, focuses on the TwoStep clustering algorithm and, by improving it, tries to improve clustering-based intrusion detection. In addition, given the importance of feature selection, a combination of the simulated annealing (SA) algorithm and a multi-layer perceptron (MLP) neural network is used to identify the best features for clustering VoIP packets as normal packets or as denial-of-service (DoS), user-to-root (U2R), remote-to-local (R2L), or probe attacks. According to evaluation results on the "Network Security Lab - Knowledge Discovery in Databases" (NSL-KDD) dataset, obtained with MATLAB, the proposed feature selection, by reducing the features to 10 and 8, reduces training and testing times by 77% and 80%, respectively. Moreover, compared with a number of previous studies, the proposed IDS shows average improvements of 3.34%, 14.17%, and 32.87% in accuracy, detection rate, and F-measure, respectively.

Keywords: Optimization Algorithm, Feature Selection, Multi-Layer Perceptron, Ensemble Clustering, Intrusion Detection System, Simulated Annealing

Article type: Research/Original Article. Language: Persian

Presenting a New Approach for Detecting Attacks on Voice over Internet Protocol Based on Ensemble Clustering

Farid Bavifard, Mohammad Kheyrandish *, Mohammad Mosleh

Journal of Intelligent Procedures in Electrical Technology, Volume:16 Issue: 62, Summer 2025, PP 45 -66

Due to its lower cost and greater flexibility, voice over internet protocol (VoIP) is widely used in telecommunications. The variety of VoIP terminals makes them vulnerable. A common way to secure VoIP is intrusion detection based on machine learning. Because of the diversity of traffic and the lack of class labels for training intrusion detection systems (IDSs) in many situations, attention has focused on clustering approaches (unsupervised learning). However, individual clustering systems cannot cover the diversity of feature values well, and some traffic samples may be identified as outliers. As an ensemble approach to these problems, the proposed model focuses on the TwoStep clustering algorithm and, by improving it, tries to improve clustering-based intrusion detection. Moreover, given the importance of the feature selection process, a combination of the Simulated Annealing (SA) algorithm and a Multi-Layer Perceptron (MLP) is used to identify the best features for clustering VoIP packets as Normal or as DoS, R2L, U2R, or Probe attacks. Based on evaluation results obtained on the "Network Security Lab - Knowledge Discovery in Databases" (NSL-KDD) dataset with MATLAB, the proposed feature selection reduced the training and testing times by 77% and 80% on average, respectively, by reducing the features to 10 and 8. Also, compared to previous works, the proposed IDS shows average improvements in Accuracy, Detection Rate, and F-Measure of 3.34%, 14.17%, and 32.87%, respectively.
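
The feature selection idea in this abstract, simulated annealing searching over feature subsets with a learner scoring each subset, can be sketched as below. This is only an illustration under stated assumptions: the abstract does not give the neighbourhood move, cooling schedule, or scoring details, so the swap move, the geometric cooling, and the `toy_score` function (a stand-in for an MLP's validation accuracy) are all hypothetical.

```python
import math
import random


def anneal_feature_subset(n_features, subset_size, score,
                          steps=500, t0=1.0, cooling=0.99, seed=0):
    """Simulated annealing over fixed-size feature subsets (higher score is better)."""
    rng = random.Random(seed)
    current = set(rng.sample(range(n_features), subset_size))
    current_score = score(current)
    best, best_score = set(current), current_score
    temperature = t0
    for _ in range(steps):
        # Neighbour move (an assumption): swap one selected feature for an unselected one.
        removed = rng.choice(sorted(current))
        added = rng.choice(sorted(set(range(n_features)) - current))
        candidate = (current - {removed}) | {added}
        candidate_score = score(candidate)
        delta = candidate_score - current_score
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta >= 0 or rng.random() < math.exp(delta / temperature):
            current, current_score = candidate, candidate_score
            if current_score > best_score:
                best, best_score = set(current), current_score
        temperature *= cooling
    return best, best_score


# Hypothetical stand-in for "train an MLP on these features and return its accuracy".
INFORMATIVE = {1, 4, 7}


def toy_score(subset):
    return len(subset & INFORMATIVE) / len(INFORMATIVE)


best_subset, best_score = anneal_feature_subset(n_features=12, subset_size=3, score=toy_score)
print(sorted(best_subset), best_score)
```

In a real setting the scoring function dominates the cost, since every candidate subset requires training and validating a model, which is one reason a reduced feature set also shortens later training and testing.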

Keywords: Ensemble Clustering, Feature Selection, Intrusion Detection System, Multi-Layer Perceptron, Optimization Algorithm, Simulated Annealing

Article type: Research/Original Article. Original language: Persian

A Novel Combination of Segmentation, Ensemble Clustering and Genetic Algorithm for Clustering Time Series

Zahra Ghorbani, Ali Ghorbanian*

Journal of Artificial Intelligence and Data Mining, Year 12, No. 2 (Spring 2024), pp. 273-286

Article type: Research/Original Article. Language: English

A Novel Combination of Segmentation, Ensemble Clustering and Genetic Algorithm for Clustering Time Series

Zahra Ghorbani, Ali Ghorbanian *

Journal of Artificial Intelligence and Data Mining, Volume:12 Issue: 2, Spring 2024, PP 273 -286

Increasing the accuracy of time-series clustering while reducing execution time is a primary challenge in the field. Researchers have recently applied approaches such as the development of distance metrics and dimensionality reduction to address this challenge. However, using segmentation and ensemble clustering for this purpose has received less attention in previous research. In this study, an algorithm based on selecting and combining the best segments created from a time-series dataset was developed. In the first step, the dataset was divided into segments of equal length. In the second step, each segment was clustered using a hierarchical clustering algorithm. In the third step, a genetic algorithm selected different segments and combined them using combinatorial clustering. The clustering resulting from the selected segments was taken as the final clustering of the dataset. At this stage, an internal clustering criterion evaluated and ranked the solutions produced. The proposed algorithm was executed on 82 different datasets in 10 repetitions. The results indicated an increase in clustering efficiency of 3.07%, reaching a value of 67.40. The results were also analyzed by time-series length and dataset type, and were compared, using statistical tests, with six algorithms from the literature.
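
The first step (equal-length segmentation) and the way a genetic algorithm might encode a choice of segments can be sketched as follows. The abstract does not specify the encoding, so the binary mask used as a chromosome, the handling of leftover points, and both function names are assumptions made for illustration.

```python
def split_into_segments(series, n_segments):
    """Cut one time series into n_segments pieces of equal length.

    Trailing points that do not fill a whole segment are dropped (an assumption).
    """
    segment_length = len(series) // n_segments
    return [series[i * segment_length:(i + 1) * segment_length]
            for i in range(n_segments)]


def select_segments(segments, mask):
    """Keep the segments whose mask bit is 1; the mask plays the role of a chromosome."""
    return [segment for segment, bit in zip(segments, mask) if bit]


series = [1, 2, 3, 4, 5, 6, 7, 8]
segments = split_into_segments(series, n_segments=4)
chosen = select_segments(segments, mask=[1, 0, 0, 1])
print(segments)
print(chosen)
```

Each selected segment would then carry its own hierarchical clustering, and the clusterings of the selected segments would be combined into one result whose quality an internal criterion scores as the fitness value.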

Keywords: Time-Series Clustering, Ensemble Clustering, Segmentation, Genetic Algorithm

Article type: Research/Original Article. Original language: English

A Graph-based Online Feature Selection to Improve Detection of New Attacks

Hajar Dastanpour, Ali Fanian*

International Journal of Information Security, Volume:14 Issue: 2, Jul 2022, PP 115 -130

Today, intrusion detection systems are used in networks as one of the essential means of detecting new attacks. These systems usually deal with large volumes of data and many features. Selecting proper features and reusing previously learned knowledge therefore helps detect new attacks efficiently. This article proposes a new graph-based method for online feature selection to increase attack-detection accuracy. In the proposed method, irrelevant features are first removed using a limited number of input instances. Then, features are clustered based on graph theory to reduce the search space. As new instances arrive at each stage, new feature clusters are created that may differ from the clusters created in the previous step. To find the appropriate clusters, the two sets of clusters are therefore combined to select relevant features with minimum redundancy. The evaluation results show that the proposed method classifies instances better, and with a shorter run time, than similar online feature selection methods. It is also faster than some offline methods while retaining suitable instance-classification accuracy.

Keywords: Classification, Clustering, Ensemble Clustering, Intrusion Detection System, Online Feature Selection

Article type: Research/Original Article. Original language: English

A New Ensemble Clustering Method Based on the Fuzzy Cmeans Clusterer While Preserving Diversity in the Ensemble

Fatemeh Najafi, Hamid Parvin*, Kamal Mirzaei, Samad Nejatiyan, Seyede Vahideh Rezaie

Signal and Data Processing (quarterly), Year 17, No. 4 (Serial No. 46, Winter 1399), pp. 103-122

Because clustering is unsupervised, choosing one particular algorithm to cluster an unknown dataset is risky and usually fails. Owing to the complexity of the problem and the weaknesses of basic clustering methods, most studies today have turned to ensemble clustering methods. In ensemble clustering, several base clusterings are first generated; then, to aggregate them, a consensus function is used to create a final clustering that is maximally similar to the base clusterings. The resulting consensus clustering should reflect the greatest possible consensus and agreement. The input of this function is all the base clusterings, and its output is a clustering called the consensus clustering. In fact, ensemble clustering methods rest on the slogan that a combination of several weak models is better than one strong model. This claim holds, however, only if certain conditions are met, such as diversity among the ensemble members and their quality. This paper presents an ensemble clustering method that uses the weak fuzzy cmeans clustering method as its base clusterer. It also takes measures to increase the diversity of the ensemble. The proposed ensemble clustering method retains the advantage of the fuzzy cmeans clustering algorithm, namely its speed, without its major weakness, the inability to discover non-spherical and non-uniform clusters. In the experimental study, the proposed ensemble clustering algorithm is tested on various datasets and compared with other up-to-date, strong clustering algorithms. The experimental results indicate that the proposed method performs better than the other up-to-date, strong clustering algorithms.

Keywords: Ensemble Learning, Ensemble Clustering, Fuzzy Cmeans Clustering Algorithm, Data Validity

Article type: Research/Original Article. Language: Persian

A new ensemble clustering method based on fuzzy cmeans clustering while maintaining diversity in ensemble

Fatemeh Najafi, Hamid Parvin*, Kamal Mirzaei, Samad Nejatiyan, Seyede Vahideh Rezaie

Signal and Data Processing, Volume:17 Issue: 4, 2021, PP 103 -122

Ensemble clustering has been one of the active research directions in data mining, pattern recognition, machine learning, and artificial intelligence over the last decade. In ensemble clustering, several base clusterings are first produced; then, to aggregate them, a function is used to create a final clustering that is as similar as possible to all of the base clusterings. The input of this function is all the base clusterings, and its output is a clustering called the consensus clustering. This function is called the consensus function. Ensemble clustering has been proposed to increase the efficiency, robustness, reliability, and stability of clustering. Because clustering is unsupervised, and because general-purpose base clustering algorithms are inadequate, ensemble clustering attempts to find a consensus clustering with the highest consensus and agreement. In fact, ensemble clustering techniques rest on the slogan that a combination of several weaker models is better than one strong model. However, this claim is correct only if certain conditions (such as diversity among the ensemble members and their quality) are met. This article presents an ensemble clustering method that uses the weak fuzzy cmeans clustering method as the base clusterer. It also adopts some measures to increase the diversity of the ensemble. The proposed ensemble clustering method keeps the benefit of the fuzzy cmeans clustering algorithm, its speed, while avoiding its major weakness, the inability to detect non-spherical and non-uniform clusters. In the experiments, the proposed ensemble clustering algorithm is tested against different up-to-date and robust clustering algorithms on different datasets. The experimental results indicate the superiority of the proposed ensemble clustering method over the other up-to-date and strong clustering algorithms.

Keywords: Ensemble Learning, Ensemble Clustering, Fuzzy Cmeans Clustering Algorithm, Data Validity

Article type: Research/Original Article. Original language: Persian

High-Dimensional Unsupervised Active Learning Method

Vahid Ghasemi*, Mohammad Javadian, Saeed Bagheri Shouraki

Journal of Artificial Intelligence and Data Mining, Year 8, No. 3 (Summer 2020), pp. 391-407

Article type: Research/Original Article. Language: English

High-Dimensional Unsupervised Active Learning Method

V. Ghasemi *, M. Javadian, S. Bagheri Shouraki

Journal of Artificial Intelligence and Data Mining, Volume:8 Issue: 3, Summer 2020, PP 391 -407

In this work, a hierarchical ensemble projected clustering algorithm for high-dimensional data is proposed. The basic concept of the algorithm is based on the active learning method (ALM), a fuzzy learning scheme inspired by some behavioral features of human brain functionality. The high-dimensional unsupervised active learning method (HUALM) is a clustering algorithm that blurs the data points as one-dimensional ink drop patterns in order to summarize the effects of all data points, and then applies a threshold to the resulting vectors. It is based on an ensemble clustering method that performs one-dimensional density partitioning to produce an ensemble of clustering solutions. It then assigns a unique prime number, as a label, to the data points that lie in each partition. The solutions are combined by multiplying the labels of every data point to produce its absolute label. Data points with identical absolute labels fall into the same cluster. The hierarchical property of the algorithm is intended to cluster complex data by zooming in on each already formed cluster to find further sub-clusters. The algorithm is verified using several synthetic and real-world datasets. The results show that the proposed method has promising performance compared to some well-known high-dimensional data clustering algorithms.
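
The prime-label combination described above can be illustrated with a short sketch. It assumes that each one-dimensional partitioning is given as a list of integer cluster labels per data point and that primes are handed out in ascending order; the ink-drop density partitioning that would produce those labels is not shown, and the function names are hypothetical.

```python
def first_primes(count):
    """Return the first `count` primes by trial division."""
    primes = []
    candidate = 2
    while len(primes) < count:
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes


def absolute_labels(partitions):
    """Multiply, for each point, the primes of the partitions it falls into."""
    # Every (dimension, partition label) pair receives its own prime.
    keys = sorted({(d, label) for d, labels in enumerate(partitions) for label in labels})
    prime_of = dict(zip(keys, first_primes(len(keys))))
    products = []
    for i in range(len(partitions[0])):
        product = 1
        for d, labels in enumerate(partitions):
            product *= prime_of[(d, labels[i])]
        products.append(product)
    return products


# Two hypothetical one-dimensional partitionings of four data points.
per_dimension = [
    [0, 0, 1, 1],  # partition labels along dimension 0
    [0, 1, 0, 0],  # partition labels along dimension 1
]
labels = absolute_labels(per_dimension)
print(labels)  # [10, 14, 15, 15]
```

Because prime factorizations are unique, two points share an absolute label exactly when they fall into the same partition along every dimension, which is what lets a single integer stand in for the whole tuple of one-dimensional labels.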

Keywords: Ensemble Clustering, High Dimensional Clustering, Hierarchical Clustering, Unsupervised Active Learning Method

Article type: Research/Original Article. Original language: English

Selecting Ensemble Members in Ensemble Clustering Using Voting

Alireza Latifi Pakdehi, Negin Daneshpour*

Signal and Data Processing (quarterly), Year 15, No. 4 (Serial No. 38, Winter 1397), pp. 17-30

Ensemble clustering combines the results of existing clusterings. Research over the last decade shows that if, instead of combining all clusterings, only a subset of them is selected on the basis of quality and diversity, the output of ensemble clustering is much more accurate. This paper presents a new method for selecting clusterings according to the two criteria of quality and diversity. To this end, different clusterings are first generated with the k-means algorithm, where k is a random number in each run. The clusterings generated this way are then grouped by a new algorithm that works on the similarity between clusterings, so that clusterings that resemble one another fall into the same group. Next, from each group, the highest-quality member is selected by a voting-based method for building the ensemble clustering. In this paper, three functions, HGPA, CSPA, and MCLA, are used to combine the clusterings. Finally, real data from the UCI repository are used to test the new method. The results show that the new method is more efficient and more accurate than previous methods.

Keywords: Ensemble Clustering, Member Selection, Quality Evaluation Indices

Language: Persian

Cluster ensemble selection using voting

Alireza Latifi Pakdehi, Negin Daneshpour*

Signal and Data Processing, Volume:15 Issue: 4, 2019, PP 17 -30

Clustering is the process of dividing a dataset into subsets, called clusters, so that objects within a cluster are similar to each other and different from objects in other clusters. Many algorithms based on different approaches have been developed for clustering. An effective choice can be to combine two or more of these algorithms to solve the clustering problem. Ensemble clustering combines the results of existing clusterings to achieve better performance and higher accuracy. Research over the last decade shows that if, instead of combining all existing clusterings, only a subset is selected based on quality and diversity, the result of ensemble clustering is more accurate. This paper proposes a new method for ensemble clustering based on quality and diversity. For this purpose, many different base clusterings are first needed. They are generated by the k-means algorithm with a random k in each execution. The base clusterings are then put into groups according to their similarities using a new grouping method, so that clusterings that are similar to each other end up in the same group. In this step, normalized mutual information (NMI) or the adjusted Rand index (ARI) is used to compute similarities and dissimilarities between the base clusterings. Then, from each group, the best-qualified clustering is selected by a voting-based method. In this method, cluster validity indices are used to measure clustering quality: all members of the group are evaluated by the cluster validity indices, and the clustering that optimizes the largest number of indices is selected. Finally, consensus functions combine all the selected clusterings. A consensus function is an algorithm for combining existing clusterings to produce the final clusters. In this paper, three consensus functions, CSPA, MCLA, and HGPA, were used to combine the clusterings.
To evaluate the proposed method, real datasets from the UCI repository were used. In the experiments, the proposed method is compared with well-known and powerful existing methods. The experimental results demonstrate that the proposed algorithm has better performance and higher accuracy than previous works.
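
The similarity computation between base clusterings mentioned above can be sketched with normalized mutual information. The abstract does not say which normalization the authors use; this sketch assumes normalization by the arithmetic mean of the two entropies, and the function names are illustrative.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (natural log) of a label assignment."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def mutual_information(a, b):
    """Mutual information between two label assignments of the same points."""
    n = len(a)
    count_a, count_b, joint = Counter(a), Counter(b), Counter(zip(a, b))
    return sum((n_ij / n) * math.log(n_ij * n / (count_a[i] * count_b[j]))
               for (i, j), n_ij in joint.items())


def nmi(a, b):
    """NMI normalized by the mean of the two entropies (one common choice)."""
    h_a, h_b = entropy(a), entropy(b)
    if h_a == 0.0 and h_b == 0.0:
        return 1.0  # both clusterings put everything in a single cluster
    return 2.0 * mutual_information(a, b) / (h_a + h_b)


same_up_to_renaming = nmi([0, 0, 1, 1], [1, 1, 0, 0])
independent = nmi([0, 0, 1, 1], [0, 1, 0, 1])
print(same_up_to_renaming, independent)
```

Since NMI ignores how clusters are named, two k-means runs that find the same grouping with different label numbers still score as identical, which is the property needed when grouping similar base clusterings before the voting step.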

Keywords: Ensemble clustering, select member, validity index

Article type: Regular Article. Original language: Persian

Note

Results are sorted by publication date.
Your keyword was searched only in the keywords field of the articles. To exclude unrelated results, the search was limited to articles in journals that share a subject with the source journal.
To repeat the search across all subjects and with other conditions, go to the journals advanced search page.
