Articles related to the keyword

clustering

in journals of the Basic Sciences group
  • مهدی رحمانی جوینانی، بنفشه حبیبیان دهکردی*
    Data-driven deep learning approaches face the challenge of generating large amounts of high-quality data, together with the heavy computational load and long training time that this imposes. In addition, if randomly splitting the data into training, validation, and test sets does not yield the same statistical distribution in all three, the irregular behavior of the training and validation error curves prevents good generalization. In this study, an approach based on first clustering the data and then assigning a fixed percentage of each cluster to the three sets is used, and by scanning the prediction results, the minimum amount of data required for inversion with a deep learning approach is determined. Statistical tests show that data split in this way have the same distribution in the three sets. A deep learning model based on the U-Net architecture is trained for one-dimensional inversion of magnetotelluric data. For this purpose, a five-layer geoelectric model that simulates the conditions of a geothermal field is used. Network training is repeated with different amounts of data split by this method, and its performance is assessed with various quantitative and qualitative criteria. By scanning the inversion results for the same test data across models trained with different percentages of the data, the number of samples required to train the deep learning model, and therefore the training time, can be reduced by 50 percent without loss of network accuracy. For more complex, more realistic, and noisy data, random splitting is definitely not a suitable way to form the three sets. The more complex the conditions and the larger the number of features, the less suitable random splitting becomes, because the statistical distributions of the three sets differ more, so generalization decreases and the amount of data required increases. In such cases, clustering is a suitable way to equalize the statistical distributions of the three sets and reduce the amount of data.
    Keywords: Clustering, Magnetotelluric, Deep Learning, Inversion
    Mehdi Rahmani Jevinani, Banafsheh Habibian Dehkordi *
    Data-driven deep learning approaches must deal with the challenge of generating large amounts of high-quality data, as well as the heavy computational cost and long training time that this imposes. Because they can approximate complex nonlinear mapping functions, deep networks can be used effectively in geophysical inverse problems, and in many applications deeper networks generalize better. In this research, data are split by first clustering the training data and then assigning a fixed percentage of each cluster to the training, validation, and test sets. The Kolmogorov-Smirnov (KS) test was applied to compare the distributions of the three sets formed in this way, and it indicates that the training, validation, and test data have the same distribution. A deep learning model based on a modified U-Net architecture was trained for one-dimensional inversion of magnetotelluric (MT) data, which is a highly nonlinear regression problem. Supervised learning with error backpropagation is used, so the inputs and their corresponding outputs are given to the network as training samples. For this purpose, a five-layer geoelectric model was considered to simulate the conditions of a geothermal field. Using a magnetotelluric forward modeling algorithm, the responses of this one-dimensional geoelectric model were calculated analytically at 13 frequencies uniformly distributed on a logarithmic scale over the range 0.01-100 Hz, and a total of 500,000 samples were generated. The layer thicknesses are variable and are treated as part of the output. The input and output variables are scaled before training, and the network outputs are post-processed to return them to the original interval. The mean squared error (MSE) loss function and the Adam optimizer were used to train the network. Training was carried out with different amounts of data split by this method, and network performance was evaluated with quantitative and qualitative criteria, including boxplots of the Euclidean distance between true and predicted outputs and Nash-Sutcliffe efficiency coefficients. The trained network predicts the electrical resistivity and thickness of the layers from a new set of phase and apparent resistivity values. The results show that splitting the data in this way reduces the amount of training data required by at least 50% without reducing the accuracy of the trained network. For noisy data and more realistic scenarios, random splitting is definitely not a suitable way to form the training, validation, and test sets. Under these conditions, clustering is a suitable way to equalize the statistical distributions of the three sets and reduce the amount of data required.
    Keywords: Clustering, Deep Learning, Inversion, Magnetotelluric
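The cluster-based splitting described in this abstract can be illustrated with a minimal sketch. It assumes cluster labels are already available (for example from k-means); the function name `cluster_stratified_split`, the 70/15/15 fractions, and the synthetic labels are illustrative assumptions, not the authors' settings.

```python
import random
from collections import defaultdict

def cluster_stratified_split(labels, fractions=(0.7, 0.15, 0.15), seed=0):
    """Split sample indices into train/validation/test sets so that each
    cluster contributes about the same fraction to every set."""
    rng = random.Random(seed)
    by_cluster = defaultdict(list)
    for index, label in enumerate(labels):
        by_cluster[label].append(index)

    train, val, test = [], [], []
    for members in by_cluster.values():
        rng.shuffle(members)  # random choice inside a cluster, fixed share per cluster
        n_train = round(fractions[0] * len(members))
        n_val = round(fractions[1] * len(members))
        train += members[:n_train]
        val += members[n_train:n_train + n_val]
        test += members[n_train + n_val:]  # remainder of the cluster
    return train, val, test

# Hypothetical cluster labels for 200 samples in three clusters.
labels = [0] * 100 + [1] * 60 + [2] * 40
train, val, test = cluster_stratified_split(labels)
print(len(train), len(val), len(test))  # 140 30 30
```

The three index sets could then be compared with a two-sample Kolmogorov-Smirnov test (for example `scipy.stats.ks_2samp`) on each feature, in the spirit of the check the abstract reports.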
  • مریم آق آتابای*، فاطمه منچر
    Two earthquakes with moment magnitude (Mw) greater than 6 occurred on 11 Tir 1401 (Solar Hijri calendar) west of Bandar-e Khamir in Hormozgan Province, southern Iran. The epicenters of these earthquakes lie on a dense seismicity zone with a northeast-southwest trend at the southeastern end of the Zagros fold-thrust belt in southern Iran. In this study, temporal changes in the fractal parameters of seismicity, including the b-value and the fractal dimensions of the epicenters and of the earthquake occurrence times, were examined for the period before these earthquakes. The data were taken from the Iranian Seismological Center for a circular area centered on the epicenter of the first earthquake with a radius of 60 km, over a 3.5-year period (from the beginning of 2019 to the time of the earthquakes). According to the results, until 2021 (1.5 years before the target earthquakes) the fractal parameters show changes consistent with the numerous seismic clusters of that period. Between 2021 and 2022, the b-value and Dt (after a significant increase) and De remained relatively stable. All three seismicity parameters then showed a similar pattern (an increase) over a period of several months before the main earthquakes. The monthly seismicity rate and magnitude-time plots showed that from early 2021 until the main earthquakes west of Bandar-e Khamir, the seismicity rate was very low and earthquakes occurred sparsely in time. However, the space-time plot shows that several months before the main earthquakes, a period of seismic quiescence prevailed around the main earthquake while seismicity continued in the north of the region. It appears that the significant changes in the fractal parameters of seismicity about several months before the main earthquake are due to the seismic quiescence in the region, and they can be introduced as a medium-term precursor of the 2022 earthquakes west of Bandar-e Khamir, Hormozgan Province.
    Keywords: Precursor, Seismicity Pattern, Clustering, Transition Zone, Zagros
    Maryam Agh-Atabai *, Fatemeh Manchar
    One goal of seismicity pattern studies is to find precursory patterns before large earthquakes in order to predict them. Foreshocks, the doughnut pattern, and seismic quiescence are seismic patterns that can serve as short-term, medium-term, and long-term predictors before an earthquake (Mogi, 1985; Scholz, 1988). Researchers have shown that seismic parameters show significant changes before earthquakes occur (Bayrak et al., 2017). Investigating temporal changes in the fractal parameters of seismicity is one way to find seismic patterns in the periods before large earthquakes. In this research, the seismicity pattern before the 2022 Hormozgan (west of Bandar-e Khamir) earthquakes was investigated using fractal methods. These earthquakes, with moment magnitudes (Mw) greater than 6, occurred on a dense seismic zone with a northeast-southwest trend at the southeastern end of the Zagros fold-thrust belt. To investigate the precursory pattern, temporal changes in the fractal parameters of seismicity, including the b-value, the fractal dimension of earthquake epicenters (De), and the fractal dimension of earthquake occurrence times (Dt), were studied over a 3.5-year period before the 2022 Hormozgan earthquakes. The correlation integral method was used to calculate the spatial and temporal fractal dimensions (Grassberger and Procaccia, 1983). The data used in this research (a circular area with a radius of 60 km centered on the epicenter of the first event) were extracted from the Iranian Seismological Center (IRSC). The magnitude of completeness, Mc, was estimated as 2.9 from the frequency-magnitude curve, so earthquakes smaller than 2.9 were excluded from the catalogue for the subsequent fractal calculations. A fixed-window method with a window length of one year and a step of three months was used to investigate temporal changes in the seismicity pattern. For each window, the b-value, De, and Dt were calculated and their temporal changes were plotted. The results showed that until 2021 (1.5 years before the target earthquakes), the fractal parameters show changes corresponding to the numerous earthquake clusters in this period. Between 2021 and 2022, the b-value and Dt were relatively stable after a significant increase. All three seismic parameters then showed a similar pattern (an increase) over a period of several months before the main events. The monthly seismicity rate histogram and the magnitude-time graph show that the seismicity rate from the beginning of 2021 until the 2022 Hormozgan earthquakes was very low and that earthquakes were scattered in time. However, the space-time diagram shows that several months before the main earthquakes there was a period of seismic quiescence around the main earthquake, while seismicity continued to the north of the epicenter. It seems that the significant temporal changes in the fractal parameters of seismicity before the main earthquakes are due to this seismic quiescence, which can be considered a medium-term precursor of the 2022 Hormozgan earthquakes.
    Keywords: Precursor, Seismicity Pattern, Clustering, Transition Zone, Zagros
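The correlation integral method cited in this abstract (Grassberger and Procaccia, 1983) estimates a fractal dimension as the slope of log C(r) against log r, where C(r) is the fraction of point pairs closer than r. The sketch below is a minimal illustration under stated assumptions: the function names, the Euclidean distance on flat coordinates, the chosen radii, and the synthetic points are all hypothetical and are not taken from the paper.

```python
import math

def correlation_integral(points, r):
    """Fraction of distinct point pairs separated by less than r."""
    n = len(points)
    close = sum(
        1
        for i in range(n)
        for j in range(i + 1, n)
        if math.dist(points[i], points[j]) < r
    )
    return 2.0 * close / (n * (n - 1))

def correlation_dimension(points, radii):
    """Least-squares slope of log C(r) against log r over the given radii."""
    xs = [math.log(r) for r in radii]
    ys = [math.log(correlation_integral(points, r)) for r in radii]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# Synthetic check: points spread evenly along a line should give a value near 1.
line = [(i / 200, 0.0) for i in range(200)]
print(correlation_dimension(line, [0.0125, 0.0325, 0.1025]))
```

For a temporal dimension such as Dt, the same functions can be applied to one-element tuples of occurrence times; the radii must lie in the scaling range where C(r) is neither zero nor saturated.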
  • Saeideh Barkhordari Firozabadi, Seyed Abolfazl Shahzadeh Fazeli *, Jamal Zarepour Ahmadabadi, Seyed Mehdi S Karbassi

    Metaheuristics have proved highly effective for optimization problems. Various algorithms address the clustering problem by searching for optimal cluster centers. A disadvantage of some of these algorithms is stagnation in local optima, especially for big data. If this problem is not properly solved, the clustering process suffers. This research introduces a new hybrid method that merges the capabilities of two metaheuristic algorithms: the Harris hawks optimization algorithm (HHO) and the slime mould algorithm (SMA). These metaheuristics are employed to determine the best locations for the cluster centers. The optimization aims to reduce the intra-cluster distance, that is, to keep the data points of each cluster close to its center, while avoiding local optima. The effectiveness of the hybrid method is assessed and compared with the SMA and HHO algorithms on the Iris, Vowel, and Wine data sets. Compared with these algorithms, the proposed method exhibits significantly improved convergence speed. The results also show that the method can properly find the optimal centers for clustering, which improves clustering performance.

    Keywords: Clustering, Metaheuristic, Slime Mould Algorithm, Harris Hawks Optimization Algorithm
  • Zohreh Farhadi *, Mohadeseh Alsadat Farzammehr
    The performance of judiciary branches is evaluated based on specific indicators determined by the Statistics and Information Technology Center of Judiciary. These indicators, which are usually documents recorded in court cases, have a specific administrative or judicial score for the branch, and by calculating the total scores, the performance of the branches is evaluated. However, with the expansion of these indicators, ranking and evaluating branch performance has become more complex. In this article, clustering is used as one of the most important data mining tools to evaluate branch performance. By identifying similar branches, examining branches, and facing upcoming challenges more effectively, more effective decisions can be made in the judiciary system. Here, to organize 19 law branches based on 49 different administrative and judicial indicators, the K-means clustering algorithm is applied based on two criteria of Euclidean dissimilarity distance and random forests. In addition, the Dunn index is used to evaluate clustering. The value of this index is calculated as 0.82 by applying the dissimilarity of random forests, indicating the successful performance of the algorithm used in determining similar branches.
    Keywords: Administrative Score, Branch Performance Evaluation, Clustering, Judicial Score
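The Dunn index mentioned in this abstract is the smallest distance between points of different clusters divided by the largest within-cluster diameter, so larger values indicate compact, well-separated clusters. The sketch below is a minimal illustration that assumes Euclidean distance (the paper also uses a random-forest dissimilarity, which is not reproduced here); the function name and toy data are hypothetical.

```python
import math
from itertools import combinations

def dunn_index(clusters):
    """Smallest between-cluster distance divided by the largest cluster diameter.

    Assumes at least two clusters and at least one cluster with two or more points.
    """
    min_separation = min(
        math.dist(p, q)
        for a, b in combinations(clusters, 2)
        for p in a
        for q in b
    )
    max_diameter = max(
        math.dist(p, q)
        for cluster in clusters
        for p, q in combinations(cluster, 2)
    )
    return min_separation / max_diameter

# Two tight pairs of points, five units apart.
clusters = [[(0.0, 0.0), (0.0, 1.0)], [(5.0, 0.0), (5.0, 1.0)]]
print(dunn_index(clusters))  # 5.0
```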
  • Fatemeh Asadi, Hamzeh Torabi *, Hossein Nadeb
    The efficiency of Independent Component Analysis (ICA) algorithms relies heavily on the choice of objective function and optimization algorithm. The design of objective functions for ICA algorithms necessitates a foundation built upon specific dependence criteria. This paper investigates a general class of dependence criteria based on the copula density function. One of the aims of this study is to characterize the independence between two random variables and investigate their properties. Additionally, this paper introduces a novel algorithm for ICA based on estimators derived from the proposed criteria. To compare the performance of the proposed algorithm against existing methods, a Monte Carlo simulation-based approach was employed. The results of this simulation revealed significant improvements in the algorithm's outputs. Finally, the algorithm was tested on a batch of time series data related to the international tourism receipts index. It served as a pre-processing procedure within a hybrid clustering algorithm alongside PAM. The results demonstrated that using this algorithm improved performance in clustering countries based on their international tourism receipts index.
    Keywords: Amari Error, Clustering, Copula, Dependence Criteria, Mutual Information
  • Inderasan Naidoo

    This paper is the second in the series celebrating the mathematical works of Professor Themba Dube. In this sequel, we give prominence to Dube's pivotal contributions on pointfree convergence at the unstructured frame level, in the category of locales, and on his noteworthy conceptions on extensions and frame quotients. We distill and draw attention to particular studies of Dube on filters and his novel characterizations of certain conservative pointfree properties by filter and ultrafilter convergence, notably normality, almost realcompactness, and pseudocompactness. We also feature Dube's joint work on convergence and clustering of filters in Loc and coconvergence and coclustering of ideals in the category Frm.

    Keywords: Frame, Locale, Katětov Extension, Fomin Extension, βL, Normal, Pseudocompact, Almost Realcompact, Čech-Complete, Quotient, Filter, Ultrafilter, Clustering, Convergence, Coconvergence, Coclustering
  • یحیی نیلوفری، بهمن سلیمانی*، علی کدخدائی، عبدالله چوگل

    Determining reservoir electrofacies plays an important role in the petrophysical evaluation of reservoir zones for optimal exploitation of reservoirs and development of oil fields. Electrofacies are defined by data clustering, that is, by grouping similar petrophysical logs together and distinguishing them from other groups. The present study was carried out in the Asmari Formation of the Ghaleh Nar oil field. First, an initial electrofacies model was determined in a number of wells of the field using different clustering methods such as SOM, MRGC, and DYNCLUST. The electrofacies obtained were compared with flow units derived from core porosity and permeability. Among them, the SOM method, which showed the best match, was selected for clustering. Electrofacies were built from parameters such as porosity and gamma logs and extended to the whole field, resulting in a model able to separate the different parts of the reservoir from one another. The model showed that zones 1 and 3 have favorable reservoir quality and zone 4 has medium to good quality, whereas zones 2 and 5 have unfavorable conditions.

    Keywords: Electrofacies, Asmari Reservoir, Clustering, Self-Organizing Neural Network
    Yahya Nilofari, Bahman Soleimani *, Ali Kadkhodaie, Abdolah Chogol

    Electrofacies determination plays an important role in the petrophysical evaluation of reservoir zones for optimizing production and developing oil fields. The process is based on data clustering, in which similar petrophysical log responses are placed in one group and separated from other groups. The present study was carried out in the Asmari Formation of the Ghaleh Nar oil field. A primary electrofacies model was determined using different clustering methods, such as SOM, MRGC, and DYNCLUST, in several drilled wells. In the next step, the electrofacies were correlated with flow units derived from core porosity and permeability. Of these methods, SOM showed the best correlation and was therefore selected for data clustering. Electrofacies were generated from gamma and porosity logs and extended to the whole field. This resulted in a model able to separate the different parts of the reservoir. The model showed that some parts of the reservoir, especially zones 1 and 3, have better reservoir quality than other parts. Zone 4 shows normal reservoir quality, but the two other zones do not have suitable reservoir conditions.

    Keywords: Electrofacies, Asmari Reservoir, Clustering, Neural Self Organization Management
  • Ahmad Jalili *

    Wireless Sensor Networks (WSNs) encounter considerable challenges in terms of energy efficiency and network longevity due to their limited energy resources. This paper proposes a novel hybrid clustering-based routing protocol that addresses these challenges by integrating fuzzy logic for dynamic and adaptive cluster head (CH) selection based on residual energy, node degree, and proximity, and genetic algorithms (GA) for optimising cluster formation by balancing energy consumption and minimising communication distances. The protocol's objectives are threefold: to minimise energy consumption, extend network lifespan, and enhance Quality of Service (QoS). The proposed method was simulated in MATLAB and benchmarked against the LEACH and TEEN protocols. The results demonstrated the protocol's superior performance, achieving a 30% reduction in energy consumption, a 25% increase in network longevity, and higher data reliability. The primary factors contributing to this enhanced performance are the integrated use of fuzzy logic for optimised cluster head selection and genetic algorithms for optimal cluster formation. The findings substantiate the protocol's capacity to substantially enhance the energy efficiency and scalability of WSNs, providing a resilient and pragmatic solution for real-world applications.

    Keywords: Wireless Sensor Networks, Clustering, Fuzzy Logic, Genetic Algorithms, Energy Efficiency
  • نرجس رضازاده مقدم، اصغر زمانی*

    The family Verbenaceae currently comprises two genera, Verbena and Phyla, in different regions of Iran, especially the northern provinces. The circumscription of this family in older sources has changed considerably compared with current studies. For example, the two genera Vitex and Clerodendrum were also placed in this family in earlier sources. Accordingly, in this study the relationships among these genera were evaluated by examining morphological and leaf anatomical characters in 20 specimens of Verbena, Phyla, Vitex, and Clerodendrum. For this purpose, 67 qualitative and quantitative morphological and anatomical traits were examined. R software version 4.3.1 was used for data analysis. Factor Analysis of Mixed Data (FAMD) was used to analyze the quantitative and qualitative data simultaneously. The results indicated the high importance, for clustering the genera, of some quantitative traits, such as xylem length of the main vein, seed width, style length, and blade diameter, and of some qualitative traits, including the shape of blade epidermal cells, stomata position, the position of the two blade arms relative to each other, indumentum type, the shape of main vein epidermal cells, petal color, the number of vascular bundles in the blade, indumentum density, leaf margin shape, and the number of stem branches. Overall, the data analysis indicated complete separation of the four genera. Based on this study, and in agreement with phylogenetic data, the separation of Clerodendrum and Vitex from Verbenaceae was confirmed, and only Verbena and Phyla are recognized as native genera of this family in Iran.

    Keywords: Anatomy, Clustering, FAMD Method, Morphology, R Software
    Narjes Rezazadeh Moghadam, Asghar Zamani*

    The family Verbenaceae currently consists of the genera Verbena and Phyla in different parts of Iran, especially the northern provinces. The circumscription of this family has changed significantly compared with earlier references. For example, Vitex and Clerodendrum were defined as genera of this family in earlier references. Accordingly, in this study, the relationships among these four genera were evaluated using morphological and leaf anatomical characters of 20 samples. For this purpose, 67 qualitative and quantitative morphological and anatomical traits were used. The data were analyzed using R software ver. 4.3.1. For simultaneous analysis of quantitative and qualitative data, the Factor Analysis of Mixed Data (FAMD) method was applied. The results indicate the high value, for clustering the genera, of some quantitative traits, such as main vein xylem length, seed width, style length, and blade width, and of some qualitative traits, such as blade epidermal cell shape, stomata position, the position of the two blade arms relative to each other, indumentum type, main vein epidermal cell shape, petal color, number of blade vascular bundles, indumentum density, leaf margin shape, and stem branches length. Overall, the analysis of the data led to the separation of these genera. In accordance with phylogenetic studies, Vitex and Clerodendrum show more affinity to each other and are separated from the currently native members of Verbenaceae in Iran, i.e. Verbena and Phyla.

    Keywords: Anatomy, Clustering, FAMD, Morphology, R Software
  • Density-Based Clustering in MapReduce with Guarantees on Parallel Time, Space, and Solution Quality
    Sepideh Aghamolaei *, Mohammad Ghodsi
    A well-known clustering problem called Density-Based Spatial Clustering of Applications with Noise (DBSCAN) involves computing the solutions of at least one disk range query per input point, computing the connected components of a graph, and solving the bichromatic fixed-radius nearest neighbor problem. The MapReduce class is a model in which a sublinear number of machines, each with sublinear memory, run for a polylogarithmic number of parallel rounds. Most of these problems either require quadratic time in the sequential model or are hard to compute in a constant number of rounds in MapReduce. In the Euclidean plane, DBSCAN algorithms with near-linear time and a randomized parallel algorithm with a polylogarithmic number of rounds exist. We solve DBSCAN in the Euclidean plane in a constant number of rounds in MapReduce, assuming that the minimum number of points in range queries is constant and that each connected component fits inside the memory of a single machine and has a constant diameter.
    Keywords: Massively Parallel Algorithms, Range Searching, Unit Disk Graph, Near Neighbors, Clustering
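For orientation, the sequential DBSCAN procedure that this abstract builds on can be sketched in a few lines. This is the textbook quadratic-time version with a brute-force range query, not the constant-round MapReduce algorithm of the paper; the function name `dbscan`, the parameters `eps` and `min_pts`, and the sample points are illustrative assumptions.

```python
import math

def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise."""
    def neighbors(i):
        # Disk range query around point i (brute force; includes i itself).
        return [j for j in range(len(points))
                if math.dist(points[i], points[j]) <= eps]

    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisional noise; may later become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = neighbors(j)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point: keep expanding
    return labels

points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
print(dbscan(points, eps=1.5, min_pts=3))  # [0, 0, 0, 1, 1, 1, -1]
```

Each call to `neighbors` is one disk range query, and the expansion loop traces the connected components of core points, which are the two ingredients the abstract singles out.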
  • F. Mohammadi, M. Sanei*, M. Rostamy-Malkhalifeh

    Cluster analysis in data envelopment analysis (DEA) determines clusters of the units under evaluation according to their similarity, which is defined by a distance measure. Over the years, research has been carried out on clustering decision making units (DMUs). In this paper, an algorithm for clustering units by projecting them onto the frontier is presented. For every DMU, the nearest most productive scale size (MPSS) is obtained as its target, and two methods are applied to find the number of clusters. The silhouette index is used to measure the similarity value for the clustering. Numerical examples are provided to illustrate the proposed method and its results.

    Keywords: DEA, MPSS, Benchmarking (B.M), Clustering, Index Silhouette
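The silhouette index used in this abstract compares, for each point, the mean distance a to its own cluster with the smallest mean distance b to another cluster, giving s = (b - a) / max(a, b). The sketch below is a minimal illustration assuming Euclidean distance on generic coordinate tuples; in the paper the points would be DMU projections, which are not modeled here, and the function name and toy data are hypothetical.

```python
import math

def silhouette_score(points, labels):
    """Mean silhouette value over all points; assumes at least two clusters."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)

    def mean_distance(point, members):
        return sum(math.dist(point, m) for m in members) / len(members)

    total = 0.0
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            continue  # convention: a singleton cluster contributes s = 0
        # a: mean distance to the other members of the point's own cluster
        a = sum(math.dist(point, m) for m in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(mean_distance(point, members)
                for other, members in clusters.items() if other != label)
        total += (b - a) / max(a, b)
    return total / len(points)

points = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 1.0)]
print(silhouette_score(points, [0, 0, 1, 1]))  # close to 0.80: good grouping
print(silhouette_score(points, [0, 1, 0, 1]))  # negative: poor grouping
```

One common way to choose the number of clusters is to compute this score for several candidate values and keep the one with the highest mean silhouette.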
  • Solmaz Yaghoubi, Rahman Farnoosh *
    This paper proposes an observation-driven finite mixture model for clustering high-dimensional data. A simple algorithm using static hidden variables clusters the data into separate model components. The model accommodates normal and skew-normal mixtures with time-varying component means, covariance matrices, and skewness coefficients. These parameters are estimated using the EM algorithm and updated with the Generalized Autoregressive Score (GAS) approach. For real data that may be skewed and asymmetric, clustering with the skew-normal distribution is preferable to clustering with the normal distribution. Finally, the proposed model is evaluated in a simulation study, and the results are discussed using a real data set.
    Keywords: Clustering, Finite Mixture Model, Skew Normal Distribution, Generalized Autoregressive Score, Time Series
  • Mohammad Ordouei, Ali Broumandnia *, Touraj Banirostam, Alireza Gilani
    The smart city model based on multi-agent systems and the Internet of Things, using a wireless sensor network, is designed to improve the quality of life for citizens, increase resource efficiency, and reduce costs. This model enables the collection, analysis, and sharing of information by connecting and coordinating devices and systems within the smart city. In this model, intelligent agents act as sensors, and the smart gateway plays the role of a base station. The main goal of this model is to reduce energy consumption. To achieve this goal, intelligent agents are divided into clusters, with each cluster having a cluster head. The cluster head's task is to collect and aggregate information from the intelligent agents within its cluster and send it to the smart gateway. In the proposed method, each intelligent agent selects a cluster in a distributed manner. An intelligent agent may choose another intelligent agent as its cluster head or select itself as a cluster head and send the data directly to the smart gateway. Each intelligent agent chooses the cluster head after calculating the importance level of neighboring intelligent agents. By using this model, cities can increase resource efficiency and reduce costs by leveraging innovative technologies. The proposed method has been implemented in different smart city scenarios, such as sparse and crowded smart cities with varying message sizes. In all simulations, the proposed method demonstrated good capabilities in optimizing energy consumption management.
    Keywords: Smart City, Energy Consumption Management, Intelligent Agents, Clustering
  • Shahnaz Hatami *, Mohammad Hatami-B, Danial Kahrizi

    In the realm of agriculture and natural resources, medicinal plants stand out as a valuable resource. In recent years, challenges such as predicting climate change, classifying soil and land use, and identifying patterns have created a growing need for more efficient techniques, particularly in the cultivation of medicinal plants. This article therefore introduces the application of data mining to the analysis of available data in agriculture and natural resources, focusing specifically on the medicinal plant industry. The primary objective is to explore data mining techniques that can enhance various aspects of medicinal plant cultivation, addressing challenges related to climate prediction, soil classification, and production optimization. The article concludes by presenting the most effective data analysis methods in this domain, together with their corresponding algorithms. Additionally, the article serves as a guide for those intending to investigate applications of data mining; methods are highlighted for increasing productivity in areas such as predicting crop yield, forecasting weather conditions and rainfall patterns, and assessing seed and plant conditions, soil quality, and medicinal plant production. The summary and analysis of the outcomes indicated that implementing AI could improve design and process engineering strategies in bioprocessing fields.

    Keywords: Data Mining, Medicinal plants, Classification, Clustering
  • Abbas Ali Rezaee *, Hadis Ahmadian Yazdi, Mahdi Yousefzadeh Aghdam, Sahar Ghareii
    With the advancements in science and technology, the industrial and aviation sectors have witnessed a significant increase in data. A vast amount of data is generated and utilized continuously. It is imperative to employ data mining techniques to extract and uncover knowledge from this data. Data mining is a method that enables the extraction of valuable information and hidden relationships from datasets. However, the current aviation data presents challenges in effectively extracting knowledge due to its large volume and diverse structures. Air Traffic Management (ATM) involves handling Big data, which exceeds the capacity of conventional acquisition, matching, management, and processing within a reasonable timeframe. Aviation Big data exists in batch forms and streaming formats, necessitating the utilization of parallel hardware and software, as well as stream processing, to extract meaningful insights. Currently, the map-reduce method is the prevailing model for processing Big data in the aviation industry. This paper aims to analyze the evolving trends in aviation Big data processing methods, followed by a comprehensive investigation and discussion of data analysis techniques. We implement the map-reduce optimization of the K-Means algorithm in the Hadoop and Spark environments. The K-Means map-reduce is a crucial and widely applied clustering method. Finally, we conduct a case study to analyze and compare aviation Big data related to air traffic management in the USA using the K-Means map-reduce approach in the Hadoop and Spark environments. The analyzed dataset includes flight records. The results demonstrate the suitability of this platform for aviation Big data, considering the characteristics of the aviation dataset. Furthermore, this study presents the first application of the designed program for air traffic management.
    Keywords: Data Mining, Air Traffic Management, Clustering, K-Means Algorithm, Hadoop Platform, Spark Platform Optimization
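The map-reduce formulation of K-Means mentioned in this abstract splits each iteration into a map step (assign every point to its nearest centroid) and a reduce step (average the points assigned to each centroid). The sketch below emulates those two steps on a single machine in plain Python; it does not use the Hadoop or Spark APIs, and the function names, initial centroids, and toy points are hypothetical.

```python
import math
from collections import defaultdict

def map_phase(points, centroids):
    """Map: emit (index of nearest centroid, point) for every point."""
    for point in points:
        nearest = min(range(len(centroids)),
                      key=lambda k: math.dist(point, centroids[k]))
        yield nearest, point

def reduce_phase(pairs, centroids):
    """Reduce: average the points grouped under each centroid index."""
    groups = defaultdict(list)
    for key, point in pairs:
        groups[key].append(point)
    new_centroids = list(centroids)  # a centroid with no points keeps its position
    for key, members in groups.items():
        new_centroids[key] = tuple(sum(axis) / len(members)
                                   for axis in zip(*members))
    return new_centroids

def kmeans(points, centroids, iterations=10):
    """Run a fixed number of map/reduce rounds and return the final centroids."""
    for _ in range(iterations):
        centroids = reduce_phase(map_phase(points, centroids), centroids)
    return centroids

points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
print(kmeans(points, [(1.0, 1.0), (9.0, 1.0)]))  # [(0.0, 1.0), (10.0, 1.0)]
```

In a distributed setting the map step would run on separate partitions of the data and the grouping by key would be handled by the framework's shuffle, which this emulation replaces with an in-memory dictionary.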
  • Mehran Soor *, Fatemeh Akhondi, Hadi Hedayati
    Gamma rays are the most energetic photons in the electromagnetic spectrum, detected with ground-based and space-based detectors in different energy ranges from sources in our galaxy and beyond. Gamma-ray point sources can be identified by the spatial clustering of these photons. The minimum spanning tree (MST) algorithm is a graph-based method for finding clusters. In this paper, we use the MST algorithm to find point sources in data from the Fermi Gamma-ray Space Telescope, which is sensitive to photons with energies from 20 MeV to more than 300 GeV. To this end, we selected eight completely random 10°×10° fields of the Fermi gamma-ray sky and tested the algorithm on the 12-year Fermi-LAT sky (Pass 8) at energies above 3 GeV and above 6 GeV with different cluster selection criteria. The calculation of precision and recall for both fields shows that MST is a useful algorithm for identifying point sources.
    Keywords: Astronomy data analysis, clustering, Gamma-ray sources
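As a rough illustration of MST-based cluster finding, the sketch below builds a minimum spanning tree over 2-D positions, removes edges longer than a cut length, and keeps the connected groups that contain enough points. The abstract does not state the selection criteria used, so `cut_length` and `min_nodes` are hypothetical parameters, and planar Euclidean distance is assumed in place of angular separation on the sky.

```python
from math import dist


def mst_edges(points):
    """Prim's algorithm on the complete Euclidean graph; returns (length, i, j) edges."""
    n = len(points)
    if n == 0:
        return []
    in_tree = [False] * n
    best = [float("inf")] * n  # shortest known link from the tree to each node
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((best[u], parent[u], u))
        for v in range(n):
            if not in_tree[v]:
                d = dist(points[u], points[v])
                if d < best[v]:
                    best[v], parent[v] = d, u
    return edges


def mst_clusters(points, cut_length, min_nodes):
    """Drop MST edges longer than cut_length; keep groups with at least min_nodes points."""
    root = list(range(len(points)))

    def find(i):
        # Union-find lookup with path halving.
        while root[i] != i:
            root[i] = root[root[i]]
            i = root[i]
        return i

    for length, i, j in mst_edges(points):
        if length <= cut_length:
            root[find(i)] = find(j)
    groups = {}
    for i in range(len(points)):
        groups.setdefault(find(i), []).append(i)
    return sorted(g for g in groups.values() if len(g) >= min_nodes)


# Hypothetical photon positions: one tight group of three plus two isolated points.
photons = [(0, 0), (0.1, 0), (0, 0.1), (5, 5), (9, 1)]
candidates = mst_clusters(photons, cut_length=0.5, min_nodes=3)
```

Each returned list holds the indices of the points in one candidate cluster; isolated points fall out because every edge connecting them exceeds the cut length.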
  • Afsaneh Esfandi, Ali Mehrafarin *, Sepideh Kalateh Jari, Hassanali Naghdi Badi, Kambiz Larijani
    The drying process can preserve herbal products against pathogens and improve their shelf life and quality; however, drying techniques differ in their effects on the appearance and quality of the final products. Accordingly, the present study assessed the effect of various drying techniques, viz. sunlight, shade, oven (45, 55, and 65 °C), vacuum (45, 55, and 65 °C), and microwave (200, 400, and 600 W), on the color and phytochemical characteristics of hemp (Cannabis sativa L.) plants. Total phenolic content (TPC), cannabidiol (CBD), tetrahydrocannabinol (THC), chlorophyll (Chl) content, and color properties were evaluated using multivariate analysis. The results revealed that the highest CBD and THC were observed in plants dried in a microwave at 400 and 600 W, respectively. The TPC was highest under shade drying, followed by microwave at 400 W and oven at 45 °C. Although Chl b remained largely unchanged, Chl a decreased as the drying temperature increased, especially over 65 °C. The lightness (L*) and brightness (b*) of fresh leaves were higher than those of dried samples, while samples dried over 65 °C had the lowest L*. Agglomerative hierarchical clustering (AHC) identified three clusters, with microwave drying at 200, 400, and 600 W placed in a distinct cluster. Finally, this experiment suggests shade drying or the lowest oven and vacuum temperatures to maintain color and phytochemicals, while microwave drying can be recommended for CBD and THC, which can be useful in the food and pharmacological industries.
    Keywords: Cannabidiol, Drying methods, Microwave, brightness, Clustering
  • Sahabul Alam, Joydeep Kundu, Shivnath Ghosh, Arindam Dey *
    Unmanned Aerial Vehicles (UAVs) bring both potential and difficulties for emergency applications, including packet loss and changes in network topology. UAVs are also quickly taking up a sizable portion of the airspace, allowing Flying Ad-hoc NETworks (FANETs) to conduct effective ad hoc missions. Building routing protocols for FANETs is nevertheless difficult because of flight restrictions and changing topology. To solve these problems, a bio-inspired route selection technique is proposed for FANETs. A combined trust-based and bio-inspired transmission strategy is developed in response to the growing need for dynamic and adaptable communications in FANETs. Fitness theory is used to assess direct trust, and credibility and activity are evaluated to estimate indirect trust. Assessing UAV behavior, in particular, is still a crucial problem in this field. For this purpose, the paper adopts fuzzy logic, one of the most widely used techniques for trusted route computing. Fuzzy logic can manage complicated settings by classifying nodes based on various criteria. The method combines geocasting and unicasting, anticipating the location of intermediate UAVs using 3-D estimates. It guarantees resilience, dependability, and an extended path lifetime, improving FANET performance noticeably. Two primary features of FANETs that shorten the route lifetime must be accommodated in routing. First, the collaborative nature of the network necessitates communication and coordination between the flying nodes, which uses a lot of energy. Second, the flying nodes' highly dynamic mobility pattern in 3-D space may cause link disconnection because of their potential dispersion. Using ant colony optimization, the method selects a trusted leader drone within each cluster and routes safely among leaders. A fuzzy-based UAV behavior analytics scheme is presented for trust management in FANETs. Compared to existing protocols, the simulation results demonstrate improvements in delay and routing overhead in FANETs.
    Keywords: Security, Clustering, Trust management, routing, Bio-inspired, Fuzzy
  • Najmeh Rezaeerad, Mahnaz Khalafi*, Mohsen Hoseinalizadeh, Majid Azimmohseni
    The analysis of spatio-temporal series is important in many sciences but also challenging. The accuracy of such analyses depends on how well the spatial and temporal relations of the series are measured simultaneously. In this article, one-sided dynamic principal components (ODPC) for spatio-temporal series are introduced and used to model the common structure of their relations. These principal components can be applied to data sets that include many spatio-temporal series. In addition to spatial relations, trends, and seasonal trends, the dynamic principal components reflect other common temporal and spatial factors in a set of spatio-temporal series. To evaluate the capability of one-sided dynamic principal components, they are used for clustering and forecasting in spatio-temporal series. Based on the precipitation time series at different stations in Golestan province, the efficiency of the principal components in clustering hydrometric stations is investigated. Moreover, forecasting of the Standardized Precipitation Index (SPI), an essential indicator for detecting drought, is conducted based on the one-sided dynamic principal components.
    Keywords: One-Sided Dynamic Principal Components, Generalized Cross Correlation, Space-Time Series, Clustering, Standardized Precipitation Index
  • S. Barkhordari Firozabadi, S.A. Shahzadeh Fazeli *, J. Zarepour Ahmadabadi, S.M. Karbassi
    The fuzzy C-means (FCM) algorithm is one of the most famous fuzzy clustering algorithms, but it gets stuck in local optima. In addition, this algorithm requires the number of clusters as input. The density-based spatial clustering of applications with noise (DBSCAN) algorithm, a density-based clustering algorithm, does not need the number of clusters in advance, unlike FCM; when the clusters are well defined, it can determine their number itself. Another advantage of the DBSCAN clustering algorithm over FCM is its ability to cluster data of different shapes. In this paper, in order to overcome these limitations, a hybrid approach for clustering is proposed, which uses the FCM and DBSCAN algorithms. In this method, the optimal number of clusters and the optimal locations of the cluster centers are determined in three phases, based on changes made according to the data set and by anticipating the problems of the FCM algorithm stated above. With this improvement, none of the initial parameters of the FCM algorithm is random. In the first phase, the random initial values of FCM are replaced with optimal ones, which has a significant effect on the convergence of the algorithm because it helps to reduce the number of iterations. The proposed method has been examined on the Iris flower data set, and its results have been compared with those of the basic FCM algorithm and another algorithm. The results show the better performance of the proposed method.
    Keywords: Clustering, Fuzzy clustering, DBSCAN
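To make the role of the initial centers concrete, here is a minimal sketch of the basic FCM iteration started from given (non-random) centers. It does not reproduce the three-phase hybrid method, whose details the abstract does not give; the function name `fcm`, its parameters, and the toy data are assumptions for illustration, and the initial centers are simply supplied by hand rather than derived from DBSCAN.

```python
from math import dist


def fcm(points, centers, m=2.0, iterations=100, tol=1e-6):
    """Basic fuzzy C-means from given initial centers; returns (centers, memberships)."""
    power = 2.0 / (m - 1.0)
    memberships = []
    for _ in range(iterations):
        # Membership update: memberships[k][i] is the degree to which
        # point k belongs to cluster i; each row sums to 1.
        memberships = []
        for p in points:
            d = [max(dist(p, c), 1e-12) for c in centers]  # avoid division by zero
            memberships.append([
                1.0 / sum((d[i] / d[j]) ** power for j in range(len(centers)))
                for i in range(len(centers))
            ])
        # Center update: mean of the points weighted by membership ** m.
        new_centers = []
        for i in range(len(centers)):
            w = [row[i] ** m for row in memberships]
            total = sum(w)
            new_centers.append(tuple(
                sum(wk * p[a] for wk, p in zip(w, points)) / total
                for a in range(len(points[0]))
            ))
        shift = max(dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, memberships


# Hypothetical toy data: two compact groups and hand-picked initial centers.
data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
final_centers, final_u = fcm(data, [(0, 0), (10, 10)])
```

Because the starting centers already lie inside the two groups, the loop converges in a few iterations, which is the effect the abstract attributes to replacing random initial values.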
Note
  • Results are sorted by publication date.
  • Your keyword was searched only in the keywords field of the articles. To exclude irrelevant results, the search was limited to articles in journals that cover the same subject as the source journal.
  • If you want to repeat the search across all subjects and with other conditions, go to the journals' advanced search page.