Search results for articles matching the keyword "random forest algorithm" in journals of the Engineering and Technology group

Designing a Machine Learning Model to Predict the Characteristics of Rocks Used in the Construction of Rock Mass Breakwaters

Ehsan Sheikhsamad, Ashkan Mozdgir, Mariam Ameli*

Journal of Marine Engineering, Volume 21, Issue 45, Winter 2025 (Winter 1403 SH), pp. 57-66

Selecting quarries and good-quality stone that meets design standards has always been one of the main challenges in siting breakwaters and in starting construction work on rock mass breakwater projects. This research proposes machine learning approaches to predict, from the algorithm's output, whether quarry stone will be accepted or rejected using the minimum possible number of tests. The method is a framework comprising complete and careful data preprocessing and the application of machine learning algorithms such as decision tree, random forest, and nearest neighbor. The data consist of the results of rock tests from the last ten years for stone used in constructing rock mass breakwaters along the coastal strip of the Sea of Oman. The available data were divided into two parts, main (training) data and test data; the algorithm was implemented on the main data, and its output was then evaluated on the test data and confirmed with an accuracy of 96%. The results emphasize the importance of using existing data in the marine construction industry and the effectiveness of machine learning algorithms in analyzing such data. The outcome of this thesis reduces testing time, project cost, and project duration. The insight gained can help companies active in marine construction, and specifically the Ports and Maritime Organization as the custodian of the construction and maintenance of marine structures in the country, to site breakwaters, optimize resource allocation, and reduce the implementation and operation time of projects.

Keywords: Rock Mass Breakwaters, Decision Tree Algorithm, Random Forest Algorithm, Nearest Neighbor Algorithm, Machine Learning

Research/Original Article. Original language: Persian.

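The hold-out evaluation described in the abstract above (fit on the main data, score on the test data) can be sketched as follows. This is a minimal illustration rather than the authors' framework: the two features, the acceptance rule, the 25% test fraction, and the use of a plain 1-nearest-neighbor classifier are all assumptions made for the example.

```python
import math
import random


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle the rows and split them into (train, test) lists."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def predict_1nn(train, features):
    """Return the label of the training row closest to `features`."""
    nearest = min(train, key=lambda row: math.dist(row[0], features))
    return nearest[1]


def accuracy(train, test):
    """Fraction of test rows whose predicted label matches the true label."""
    hits = sum(predict_1nn(train, x) == y for x, y in test)
    return hits / len(test)


# Hypothetical data: (density in g/cm3, water absorption in %) -> accepted?
rng = random.Random(42)
rows = []
for _ in range(200):
    density = rng.uniform(2.2, 2.9)
    absorption = rng.uniform(0.2, 4.0)
    accepted = density >= 2.5 and absorption <= 2.0  # made-up acceptance rule
    rows.append(((density, absorption), accepted))

train, test = train_test_split(rows, test_fraction=0.25, seed=1)
print(f"hold-out accuracy: {accuracy(train, test):.2f}")
```

Scoring on rows the classifier never saw is what makes the reported accuracy meaningful; scoring a 1-nearest-neighbor model on its own training rows would trivially give 100%.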
Design of a Terahertz Metasurface Absorber Based on Machine Learning Technique

M. M. Fakharian*

Journal of Electrical Engineering, Volume 54, Issue 3 (Serial 109, Autumn 1403 SH), 2024, pp. 291-299

The development of metasurface absorbers offers a potential route to low weight, small thickness, a favorable absorption rate, and acceptable terahertz absorption characteristics. To optimize the absorption properties of metasurfaces, the absorption spectrum is usually used as an important evaluation criterion, since it shows many important characteristics such as the absorption at different frequencies. However, analyzing absorption spectra during design depends on a large number of variable structural parameters and consumes considerable resources and time, because electromagnetic wave absorption involves complex impedance matching and electric field excitation. To address this issue, the study proposes a machine learning approach based on the random forest algorithm that predicts absorption rates from structural parameters, reducing the need for numerical simulation and the time spent on spectrum analysis. The random forest model predicts the absorption rate with an R² score above 0.99. In addition, the proposed absorber is thin, insensitive to polarization, and relatively stable with respect to the angle of incidence, owing to the symmetry of the structure. The study presents a practical and effective approach to designing complex systems involving the absorption, reflection, and transmission of electromagnetic waves.

Keywords: Absorber, Metasurface Design, Terahertz, Machine Learning, Random Forest Algorithm

Research/Original Article. Original language: Persian.

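For readers unfamiliar with the R² score quoted above, the sketch below computes the usual coefficient of determination, 1 minus the ratio of residual to total sum of squares. The absorption values are invented for illustration and are not taken from the paper.

```python
def r2_score(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical absorption rates: simulated values vs. model predictions.
simulated = [0.91, 0.95, 0.62, 0.40, 0.88]
predicted = [0.90, 0.96, 0.60, 0.43, 0.87]
print(f"R2 = {r2_score(simulated, predicted):.4f}")
```

A score of 1 means the predictions match the simulated values exactly, while a model that always predicts the mean of the observations scores 0.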
Investigating the Rate of Flow Energy Loss in Zigzag Weirs Using Methods Based on Soft Computing

Hamidreza Abbaszadeh, Reza Tarinejad*

Journal of Civil and Environmental Engineering University of Tabriz, Volume 54, Issue 3 (Serial 116, Autumn 1403 SH), 2024, pp. 53-64

This research investigates the relative energy loss (EDR) of zigzag weirs with triangular and trapezoidal plans of different dimensions using the Support Vector Machine (SVM) model, the Random Forest (RF) algorithm, and the Artificial Neural Network (ANN) method. Of the experimental data set, 70% was used for the training phase and 30% for the test phase. For the SVM model, a comparison of kernels showed that the Radial Basis Function (RBF) kernel predicts the relative energy loss of zigzag weirs better than the Polynomial, Linear, and Sigmoid kernels. The correlation coefficient (R), Mean Relative Error percentage (Mean RE%), Root Mean Square Error (RMSE), and Kling-Gupta Efficiency (KGE) for the SVM-RBF model in the test phase are 0.907, 1.38%, 0.0153, and 0.744, respectively. In the ANN method, the Multi-Layer Perceptron (MLP) network gives more accurate results than the RBF network. The same indicators in the test phase are 0.969, 0.73%, 0.007, and 0.968 for the ANN-MLP method and 0.878, 1.78%, 0.0192, and 0.362 for the RF model. The results show that the ANN method performs better than the SVM and RF models.

Keywords: Zigzag Weir, Energy Loss, Artificial Neural Network, Support Vector Machine, Random Forest Algorithm

Research/Original Article. Original language: Persian.

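The four indicators reported above can be computed as in the sketch below. It assumes the commonly used definitions: Pearson correlation for R, the mean of absolute relative errors for Mean RE%, and the Kling-Gupta Efficiency built from correlation, variability ratio, and bias ratio. The paper may define Mean RE% differently, and the sample values are invented.

```python
import math


def _mean(values):
    return sum(values) / len(values)


def _std(values):
    """Population standard deviation."""
    m = _mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / len(values))


def pearson_r(obs, sim):
    """Pearson correlation coefficient between observed and simulated values."""
    mo, ms = _mean(obs), _mean(sim)
    cov = sum((o - mo) * (s - ms) for o, s in zip(obs, sim)) / len(obs)
    return cov / (_std(obs) * _std(sim))


def rmse(obs, sim):
    """Root mean square error."""
    return math.sqrt(_mean([(o - s) ** 2 for o, s in zip(obs, sim)]))


def mean_relative_error_pct(obs, sim):
    """Mean absolute relative error in percent (assumed definition)."""
    return 100.0 * _mean([abs(s - o) / abs(o) for o, s in zip(obs, sim)])


def kge(obs, sim):
    """Kling-Gupta Efficiency: 1 - sqrt((r-1)^2 + (alpha-1)^2 + (beta-1)^2)."""
    r = pearson_r(obs, sim)
    alpha = _std(sim) / _std(obs)    # variability ratio
    beta = _mean(sim) / _mean(obs)   # bias ratio
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


# Hypothetical relative energy loss values (observed vs. predicted).
observed = [0.62, 0.58, 0.71, 0.66, 0.54]
predicted = [0.61, 0.59, 0.69, 0.67, 0.55]
print(pearson_r(observed, predicted), mean_relative_error_pct(observed, predicted))
print(rmse(observed, predicted), kge(observed, predicted))
```

Because KGE also penalizes mismatched variability and bias, a model can have a fairly high R and still a low KGE, which is consistent with the RF figures quoted in the abstract (R of 0.878 but KGE of 0.362).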
Evaluation of Band Ratio Technique for Prediction of Iron-Titanium Mineralization Using Ensemble Machine Learning Model: A Case Study from Khamal area, Western Saudi Arabia

Ahmed Madani *

Journal of Mining and Environment, Volume 15, Issue 4, Autumn 2024, pp. 1357-1371

Innovation in mineral exploration occurs either in the construction of new ore deposit models or in the development of new techniques for locating ore deposits. Band ratioing is an image processing technique developed for mineral exploration. The present study presents a new approach to evaluating the band ratio technique for discriminating and predicting the iron-titanium mineralization exposed in the Khamal area, Western Saudi Arabia, using the ensemble Random Forest (RF) model and SPOT-5 satellite data. SPOT-5 band ratio images are prepared and used as the explanatory variables. The target variable is prepared such that 70% of the target locations are used for training and the rest for validation. A confusion matrix and precision-recall curves are constructed to evaluate the RF model's performance, and Receiver Operating Characteristic (ROC) curves are used to rank the band ratio images. The results show that the 3/1, 2/1, and 3/2 band ratio images give excellent discrimination, with AUC values of 0.986, 0.980, and 0.919, respectively. The study selects the 3/1 band ratio image as the best classifier and presents a new Fe-Ti mineralization image map. It demonstrates the usefulness of the Random Forest classifier for predicting Fe-Ti mineralization, with an accuracy of 97%.

Keywords: AI-Based Predictive Model, Random Forest Algorithm, SPOT-5 Data, Fe-Ti Mineralization, Western Saudi Arabia

Research/Original Article. Original language: English.

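Ranking band ratio images by ROC AUC, as described in the abstract above, can be sketched as follows. The reflectance values and labels are invented, and AUC is computed directly as the probability that a mineralized pixel scores higher than a background pixel, which is an equivalent formulation and not necessarily the software route the author used.

```python
def band_ratio(band_a, band_b):
    """Pixel-wise ratio of two bands (assumes no zero values in band_b)."""
    return [a / b for a, b in zip(band_a, band_b)]


def roc_auc(scores, labels):
    """AUC as the chance that a positive outscores a negative; ties count half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical reflectance samples; label 1 = mineralized, 0 = background.
labels = [1, 1, 1, 0, 0, 0]
band1 = [0.10, 0.12, 0.11, 0.20, 0.22, 0.21]
band2 = [0.15, 0.30, 0.15, 0.21, 0.22, 0.20]
band3 = [0.30, 0.33, 0.31, 0.25, 0.24, 0.26]

ratios = {
    "3/1": band_ratio(band3, band1),
    "2/1": band_ratio(band2, band1),
    "3/2": band_ratio(band3, band2),
}
ranking = sorted(
    ((name, roc_auc(values, labels)) for name, values in ratios.items()),
    key=lambda item: item[1],
    reverse=True,
)
for name, auc in ranking:
    print(f"band ratio {name}: AUC = {auc:.3f}")
```

An AUC near 1 means the ratio image separates the two classes almost perfectly, while a value near 0.5 means it carries little discriminating information.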
Smart Home Intrusion Detection Model Based on Principal Component Analysis and Random Forest Classification

Aliakbar Tajari Siahmarzkooh*

Journal of Electronic and Cyber Defense, Volume 12, Issue 2 (Serial 46, Summer 1403 SH), 2024, pp. 15-25

In recent years, securing smart homes, in which a large number of devices communicate over Internet connections, has become one of the main concerns in network security. Although much research has addressed smart home security, given the breadth of the topic most of these works fall short in accuracy and speed. In the proposed method, after some preprocessing of the dataset, Principal Component Analysis (PCA) is used to select a subset of the dataset's features, those regarded as most effective for intrusion detection, to prepare the data for classification; this increases both the accuracy and the speed of classification. In the classification stage, the random forest algorithm, a powerful machine learning algorithm, is applied to IoTID20, a very recent Internet of Things dataset. The proposed approach achieves high intrusion detection performance, with accuracies of 99.73% and 98.46% for binary and multi-class attack classification, respectively. Comparison with other works shows the superiority of the proposed method in detecting multi-class attacks.

Keywords: Smart Home, Intrusion Detection, Principal Component Analysis (PCA), Random Forest Algorithm, IoTID20 Dataset

Research/Original Article. Original language: Persian.

A model based on random forest algorithm and Jaya optimization to predict bank customer churn

Sepideh Chehreh*, Ali Sarabadani

Engineering Management and Soft Computing, Volume 9, Issue 2 (Serial 17, Autumn and Winter 1402 SH), 2024, pp. 132-148

Customer churn is a financial term that refers to the loss of a customer. Today, given the large number of banks, the movement of customers from one bank to another has become a serious concern for banks. In this article, using data collected on the customers of one bank, customers with a high probability of churn are identified from their behavior and characteristics before churn occurs, so that they can be retained by making them offers. In marketing it is widely agreed that retaining a customer costs far less than attracting a new one; this article therefore introduces the different phases of an approach to predicting customer churn with the help of machine learning. The proposed method combines the random forest algorithm with Jaya optimization and predicts churn from customer characteristics such as age, gender, geography, and others. The proposed model achieves 91.41%, 95.66%, and 93.35% on the Precision, Recall, and Accuracy criteria, respectively.

Keywords: Customer Churn, Machine Learning, Random Forest Algorithm, Jaya Optimization

Research/Original Article. Original language: Persian.

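The abstract above does not say how Jaya optimization is coupled to the random forest, so the sketch below only illustrates the standard Jaya update rule on a placeholder objective: each candidate moves toward the current best solution and away from the current worst, and is kept only if it improves. The sphere objective, bounds, and population settings are assumptions; in a churn model the objective would presumably be a validation error.

```python
import random


def jaya_minimize(objective, bounds, pop_size=20, iterations=100, seed=0):
    """Minimize `objective` over box `bounds` with the basic Jaya update rule."""
    rng = random.Random(seed)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    scores = [objective(x) for x in pop]
    for _ in range(iterations):
        best = pop[scores.index(min(scores))]
        worst = pop[scores.index(max(scores))]
        for i, x in enumerate(pop):
            candidate = []
            for j, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                # Move toward the best solution and away from the worst one.
                value = (x[j]
                         + r1 * (best[j] - abs(x[j]))
                         - r2 * (worst[j] - abs(x[j])))
                candidate.append(min(max(value, lo), hi))  # clip to bounds
            score = objective(candidate)
            if score < scores[i]:  # greedy acceptance
                pop[i], scores[i] = candidate, score
    i_best = scores.index(min(scores))
    return pop[i_best], scores[i_best]


def sphere(x):
    """Placeholder objective standing in for a model's validation error."""
    return sum(v * v for v in x)


solution, value = jaya_minimize(sphere, [(-5.0, 5.0)] * 2, seed=1)
print(solution, value)
```

Jaya has no algorithm-specific tuning parameters beyond population size and iteration count, which is one reason it is often paired with models that already have many hyperparameters of their own.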
A New Approach for Digital Image Segmentation with Genetic Algorithm and Random Forest

Fariba Namiranian, Ali Mohammad Latif*

Signal and Data Processing, Volume 20, Issue 4 (Serial 58, Winter 1402 SH), 2024, pp. 35-44

This research presents a new approach to image segmentation based on the genetic algorithm and random forest. Image segmentation can be done with supervised learning, using a set of images from a data set together with their labels. Segmentation separates the different parts of an image from one another: every pixel in the image is given a label, so that pixels with the same label share common characteristics. To build a model that can segment images, features must be extracted from the input images and then passed to a suitable classifier. Feature extraction is done with image filters. This research uses a hybrid combination of 4 Gabor filter banks and the Sobel, Prewitt, Canny edge, Scharr, Gaussian, median, and Roberts filters for effective feature extraction. One of the most important of these, and one that has degrees of freedom, is the Gabor filter: it has a number of hyperparameters, and changing them changes the efficiency of the classifier. Here these hyperparameters are tuned with a genetic algorithm. The hyperparameters of the Gabor filters are treated as the genes of the genetic algorithm's chromosome, and the fitness function is the f1-score obtained by running the random forest classifier for image segmentation. The experiments show that the hyperparameters found by the genetic algorithm produce a satisfactory segmentation of the data set; finding suitable values for the Gabor filter hyperparameters and increasing the f1-score relative to the other methods examined are the achievements of this research.

Keywords: Image Segmentation, Genetic Algorithm, Random Forest Algorithm

Research/Original Article. Original language: Persian.

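To make concrete what the genetic algorithm is tuning, the sketch below builds a real-valued Gabor kernel from its usual hyperparameters: envelope width `sigma`, orientation `theta`, wavelength `lambd`, aspect ratio `gamma`, and phase `psi`. The parameter names, the kernel size, and the example chromosome are illustrative assumptions; the abstract does not list the exact hyperparameters the authors encode.

```python
import math


def gabor_kernel(size, sigma, theta, lambd, gamma, psi=0.0):
    """Return a size x size real Gabor kernel as a list of rows (size must be odd)."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates by the filter orientation.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr ** 2 + (gamma ** 2) * yr ** 2) / (2 * sigma ** 2))
            carrier = math.cos(2 * math.pi * xr / lambd + psi)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel


# A hypothetical chromosome: one gene per Gabor hyperparameter.
chromosome = {"sigma": 2.0, "theta": math.pi / 4, "lambd": 4.0, "gamma": 0.5}
kernel = gabor_kernel(size=7, **chromosome)
print(len(kernel), len(kernel[0]), kernel[3][3])
```

Each chromosome yields a different kernel and therefore different per-pixel features, so the classifier's f1-score on labeled pixels can serve as that chromosome's fitness.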
Kashef: A Two-Step Detector of Windows-Based Malicious Executable Files

Ehsan Allah Shaghaghi*, Reza Jalayi, Mohammadali Javadzadeh

Journal of Electronic and Cyber Defense, Volume 10, Issue 2 (Serial 38, Summer 1401 SH), 2022, pp. 141-154

The rapid growth of malware is one of the major threats in cyberspace, and detecting it has always been challenging. Malicious Windows executable files carry out harmful activity at the level of the target operating system or any other application by manipulating features in their headers and obfuscating their behavior. Detecting suspicious malicious samples among a large volume of input samples, and discovering new and unknown malware, are topics of ongoing research. This study proposes a hybrid method for determining how malicious a suspicious executable file is. The proposed method, Kashef, consists of two static modules, which extract features of the executable file header, and two behavioral modules, which extract features used to generate signatures and a machine-learning-based model of malicious behavior. The aim is to find suspicious Windows executable files among a large volume of files and determine their degree of maliciousness. The method detects malware according to the maliciousness probability assigned to each file. In experiments on six types of malware, the maliciousness percentage was in the range 62.7% to 70% for the detector based on the executable file (PE) header, 70.8% to 78.2% for the Yara-based detector, 98% for the behavioral-signature-based detector, and 99% for the machine-learning-based detector using the random forest algorithm. The experiments also showed that Kashef detects 94% of protected malware, a 2% improvement over the results of 10 similar products, and 98% of unprotected malware, a 5% improvement over those products.

Keywords: Malware, Executable File, Malware Detection, Behavioral Signature, Random Forest Algorithm

Research/Original Article. Original language: Persian.

Optimizing the detection of SQL injection attacks using a combination of random forest and genetic algorithms

Javad Moradi*, Majid Ghayoori

Journal of Command and Control Communications Computer Intelligence, Volume 5, Issue 1 (Serial 15, Spring 1400 SH), 2022, pp. 87-98

Despite all the efforts of security experts to detect SQL injection attacks, OWASP reports indicate that SQL injection is still used by attackers as the most important and most damaging cyber attack. Two kinds of methods are used to detect attacks: signature-based and behavior-based. Signature-based methods work for known attacks, while behavior-based methods are suited to detecting unknown attacks. Because attacks are implemented in many different ways, behavior-based intrusion detection systems are more widely applicable. Behavior can be analyzed with methods such as classification and clustering. One of the most important classification algorithms is random forest, which has high accuracy and is also simple to implement and to interpret. According to the studies reviewed, the accuracy of the random forest algorithm depends strongly on its input parameters. There are 9 such parameters, including the number of trees, their depth, the voting method, and information gain. Determining them optimally is an optimization problem with a large state space. This research presents a method based on the genetic algorithm for determining the optimal values of these parameters. With the parameters optimized, the results show improved detection accuracy compared with the algorithm's default settings and with other research. The evaluation indicates that the intrusion detection accuracy of the proposed method is 98%, about 11% higher than the random forest algorithm with default parameters and 08% higher than previous studies.

Keywords: Random Forest Algorithm, Genetic Algorithm, SQL Injection Attack, Database Intrusion Detection System

Research/Original Article. Original language: Persian.

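A genetic search over random forest parameters, as outlined in the abstract above, can be sketched as follows. The search space lists only three hypothetical parameters out of the nine the authors mention, and `placeholder_fitness` is a made-up stand-in for the detection accuracy that a real run would measure by training and validating a forest for each individual.

```python
import random

# Hypothetical search space; the paper tunes 9 parameters, not listed in full.
SEARCH_SPACE = {
    "n_trees": [50, 100, 200, 400],
    "max_depth": [4, 8, 16, 32],
    "split_criterion": ["gini", "information_gain"],
}


def random_individual(rng):
    return {name: rng.choice(options) for name, options in SEARCH_SPACE.items()}


def crossover(a, b, rng):
    """Uniform crossover: each gene comes from one of the two parents."""
    return {name: rng.choice((a[name], b[name])) for name in SEARCH_SPACE}


def mutate(individual, rng, rate=0.2):
    """Resample each gene with probability `rate`."""
    return {
        name: rng.choice(SEARCH_SPACE[name]) if rng.random() < rate else value
        for name, value in individual.items()
    }


def genetic_search(fitness, pop_size=12, generations=15, seed=0):
    rng = random.Random(seed)
    population = [random_individual(rng) for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]  # elitism: the top half survives
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    return max(population, key=fitness)


def placeholder_fitness(individual):
    """Stand-in for validation accuracy of a forest built with these settings."""
    score = 0.80
    score += 0.05 * (individual["n_trees"] >= 200)
    score += 0.05 * (individual["max_depth"] == 16)
    score += 0.02 * (individual["split_criterion"] == "information_gain")
    return score


best = genetic_search(placeholder_fitness, seed=3)
print(best, placeholder_fitness(best))
```

Keeping the best half of each generation unchanged means the best fitness never decreases between generations, at the cost of somewhat less exploration.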
Identifying Factors Affecting Household Energy Consumption Using Data Mining Methods

Reyhane Sadat Hafezifard, Jamal Zarepour-Ahmadabadi*, Elham Abbasi

Iranian Journal of Energy, Volume 23, Issue 1 (Serial 86, Spring 1399 SH), 2020, pp. 25-45

Given the growing population and dwindling energy sources, this research studies household energy consumption. Its purpose is to predict the factors that affect household energy consumption. Three algorithms available in the Weka software are used for this prediction: M5Rules, K-nearest neighbor, and random forest. A feature correlation evaluation algorithm is also used to select the best factors; it identifies the most important factors affecting energy consumption and the size of their effect. The results show that lights and lighting fixtures, the temperature in the living room, the temperature outside the building, the outside temperature at the Chievres weather station, wind speed, humidity in the kitchen area, and the temperature in the laundry area have the greatest effect on household energy consumption. Among the algorithms tested, random forest gave the best results.

Keywords: Household Energy Consumption, M5Rules Algorithm, K-NN, Random Forest Algorithm, Correlation Evaluation of Properties, Lighting Devices, Temperature, Chievres Weather Station

Research/Original Article. Original language: Persian.

A New Method for Improving Intrusion Detection Using a Combination of the Random Forest Algorithm and the Genetic Algorithm

Seyyed Javad Kazemitabar*, Reyhaneh Taheri Amiri, Ghorban Kheradmandian

Journal of Passive Defence Science and Technology, Volume 10, Issue 3 (Serial 37, Autumn 1398), pp. 287 -296

As computer networks have expanded, attacks on and intrusions into these networks have also increased. To achieve full security in a computer system, in addition to firewalls and other intrusion prevention equipment, further systems called intrusion detection systems (IDS) are needed. The goal of an intrusion detection system is to monitor abnormal activities and to distinguish between normal and abnormal (intrusive) behavior on a host system or in a network. An intrusion detection system can be considered effective when it has a high intrusion detection rate and, at the same time, a low false alarm rate. This paper introduces a new method for classifying the KDD-Cup-99 data set, obtained by combining the random forest algorithm and the genetic algorithm, whose aim is to increase the speed of the learning and test phases as well as the accuracy of the random forest method. Random forest is used in many machine learning based products because of its simple structure and high performance. However, as with other decision tree based algorithms, a large number of non-numeric (categorical) variables can cause problems for the accuracy and speed of the program. Intrusion detection presents exactly this scenario. The novelty of this paper is solving this problem with the genetic algorithm. In this paper, the number of features is reduced by defining a measure called information gain.

Keywords: Behavior Pattern Based Intrusion Detection, Data Mining, Genetic Algorithm, Random Forest Algorithm

Research/Original Article, Language: Persian

A Novel Technique for Improvement of Intrusion Detection via Combining Random Forest and Genetic Algorithm

Seyyed Javad Kazemitabar*, Reyhaneh Taheri, Ghorban Kheradmandian

Journal of Passive Defence Science and Technology, Volume:10 Issue: 3, 2019, PP 287 -296

As computer networks grow, attacks on and intrusions into these networks increase as well. To have a fully secure computer network, one needs intrusion detection systems (IDS) in addition to firewalls. The goal of an IDS is to monitor abnormal activities and to differentiate between normal and abnormal activities on a host system or in a network. An efficient IDS has a high detection rate while keeping a low false alarm rate. This paper presents a new approach to classifying the KDD-Cup-99 data set using a combination of the random forest method and a genetic algorithm. The purpose is to increase the speed of the learning and test phases while improving accuracy. Random forest is an ensemble learning method based on decision trees. Due to its relatively simple structure and good performance, it is used in many supervised learning applications. However, as with all tree-based machine learning algorithms, having too many categorical features can be a problem for both speed and accuracy. This is exactly the case in intrusion detection, where many of the features are categorical. For example, in the R language, the randomForest package cannot handle a categorical feature with more than 53 categories. The contribution of this work is resolving this issue with the aid of a genetic algorithm (GA). Information gain is defined as a measure of feature importance, and the number of features is reduced using the genetic algorithm.

Keywords: Signature-Based Intrusion Detection, Data Mining, Genetic Algorithm, Random Forest Algorithm

Research/Original Article, Original: Persian
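
The information gain measure mentioned in the abstract above can be sketched as follows. This is a minimal illustration of the standard entropy-based definition for a categorical feature. It does not reproduce the authors' genetic algorithm, and the feature names and records are hypothetical toy values rather than KDD-Cup-99 data.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """Reduction in label entropy after splitting on a categorical feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Weighted average entropy of the label groups produced by the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical toy connection records (assumption, not from the paper).
protocol = ["tcp", "tcp", "udp", "udp", "icmp", "icmp"]
flag = ["SF", "S0", "SF", "S0", "SF", "S0"]
labels = ["normal", "normal", "attack", "attack", "normal", "attack"]

gain_protocol = information_gain(protocol, labels)
gain_flag = information_gain(flag, labels)
print(f"protocol: {gain_protocol:.4f}, flag: {gain_flag:.4f}")
```

In this toy data the `protocol` feature separates the classes better than `flag`, so it receives the higher gain. A score like this could serve as the importance measure inside a feature-selection search, though how the paper combines it with the genetic algorithm is not specified in the abstract.
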
Short-Circuit Fault Location in High Voltage Direct Current Transmission Lines Using a Generalized Regression Neural Network and the Random Forest Algorithm

Mohammad Farshad, Javad Sadeh

Intelligent Systems in Electrical Engineering, Volume 4, Issue 2 (Summer 1392), pp. 1 -14

This paper presents a method based on machine learning strategies for solving the fault location problem in high voltage direct current (HVDC) transmission lines. In the proposed method, only the post-fault voltage signal measured at one terminal is used to extract the required features. Given the high dimension of the input feature vector, the paper examines two different estimators, the generalized regression neural network (GRNN) and the random forest (RF) algorithm, for finding the relation between the pattern features and the fault location. Evaluation results using training and test patterns, obtained by simulating various fault types on a long overhead transmission line for different values of fault location, fault resistance, and pre-fault current, demonstrate the efficiency and acceptable accuracy of the proposed method.

Keywords: Fault Location, HVDC Transmission Lines, Generalized Regression Neural Network, Random Forest Algorithm

Language: Persian

Fault Locating in HVDC Transmission Lines Using Generalized Regression Neural Network and Random Forest Algorithm

M. Farshad, J. Sadeh

Intelligent Systems in Electrical Engineering, Volume:4 Issue: 2, 2013, PP 1 -14

This paper presents a novel method based on machine learning strategies for fault locating in high voltage direct current (HVDC) transmission lines. In the proposed fault-location method, only post-fault voltage signals measured at one terminal are used for feature extraction. In this paper, due to the high dimension of the input feature vectors, two different estimators including the generalized regression neural network (GRNN) and the random forest (RF) algorithm are examined to find the relation between the features and the fault location. The results of evaluation using training and test patterns obtained by simulating various fault types in a long overhead transmission line with different fault locations, fault resistance and pre-fault current values have indicated the efficiency and the acceptable accuracy of the proposed approach.

Keywords: Fault Location, HVDC Transmission Lines, Generalized Regression Neural Network, Random Forest Algorithm

Original: Persian

Note

Results are sorted by publication date.
Your keyword was searched only in the keywords field of articles. To exclude unrelated results, the search covered only articles in journals on the same subject as the source journal.
To repeat the search across all subjects and with other conditions, go to the journals' advanced search page.
