Search for articles related to the keyword "machine learning algorithms" in journals of the basic sciences group

Mitigating data imbalance for enhanced third-party insurance claim prediction using machine learning

Maryam Esna-Ashari *, Hamideh Badi, Majid Chahkandi, Hamid Saadatfar

Journal of mathematic and modeling in Finance, Volume:5 Issue: 1, Winter-Spring 2025, PP 175 -187

Accurate prediction of third-party insurance claims is critical for pricing policies and managing risk. However, the highly imbalanced nature of insurance data, where non-claim cases vastly outnumber claim cases, poses significant challenges to standard predictive models. This study explores the use of machine learning algorithms to enhance claim prediction by directly addressing this imbalance. We use real data from the Insurance Research Center of Iran, incorporating variables such as driver characteristics, vehicle features, location, and claims history. Five models are evaluated: logistic regression, decision tree, bagging, random forest, and boosting. To handle the imbalance, we apply random undersampling, random oversampling, and SMOTE. Model performance is assessed using accuracy, sensitivity, specificity, precision, and F-score. Results indicate that when data imbalance is properly treated, decision trees and the ensemble methods bagging and random forest significantly outperform logistic regression and boosting, especially in detecting actual claim cases. The study underscores the importance of using appropriate resampling techniques and evaluation metrics in imbalanced settings. These findings can help insurers develop more reliable models for pricing and risk classification.

Keywords: Machine Learning Algorithms, Third-Party Insurance, Imbalanced Data

Article type: Research/Original Article. Original language: English

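
The resampling and evaluation steps named in the abstract by Esna-Ashari et al. can be illustrated with a minimal sketch. It is not the authors' code: the function names `random_oversample` and `binary_metrics`, the toy labels, and the 95/5 class split are assumptions made for illustration, and only random oversampling is shown (random undersampling and SMOTE are not).

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of the smaller classes until every
    class has as many rows as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


def binary_metrics(y_true, y_pred, positive=1):
    """Compute the five metrics listed in the abstract from the confusion counts."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    def ratio(num, den):
        # Return 0.0 for an undefined ratio (a convention chosen for this sketch).
        return num / den if den else 0.0

    sensitivity = ratio(tp, tp + fn)
    precision = ratio(tp, tp + fp)
    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": sensitivity,
        "specificity": ratio(tn, tn + fp),
        "precision": precision,
        "f_score": ratio(2 * precision * sensitivity, precision + sensitivity),
    }


# Hypothetical toy data: 95 non-claims (0) and 5 claims (1).
labels = [0] * 95 + [1] * 5
rows = [[i] for i in range(100)]

# A model that always predicts "no claim" scores high accuracy but finds no claims.
always_no_claim = [0] * 100
metrics = binary_metrics(labels, always_no_claim)

# Oversampling brings the claim class up to the size of the non-claim class.
balanced_rows, balanced_labels = random_oversample(rows, labels)
balanced_counts = Counter(balanced_labels)
```

In practice, resampling is usually applied to the training split only, so that the test set keeps the original class proportions.
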
Missing data imputation using supervised learning methods

Behzad Rezaei Shiri, Samaneh Eftekhari Mahabadi *

Journal of Statistical Modelling: Theory and Applications, Volume:2 Issue: 2, Summer and Autumn 2021, PP 103 -112

Missing data is a very common problem in all research fields. Case deletion is a simple way to handle incomplete data sets, but it can lead to biased statistical results. A more reliable approach to handling missing values is imputation, which also accommodates covariate-dependent missing mechanisms. This paper aims to provide guidance for researchers facing missing data problems by comparing various imputation methods, including machine learning techniques, to achieve better results in supervised learning tasks. Experiments are run on a benchmark dataset, and the results are compared by applying popular classifiers over varying missing mechanisms and rates.

Keywords: Imputation, Machine learning algorithms, Missing data, Missing mechanism

Article type: Research/Original Article. Original language: English

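
The contrast between case deletion and imputation drawn in the abstract by Rezaei Shiri and Eftekhari Mahabadi can be sketched as follows. The abstract does not name the imputation methods it compares, so column-mean imputation is used here purely as a simple stand-in; the function names, the use of `None` for a missing value, and the toy table are assumptions.

```python
def drop_incomplete(rows):
    """Case deletion: keep only rows with no missing value (None)."""
    return [r for r in rows if all(v is not None for v in r)]


def mean_impute(rows):
    """Replace each missing value with the mean of the observed values
    in the same column."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        # Fall back to 0.0 if a column has no observed value (arbitrary choice).
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return [[means[j] if v is None else v for j, v in enumerate(r)] for r in rows]


# Hypothetical table with two incomplete rows.
table = [
    [1.0, 2.0],
    [None, 4.0],
    [3.0, None],
    [5.0, 6.0],
]

complete_cases = drop_incomplete(table)  # half of the rows are discarded
imputed = mean_impute(table)             # all rows are kept
```

Case deletion discards every row with a gap, while imputation keeps the sample size at the cost of replacing unknown values with estimates.
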
Extraction of coastlines from satellite images using sub-pixel algorithms

Alireza Tilkoo*, Seyed Mostafa Siadatmousavi, Barat Mojaradi

Journal of the Persian Gulf (Marine Science), Volume:9 Issue: 34, Winter 2018, PP 55 -63

Coastal environments are always under pressure from natural processes such as erosion, sedimentation, and natural disasters, as well as from human projects. These threats have made coastal areas a priority for coastline monitoring and sustainable coastal management programs. In this paper, algorithms for separating water and land boundaries, as well as new sub-pixel methods, are presented with the aim of dividing large pixels (with low resolution and spatial accuracy) into smaller pixels and creating a classified map with better spatial resolution. Different water identification indices and machine learning algorithms were investigated, and two Spatial Attraction Models were implemented. Results showed that the Sub-pixel / Sub-pixel Spatial Attraction Model had more capacity to provide higher resolution and precision, while providing a 10% reduction in error when compared with observations. To assess the skill of these two methods, the difference between the areas created by each method and the reference shoreline (a high-resolution aerial image) was computed. Also, to evaluate and demonstrate the accuracy of sub-pixel algorithms, their results should be compared against conventional classification methods. The creation of such models is proposed to support integrated coastal management in the Persian Gulf region in future studies.

Keywords: subpixel algorithms, shoreline, spatial attraction model, machine learning algorithms

Article type: Research/Original Article. Original language: English

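
The abstract by Tilkoo et al. mentions water identification indices without naming them. As an illustration only, the sketch below uses the Normalized Difference Water Index, NDWI = (Green - NIR) / (Green + NIR), which is one widely used index and is assumed here rather than taken from the paper. The band values, the threshold of 0, and the function names are likewise hypothetical, and the sub-pixel step is not shown.

```python
def ndwi(green, nir):
    """Normalized Difference Water Index for one pixel's reflectances."""
    denom = green + nir
    # Return 0.0 when both bands are zero (a convention chosen for this sketch).
    return (green - nir) / denom if denom else 0.0


def water_mask(green_band, nir_band, threshold=0.0):
    """Label a pixel as water when its NDWI exceeds the threshold."""
    return [
        [ndwi(g, n) > threshold for g, n in zip(green_row, nir_row)]
        for green_row, nir_row in zip(green_band, nir_band)
    ]


# Hypothetical 1 x 2 image: a water-like pixel and a land-like pixel.
green_band = [[0.30, 0.10]]
nir_band = [[0.10, 0.40]]

mask = water_mask(green_band, nir_band)
```

A mask like this gives one label per coarse pixel; the sub-pixel methods described in the abstract aim to refine the boundary inside such pixels.
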
Missing data imputation using supervised learning methods

Behzad Rezaei Shiri, Samaneh Eftekhari Mahabadi *

Journal of Statistical Modelling: Theory and Applications, Volume:2 Issue: 1, Winter and Spring 2021, PP 181 -190

Missing data is a very common problem in all research fields. Case deletion is a simple way to handle incomplete data sets, but it can lead to biased statistical results. A more reliable approach to handling missing values is imputation, which also accommodates covariate-dependent missing mechanisms. This paper aims to provide guidance for researchers facing missing data problems by comparing various imputation methods, including machine learning techniques, to achieve better results in supervised learning tasks. Experiments are run on a benchmark dataset, and the results are compared by applying popular classifiers over varying missing mechanisms and rates.

Keywords: Imputation, Machine learning algorithms, Missing data, Missing mechanism

Article type: Research/Original Article. Original language: English

Delineation of potassic and phyllic alteration zones based on the results of three-dimensional modeling of fluid inclusion data using artificial neural networks

Maliheh Abbaszadeh*, Ardeshir Hezarkhani, Saeed Soltani Mohammadi

Geosciences (Olum-e Zamin), Serial No. 113 (Autumn 1398 SH), pp. 115-122

Economic geology studies are today one of the common methods in the exploration of ore deposits, and modeling fluid inclusion data is one of the common methods in such studies. In this study, artificial neural networks, as one of the machine learning algorithms, were used for three-dimensional modeling of fluid inclusion data in the Sungun porphyry copper deposit and for putting the results of fluid inclusion analysis to practical use. To this end, data from fluid inclusion studies were used directly to separate the alteration zones associated with mineralization (potassic, phyllic, and potassic-phyllic) in the study area. Given the relationship between alteration zones and areas prone to mineralization in porphyry deposits, the alteration zones in the Sungun porphyry copper deposit were separated using the 173 available fluid inclusion data points, based on the three-dimensional model obtained with the artificial neural network method. Based on the model test results, the accuracy of the neural network model in separating the potassic, phyllic, and potassic-phyllic alteration zones is about 83 percent, and the model was suitably able to separate the mineralization-related alteration zones in the Sungun porphyry copper deposit.

Keywords: Fluid inclusions, Machine learning algorithm, Artificial neural network method, Alteration zones, Sungun porphyry copper deposit

Article type: Research/Original Article. Language: Persian

Potassic and Phyllic Alteration Zoning Based on the Results of 3D Modeling of Fluid Inclusion Data by Artificial Neural Networks

Maliheh Abbaszadeh *, Ardeshir Hezarkhani, Saeed Soltani Mohammadi

Geosciences Scientific Quarterly Journal, Volume:29 Issue: 113, 2019, PP 115 -122

In recent years, economic geology studies have become a very popular method in mineral exploration. Modeling fluid inclusion data is one of the common approaches in economic geology. In this research, the artificial neural network method, as one of the machine learning algorithms, is used for three-dimensional modeling and application of the results of fluid inclusion analysis in the Sungun porphyry copper deposit. For this purpose, fluid inclusion data are used directly to separate the alteration zones related to mineralization (potassic, phyllic, and potassic-phyllic). Given the relation between alteration zones and mineralized areas, the separation of alteration zones in the Sungun porphyry copper deposit is modeled with artificial neural networks based on 173 fluid inclusion data points. According to the validation studies, the accuracy of this model is adequate (83%), and the trained model can be used to separate alteration zones in the Sungun porphyry copper deposit.

Keywords: Fluid inclusion, Machine Learning Algorithms, Artificial Neural Networks Method, Alteration Zones, Sungun Porphyry Copper Deposit

Article type: Research/Original Article. Original language: Persian

Comparison of machine learning-based algorithms on the accuracy of missing data estimation in microarray experiments

Maryam Moshiri, Mostafa Ghaderi Zefrehei*, Farzan Ghane Golmohammadi

Journal of Cellular and Molecular Research (Iranian Journal of Biology), Volume 28, Issue 4 (Winter 1394 SH), pp. 612-622

Missing values in microarray data reduce the accuracy of gene regulatory network reconstruction and cause errors in clustering, in the functional classification of genes, and in other analyses. Estimating missing values is therefore an important step in preprocessing microarray data. The performance of estimation algorithms varies across datasets and across missing percentages, so choosing the most suitable algorithm to achieve the highest accuracy is of particular importance. In this study, three microarray datasets were used. After the dimensions of the expression matrix were determined and the data were normalized, different percentages of missingness were applied to the datasets. The results obtained with 11 machine learning-based algorithms were then compared to examine the accuracy of each algorithm at the applied missing percentages. According to the results, the accuracy of the algorithms depends on the dataset used, the missing percentage, and the distribution of the missing values. The number of experimental samples in a dataset can also affect the accuracy of missing value estimation algorithms. The accuracy of all algorithms decreased as the percentage of missing data increased, but the Least Square Adaptive and Local least square algorithms retained higher accuracy than the other algorithms as the missing percentage rose.

Keywords: Machine learning-based algorithms, Missing data estimation, Microarray

Language: Persian

Comparison of machine learning algorithms on missing values estimation accuracy of microarray datasets

Maryam Moshiri, Mostafa Ghaderi *, Farzan Ghanegoolmohamadi

Journal of Molecular and Cellular Research, Volume:28 Issue: 4, 2016, PP 612 -622

The existence of missing values in DNA microarray data decreases the accuracy of gene regulatory network construction and may cause mistakes in clustering and classifying gene expression for downstream analysis. Therefore, missing value imputation is a pivotal step in preprocessing DNA microarray data. Selecting a proper algorithm to obtain the most accurate missing value imputation remains an important problem. In this study, three microarray datasets were used to compare the performance of different machine learning algorithms in imputing DNA microarray missing values. After determining the dimensions of the expression data matrix and normalizing the data, different missing percentages were applied to each dataset. By running 11 machine learning algorithms on these datasets, the accuracy of each algorithm under different conditions was measured. Based on the results, the accuracy of the algorithms depended on the percentage of missing values and their distribution in the dataset. Also, the number of experimental samples in the datasets affected the accuracy of the imputation algorithms. The results showed a decreasing trend in accuracy as the percentage of missing data increased. In general, the Least Square Adaptive and Local Least Square algorithms were shown to be more robust in terms of accuracy as the percentage of missing values increased. Therefore, we suggest that these algorithms be considered for sound missing value imputation in DNA microarray data.

Keywords: Machine Learning Algorithms, Missing value estimation, DNA Microarray

Original language: Persian
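
The evaluation protocol described in the abstract by Moshiri et al. (apply a chosen missing percentage to a complete expression matrix, impute, and measure accuracy) can be sketched as below. This is an illustration under stated assumptions, not the study's procedure: row-mean imputation stands in for the 11 algorithms, which are not reproduced here, root mean squared error is an assumed accuracy measure, and the function names and toy matrix are hypothetical.

```python
import math
import random


def mask_entries(matrix, rate, seed=0):
    """Hide a fraction `rate` of the cells at random; return the masked copy
    and the list of hidden (row, column) positions."""
    rng = random.Random(seed)
    cells = [(i, j) for i, row in enumerate(matrix) for j in range(len(row))]
    hidden = rng.sample(cells, int(round(rate * len(cells))))
    masked = [list(row) for row in matrix]
    for i, j in hidden:
        masked[i][j] = None
    return masked, hidden


def row_mean_impute(masked):
    """Fill each missing cell with the mean of the observed cells in its row."""
    filled = []
    for row in masked:
        observed = [v for v in row if v is not None]
        # Fall back to 0.0 for a fully hidden row (arbitrary choice).
        mean = sum(observed) / len(observed) if observed else 0.0
        filled.append([mean if v is None else v for v in row])
    return filled


def rmse(truth, filled, hidden):
    """Root mean squared error over the hidden cells only."""
    if not hidden:
        return 0.0
    total = sum((truth[i][j] - filled[i][j]) ** 2 for i, j in hidden)
    return math.sqrt(total / len(hidden))


# Hypothetical expression matrix: 3 genes (rows) by 4 samples (columns).
expression = [
    [1.0, 1.2, 0.9, 1.1],
    [2.0, 2.1, 1.9, 2.2],
    [3.0, 2.8, 3.1, 3.3],
]

# Error of the stand-in imputer at several missing rates.
errors = {}
for rate in (0.10, 0.25, 0.50):
    masked, hidden = mask_entries(expression, rate)
    errors[rate] = rmse(expression, row_mean_impute(masked), hidden)
```

Because the true values of the hidden cells are known, the error can be compared across missing rates and, with other imputers swapped in, across algorithms.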

Note

Results are sorted by publication date.
The keyword was searched only in the keywords field of the articles. To exclude unrelated results, the search covered only articles in journals on the same subject as the source journal.