Articles related to the keyword "q-learning algorithm" in engineering and technology journals

Routing for a Network of Drones for Search and Rescue

Atefeh Vasi*, Taha Bazvand, Mohsen Nickray

Journal of Industry and University, Year 15, No. 57 (Autumn and Winter 1401 SH), pp. 87-106

Routing a network of drones for search and rescue operations is an important challenge, owing to the physical limitations of drones, adverse environmental conditions, and time constraints. This paper presents a new method for routing a network of drones using the Q-Learning algorithm. The algorithm lets drones automatically plan the best routes in complex environments and adapt to environmental changes. Simulation results show that Q-Learning can find shorter and more efficient routes than algorithms from the genetic family, which suggests it is a promising method for improving drone network routing in search and rescue operations.

Keywords: Optimization, Genetic Algorithm, Q-Learning Algorithm, Drone Routing, Network of Drones

Routing for a Network of Drones with the Aim of Search and Rescue

Atefeh Vasi *, Taha Bazvand, Mohsen Nickray

Journal of Industry and University, Volume 15, Issue 57, 2024, pp. 87-106

Network routing of drones for search and rescue operations is a critical challenge. This challenge arises from the physical limitations of drones, adverse environmental conditions, and time constraints. In this paper, a novel approach to network routing of drones using the Q-Learning algorithm is proposed. This algorithm enables drones to automatically determine optimal paths in complex environments and to adapt to environmental changes. Simulation results demonstrate that the Q-Learning algorithm can find shorter and more efficient routes than genetic algorithms. These findings highlight Q-Learning as a promising method for improving network routing of drones in search and rescue operations.

Keywords: Drone Routing, Genetic Algorithm, Q-Learning Algorithm, Network Of Drones, Optimization
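The abstract above does not describe the authors' state space, reward design, or simulator. As a rough illustration of how tabular Q-Learning can learn a route, the following is a minimal sketch on a small grid with blocked cells. The function name `q_learning_route`, the grid world, the reward values, and all hyperparameters are assumptions made for this example, not details from the paper.

```python
import random

def q_learning_route(rows, cols, start, goal, obstacles,
                     episodes=2000, alpha=0.5, gamma=0.95, epsilon=0.1, seed=0):
    """Learn a route on a grid with tabular Q-learning (illustrative only)."""
    rng = random.Random(seed)
    moves = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
    q = {(r, c): [0.0] * 4 for r in range(rows) for c in range(cols)}

    def step(state, action):
        r, c = state[0] + moves[action][0], state[1] + moves[action][1]
        if not (0 <= r < rows and 0 <= c < cols) or (r, c) in obstacles:
            return state, -5.0       # blocked move: stay put, pay a penalty
        if (r, c) == goal:
            return (r, c), 100.0     # assumed reward for reaching the goal
        return (r, c), -1.0          # per-step cost favours shorter routes

    def greedy(state):
        return max(range(4), key=lambda a: q[state][a])

    for _ in range(episodes):
        state = start
        for _ in range(4 * rows * cols):  # cap on episode length
            # epsilon-greedy action selection
            action = rng.randrange(4) if rng.random() < epsilon else greedy(state)
            nxt, reward = step(state, action)
            # Q-learning update: bootstrap from the best action in the next state
            future = 0.0 if nxt == goal else gamma * max(q[nxt])
            q[state][action] += alpha * (reward + future - q[state][action])
            state = nxt
            if state == goal:
                break

    # Follow the learned greedy policy from start to goal
    path, state = [start], start
    while state != goal and len(path) <= rows * cols:
        state, _ = step(state, greedy(state))
        path.append(state)
    return path
```

For example, `q_learning_route(4, 4, (0, 0), (3, 3), {(1, 1), (2, 2)})` returns a list of grid cells from the start to the goal that avoids the two blocked cells. A multi-drone setting would need a richer state and coordination scheme than this single-agent sketch.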

Research/Original Article. Original language: Persian

Determining the Optimal Drug Dose for Controlling the Cancer Cell Population, Considering the Drug's Harmful Effects, in Melanoma Patients Using the Eligibility Traces Method

Elnaz Kalhor, Amin Noori*, Sara Saboori Rad, Mohammad Ali Sadrnia

Journal of Soft Computing and Information Technology, Year 10, No. 1 (Spring 1400 SH), pp. 72-92

The main aim of this paper is to determine the optimal drug dose for reducing the cancer cell population in patients with melanoma. To this end, the eligibility traces method, one of the methods for solving reinforcement learning problems, is used. This method combines the advantages of two conventional reinforcement learning methods, temporal-difference learning and Monte Carlo. Another advantage is that it does not need a mathematical model; however, because implementation on a real system was not possible, a delayed nonlinear mathematical model is used to simulate the behavior of the environment and evaluate the proposed controller. According to the studies carried out so far, no control method has been implemented on this mathematical model, and this is the first time cancer cell population control has been performed for it. In optimal dose control, the amount of drug must be such that its harmful effects on healthy cells are avoided as far as possible. The simulation results show that the chosen method, by injecting a sub-optimal drug dose, controlled the cancer cell population, reduced it, and brought it to zero, while the body's immune cells increased. Finally, to show the advantage of the chosen method in reducing cancer cells faster, it is compared with the Q-learning algorithm, another method for solving reinforcement learning problems, and with the optimal control method. The performance of the proposed controller in the presence of a fault was also examined by applying a fault to the system's sensor. To examine adaptability to the environment, one of the advantages of reinforcement learning, cancer cell population control was carried out in five melanoma patients under uncertainty in the system parameters and initial conditions. The convergence speed of both the eligibility traces method and the Q-learning algorithm in reducing cancer cells was also examined for different learning rates.

Keywords: Harmful effects of drugs, Q-learning algorithm, cancer cell population control, melanoma, reinforcement learning, eligibility traces, optimal control

Using Eligibility Traces Algorithm to Specify the Optimal Dosage for the Purpose of Cancer Cell Population Control in Melanoma Patients with a Consideration of the Side Effects

Elnaz Kalhor, Amin Noori *, Sara Saboori Rad, Mohammad Ali Sadrnia

Journal of Soft Computing and Information Technology, Volume 10, Issue 1, 2021, pp. 72-92

This paper aims to determine the optimal drug dosage for reducing the cancer cell population in melanoma patients. To do so, the eligibility traces algorithm, a reinforcement learning method, is employed; it offers a compromise between two reinforcement learning algorithms, Monte Carlo and temporal difference. The approach also does not require a mathematical model. However, because implementation on a real system was not possible, a delayed nonlinear mathematical model is used to simulate the behavior of the environment and to investigate the performance of the proposed controller. No control method has previously been applied to this mathematical model, and this is the first time that cancer cell population control has been applied and tested on it. In optimal dosage control, the dose should be chosen so that side effects on healthy cells are avoided as far as possible. According to the obtained results, the eligibility traces algorithm is able to control and reduce the population of cancer cells by injecting a sub-optimal drug dose, while the number of immune cells in the body increases. Finally, to demonstrate the advantage of the selected method in reducing cancer cells faster, it is compared with the Q-learning algorithm and with optimal control. The performance of the proposed controller in reducing cancer cells was also investigated with a fault applied to the sensor. The adaptability of the proposed method to environmental changes was then checked: uncertainty was applied to the system parameters and initial conditions, and the cancer cell population was controlled in five melanoma patients. Moreover, with noise added to the system, the eligibility traces algorithm was shown to control the cancer cell population and drive it to zero. Additionally, the convergence speed of both the eligibility traces algorithm and the Q-learning algorithm in reducing the number of cancer cells was investigated for different learning rates.

Keywords: Side effects of drugs, Q-learning algorithm, cancer cells population control, Melanoma, Reinforcement Learning, Eligibility Traces, Optimal control method
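To make the "compromise between Monte Carlo and temporal difference" concrete, here is a minimal sketch of tabular TD(λ) value prediction with accumulating eligibility traces. It is not the controller from the paper, which addresses a control problem on a delayed nonlinear model; the function name `td_lambda_episode`, the toy episode, and the parameter values are assumptions for illustration. With `lam=0` only the most recent state is credited (one-step TD); with `lam=1` and `gamma=1` every visited state shares the credit, as in Monte Carlo.

```python
from collections import defaultdict

def td_lambda_episode(values, states, rewards, alpha=0.5, gamma=1.0, lam=0.5):
    """Apply one episode of online tabular TD(lambda) with accumulating traces.

    states  : visited states, ending with the terminal state
    rewards : rewards[t] is received on the transition states[t] -> states[t + 1]
    Returns a new dict of state values; the input dict is not modified.
    """
    values = dict(values)
    traces = defaultdict(float)
    last = len(states) - 1
    for t in range(last):
        s, s_next = states[t], states[t + 1]
        # The terminal state has value zero by definition
        v_next = 0.0 if t + 1 == last else values.get(s_next, 0.0)
        delta = rewards[t] + gamma * v_next - values.get(s, 0.0)  # TD error
        traces[s] += 1.0                                          # mark s as eligible
        for visited in list(traces):
            # Every eligible state receives a share of the current TD error
            values[visited] = values.get(visited, 0.0) + alpha * delta * traces[visited]
            traces[visited] *= gamma * lam                        # decay eligibility
    return values

episode_states = ["A", "B", "END"]
episode_rewards = [0.0, 1.0]
one_step = td_lambda_episode({}, episode_states, episode_rewards, lam=0.0)
full_trace = td_lambda_episode({}, episode_states, episode_rewards, lam=1.0)
```

In this toy episode the reward arrives only on the last transition. With `lam=0.0` state `A` keeps value 0 while `B` moves to 0.5; with `lam=1.0` both `A` and `B` move to 0.5, because the trace carries the final reward back to the earlier state.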

Research/Original Article. Original language: Persian

Controlling the Cancer Cell Population in a Nonlinear Melanoma Model under Uncertainty Using the Q-learning Algorithm with a Case-Based Reasoning (CBR) Policy

Amin Noori*, Elnaz Kalhor, Mohammad Ali Sadrnia, Sara Saboori Rad

Journal of Iranian Association of Electrical and Electronics Engineers, Year 17, No. 3 (Autumn 1399 SH), pp. 25-37

Skin cancer is one of the most dangerous cancers, and many people are diagnosed with it every year. Rapid diagnosis and treatment are therefore very important to physicians, and in recent decades intelligent methods for improving the diagnosis and treatment of this disease have attracted much attention. The main aim of this paper is to determine the optimal amount of drug for destroying cancer cells while preventing the drug's adverse effect on healthy cells. The Q-learning algorithm is used for this purpose. Actions are selected with a case-based reasoning (CBR) policy, a type of accelerated heuristic policy, which increases learning speed and reduces the time needed to reach the optimal policy. The paper also accounts for the drug's half-life in order to obtain the drug's effect in the patient's body at each instant. To better show the performance of reinforcement learning in controlling cancer cells and determining the optimal drug dose, the method is compared with an optimal control method, the Hamiltonian method, and with constant-dose injection. The results show that the total drug dose injected with the reinforcement learning method is much lower than with optimal control or with a constant dose at all times, while the cancer cell population is also controlled. With noise and uncertainty applied to the system parameters and initial conditions, the chosen method is still able to control the cancer cells.

Keywords: melanoma, Q-learning algorithm, case-based reasoning policy, adverse drug effects, drug half-life, optimal control

Controlling the Cancer Cells in a Nonlinear Model of Melanoma by Considering the Uncertainty Using Q-learning Algorithm Under the Case Based Reasoning Policy

Amin Noori*, Elnaz Kalhor, MohammadAli Sadrnia, Sara Saboori Rad

Journal of Iranian Association of Electrical and Electronics Engineers, Volume 17, Issue 3, 2020, pp. 25-37

Melanoma is one of the most dangerous types of cancer, and many people suffer from it every year. Hence, quick diagnosis and treatment are very important for physicians. In the recent decade, intelligent methods have attracted considerable attention for diagnosing and treating melanoma. The main objective of this paper is to determine the optimal drug dosage for eliminating cancer cells while preventing side effects of the drug on normal cells. To this end, the Q-learning algorithm is employed. To select actions, a Case-Based Reasoning (CBR) policy is used, which is an accelerated heuristic policy. This policy increases the learning speed and reduces the overall time needed to reach the optimal policy. The half-life of the drug is also considered in order to obtain the side effect of the drug on the patient's body at each time step. To demonstrate the performance of the Q-learning algorithm in controlling cancer cells and determining the optimal dosage, Q-learning is compared with two methods: fixed-dosage injection and the Hamiltonian method, which is one of the most important optimal control methods. Finally, it is shown that the total dosage injected over the whole period is significantly lower with the reinforcement learning method (Q-learning) than with optimal control or fixed-dosage injection. The number of cancer cells is controlled as well. It should be noted that, with noise and uncertainty applied to the system parameters and the initial conditions, the proposed method can still successfully control the cancer cells.

Keywords: melanoma cancer, Q-learning algorithm, case based reasoning, side effect of the drug, half-life of drug, optimal control
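The abstract above says the drug's half-life is taken into account at each time step but does not give the model. One common way to express a half-life, assumed here purely for illustration, is first-order exponential decay of the drug already in the body, with each new dose added on top. The function name `drug_levels`, the units, and the single-compartment assumption are hypothetical and are not taken from the paper.

```python
def drug_levels(doses, half_life, dt=1.0):
    """Return the drug level in the body after each time step.

    doses     : dose injected at each step (arbitrary units)
    half_life : time for the remaining drug to halve, in the same units as dt
    Assumes simple first-order elimination; this is not the paper's model.
    """
    decay = 0.5 ** (dt / half_life)    # fraction of drug remaining after one step
    level, levels = 0.0, []
    for dose in doses:
        level = level * decay + dose   # old drug decays, then the new dose is added
        levels.append(level)
    return levels

# A single dose of 8 units with a half-life of one step halves at every step
print(drug_levels([8.0, 0.0, 0.0, 0.0], half_life=1.0))  # [8.0, 4.0, 2.0, 1.0]
```

In a reinforcement learning setup of the kind the abstract describes, a quantity like this accumulated level, rather than the most recent dose alone, could feed the state or the penalty on side effects.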

Research/Original Article. Original language: Persian

Note

Results are sorted by publication date.
The keyword was searched only in the keyword field of articles. To exclude irrelevant results, the search covered only journals in the same subject area as the source journal.