object detection
In publications of the Electrical Engineering group -
Optical Character Recognition (OCR) in cursive scripts, where the letters of a word are joined in a flowing manner and overlap in both the horizontal and vertical directions, faces difficulties both in segmenting unrecognized characters and in recognizing unsegmented characters. In this paper, we propose using object detection models for character detection in cursive scripts. The simplicity of implementation and the efficiency of this method in recognizing handwriting-style fonts are investigated and discussed. Here, the YOLO model is used to separate and classify the characters of arbitrary three-letter words in Persian script as a case study. Initially, we generated synthetic datasets suitable for the YOLO network from handwriting-style Persian fonts such as Maneli and IranNastaliq. Using the YOLO model, we achieved a high precision of 98.5% in character detection for the Maneli font and 97.6% for a mixture of words in the Maneli and IranNastaliq fonts, while the accuracy for the regular font Arial was almost 100%. We then challenged the proposed model by adding noise, blur, and skew to the samples. Furthermore, we utilized a multi-layer perceptron (MLP) model to predict words from the characters detected and localized by YOLO, with an accuracy of 99.8% for the Maneli font and 97.7% for a mixture of words in the Maneli and IranNastaliq fonts, while the word detection accuracy for the regular font Arial was almost 100%. This approach enables us to recognize complete words accurately in complex handwriting-style fonts, without using a Persian vocabulary dictionary.
Keywords: Optical Character Recognition (OCR), Object Detection, YOLO Model, Multi-Layer Perceptron (MLP), Persian Script, Handwriting-Style Fonts -
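The MLP word-prediction step described in the abstract above takes characters that YOLO has detected and localized and assembles them into a word. A minimal sketch of the ordering step that would precede such a classifier, assuming a simple (label, x_center) detection format and Persian right-to-left reading order (the detection format and labels here are illustrative, not from the paper):

```python
# Hypothetical sketch: order YOLO character detections into Persian
# reading order (right to left) before feeding a word classifier.
# The (label, x_center) format and the labels are illustrative.

def order_characters(detections):
    """Sort (class_label, x_center) detections right-to-left."""
    return [label for label, _ in sorted(detections, key=lambda d: -d[1])]

# Three detected characters of a three-letter word, in arbitrary order;
# x_center is normalized to [0, 1]:
dets = [("B", 0.50), ("A", 0.85), ("C", 0.15)]
print(order_characters(dets))  # ['A', 'B', 'C'] -- rightmost character first
```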
Scientia Iranica, Volume:31 Issue: 14, Jul-Aug 2024, PP 1105-1121
Automatic License Plate Recognition (ALPR) has remained a challenging problem in recent years, owing to factors such as weather conditions, camera angle, lighting, and the variety of characters on license plates. However, thanks to advances in deep neural networks, it is now possible to use specific network architectures to recognize Iranian license plates. In the proposed method, license plate recognition is done in two steps. First, license plates are detected in the input image using the YOLOv4-tiny model, which is based on a Convolutional Neural Network (CNN). Second, the characters on the license plates are recognized using a Convolutional Recurrent Neural Network (CRNN) with Connectionist Temporal Classification (CTC). With no need to segment and label the characters separately, a single string of numbers and letters is enough for the labels. The models were trained on 3065 images of license plates and 3364 images of license plate characters. The proposed method achieves an average response time of 0.0074 seconds per image (141 frames per second) in the Darknet framework and 0.128 seconds per image in the TensorFlow framework for the License Plate Detection (LPD) part.
Keywords: YOLO, CTC, CRNN, Tensorflow, Darknet, Object Detection, Automatic License Plate Recognition
-
Providing a dataset of suitable size and accuracy is a basic requirement for training deep neural networks: a dataset that is adequate in the number and quality of its images and in labeling accuracy can greatly affect the output accuracy of the trained network. The dataset presented in this article contains 3000 images downloaded from online Iranian car sales platforms, including the Divar and Bama sites, manually labeled in three classes: car, truck, and bus. The labels comprise 5765 bounding boxes that localize the vehicles in the images with high accuracy, resulting in a unique dataset that is made available for public use.
The YOLOv8s algorithm, trained on this dataset, achieves a final precision of 91.7% on validation images. The Mean Average Precision (mAP) at a 50% threshold is 92.6%. This precision is considered suitable for city vehicle detection networks. Notably, compared to YOLOv8s trained on the COCO dataset, YOLOv8s trained on this dataset shows a 10% increase in mAP at 50% and an approximately 22% improvement in mAP over the 50% to 95% range.
Keywords: Dataset, Object Detection, YoloV8s, Vehicle Dataset, Deep Neural Network -
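Bounding-box datasets like the one above are typically annotated in the normalized YOLO label format, one `class x_center y_center width height` line per box, with all coordinates scaled to [0, 1]. A minimal sketch of the conversion from a pixel-space box, with illustrative box and image dimensions:

```python
# Sketch of converting a pixel-space bounding box to the normalized
# YOLO label format (class x_center y_center width height).
# The box coordinates and image size below are illustrative.

def to_yolo_label(cls_id, x_min, y_min, x_max, y_max, img_w, img_h):
    xc = (x_min + x_max) / 2 / img_w   # normalized box center x
    yc = (y_min + y_max) / 2 / img_h   # normalized box center y
    w = (x_max - x_min) / img_w        # normalized box width
    h = (y_max - y_min) / img_h        # normalized box height
    return f"{cls_id} {xc:.6f} {yc:.6f} {w:.6f} {h:.6f}"

# A "car" box (class 0) centered on a 1920x1080 image:
print(to_yolo_label(0, 480, 270, 1440, 810, 1920, 1080))
# 0 0.500000 0.500000 0.500000 0.500000
```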
Scientia Iranica, Volume:30 Issue: 3, May-June 2023, PP 1058-1067
During the COVID-19 pandemic, wearing a face mask has been known to be an effective way to prevent the spread of COVID-19. In many monitoring tasks, humans have been replaced by computers thanks to the outstanding performance of deep learning models. Monitoring the wearing of face masks is another task that deep learning models can perform with acceptable accuracy. The main challenge of this task is the limited amount of data available because of the quarantine. In this paper, we investigated the capability of three state-of-the-art object detection neural networks for face mask detection in real-time applications: SSD and two YOLO variants, YOLOv4-tiny and YOLOv4-tiny-3l, from which the best was selected. Based on the performance of the different models, the model best suited to real-world and mobile device applications, in comparison with other recent studies, was YOLOv4-tiny, with 85.31% mAP and 50.66 FPS. These acceptable values were achieved using two datasets with only 1531 images across three separate classes: "with mask", "without mask", and "incorrect mask".
Keywords: Covid-19, Deep Learning, Object Detection, Face Mask, convolutional neural networks
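The mAP figures quoted in these abstracts rest on intersection-over-union (IoU) matching between predicted and ground-truth boxes; mAP at a 50% threshold, for instance, counts a detection as correct only when its IoU with a ground-truth box is at least 0.5. A minimal IoU sketch for axis-aligned boxes, with illustrative coordinates:

```python
# Intersection-over-union of two axis-aligned boxes given as
# (x_min, y_min, x_max, y_max). Example coordinates are illustrative.

def iou(a, b):
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))  # intersection width
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))  # intersection height
    inter = ix * iy
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

# Two 10x10 boxes overlapping by half their width:
print(iou((0, 0, 10, 10), (5, 0, 15, 10)))  # 0.3333333333333333
```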
-
We propose a real-time YOLOv5-based deep convolutional neural network for detecting ships in video. We begin with two well-known publicly available SeaShip datasets, each containing around 9,000 images, and supplement them with our self-collected dataset of another thirteen thousand images. These images were labeled in six classes: passenger ships, military ships, cargo ships, container ships, fishing boats, and crane ships. The results confirm that YOLOv5s can localize and classify ships in real time from 135-frames-per-second video with 99% precision.
Keywords: convolutional neural network, Yolov5, object detection, ship detection
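Real-time detectors such as the YOLO variants in the abstracts above emit many overlapping candidate boxes per object and rely on non-maximum suppression (NMS) to keep only the highest-scoring one. A minimal greedy NMS sketch, with illustrative boxes, scores, and a 0.5 IoU threshold:

```python
# Greedy non-maximum suppression: keep detections in descending score
# order, dropping any box that overlaps an already-kept box by more
# than the IoU threshold. Boxes, scores, and threshold are illustrative.

def nms(boxes, scores, iou_thresh=0.5):
    def iou(a, b):
        ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
        iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
        inter = ix * iy
        union = ((a[2] - a[0]) * (a[3] - a[1])
                 + (b[2] - b[0]) * (b[3] - b[1]) - inter)
        return inter / union
    order = sorted(range(len(boxes)), key=lambda i: -scores[i])
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_thresh for j in keep):
            keep.append(i)
    return keep

boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))  # [0, 2] -- the overlapping lower-score box is dropped
```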