Articles by author:

S. Mohamadzadeh

  • E. Sahragard, H. Farsi *, S. Mohamadzadeh
    In computer vision, semantic segmentation has become an important problem with applications in fields such as autonomous driving and robotics. Image segmentation datasets, however, present substantial hurdles: high intra-class variability, such as differences across car models or building designs, and low inter-class variability, which makes it difficult to distinguish between objects such as buildings whose facades are visually identical. To address these problems, this study presents a focus-enhanced ASPP module coupled with an upgraded backbone for semantic segmentation networks. To increase the adaptability of the extracted features, the proposed framework applies attention within the multiscale ASPP module. To capture complex features efficiently, the encoder stage uses an optimized ResNet-50 backbone. In addition, data augmentation is applied to increase the robustness of the model. Experimental assessments yield an mDice of 87.82, mIoU of 79.05, and mean accuracy of 85.2 on the Stanford dataset, and an mDice of 88.91, mIoU of 80.03, and mean accuracy of 89.84 on the Cityscapes dataset, indicating that the developed technique reaches a level of accuracy considered state of the art. These findings highlight the potential of integrating attention mechanisms, ASPP modules, and upgraded ResNet structures to greatly improve semantic segmentation performance.
    Keywords: Semantic Segmentation, Efficient Channel Attention, Atrous Spatial Pyramid Pooling, Improved ResNet, Dilation Convolution
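The mDice and mIoU figures quoted in this abstract are usually computed per class as Dice = 2TP / (2TP + FP + FN) and IoU = TP / (TP + FP + FN), then averaged over classes. The following is a minimal pure-Python sketch of those definitions, not the authors' evaluation code. It assumes flat lists of integer class labels (one per pixel); the function name `dice_and_iou` and the choice to skip classes absent from both maps are illustrative assumptions.

```python
def dice_and_iou(pred, target, num_classes):
    """Return (mean Dice, mean IoU) over the classes present in pred or target.

    pred and target are flat sequences of integer class labels, one per pixel.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    dice_scores, iou_scores = [], []
    for c in range(num_classes):
        # Per-class confusion counts.
        tp = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        fp = sum(1 for p, t in zip(pred, target) if p == c and t != c)
        fn = sum(1 for p, t in zip(pred, target) if p != c and t == c)
        if tp + fp + fn == 0:
            continue  # Assumption: a class absent from both maps is skipped.
        dice_scores.append(2 * tp / (2 * tp + fp + fn))
        iou_scores.append(tp / (tp + fp + fn))

    if not dice_scores:
        raise ValueError("no class in range(num_classes) occurs in the inputs")
    mean_dice = sum(dice_scores) / len(dice_scores)
    mean_iou = sum(iou_scores) / len(iou_scores)
    return mean_dice, mean_iou
```

For a single class the two scores are related by Dice = 2·IoU / (1 + IoU), so Dice is never lower than IoU; this is consistent with the mDice values above being higher than the corresponding mIoU values.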
  • M. Sharif-Noughabi, S. M. Razavi, S. Mohamadzadeh *
    One method that has gained attention in recent years is to extract Mel-spectrogram images from speech signals and use them in speaker recognition systems. This makes it possible to apply existing image recognition methods to the task. In this paper, three-second segments of speech are chosen at random and the Mel-spectrogram image of each segment is derived. These images are fed into a proposed convolutional neural network that has been designed and optimized based on VGG-13. Compared to classifiers for similar tasks, this optimized classifier has fewer parameters, trains faster, and achieves higher accuracy. On the VoxCeleb1 dataset with 1251 speakers, it achieves a top-1 accuracy of 84.25% and a top-5 accuracy of 94.33%. In addition, various methods are employed to augment the data based on these images while keeping the nature of the speech intact, and in most cases they improve the system's performance. Data augmentation techniques such as horizontal flipping and time shifting of images, or the ES technique, increase top-1 accuracy to 91.17% and top-5 accuracy to 97.32%. Moreover, using the output of the Dropout layer of the proposed network as a feature vector when training the GMM-UBM model decreases the EER of the speaker verification system. These features reduce the EER from 9% with MFCC features to 3.5%.
    Keywords: Speaker Recognition, VGG Convolutional Neural Network, Mel-Spectrogram Images, Data Augmentation
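The segment selection and two of the augmentations named in this abstract (horizontal flip and time shift) can be illustrated with a small standard-library sketch. It is not the authors' pipeline: it assumes a spectrogram stored as a list of rows (one per Mel bin, columns being time frames), treats the time shift as circular, and all function names are hypothetical. Computing the Mel-spectrogram itself is left out.

```python
import random


def random_segment(samples, sample_rate, seconds=3.0, rng=random):
    """Pick a random contiguous segment of the given duration from a signal."""
    length = int(seconds * sample_rate)
    if len(samples) < length:
        raise ValueError("signal is shorter than the requested segment")
    start = rng.randrange(len(samples) - length + 1)
    return samples[start:start + length]


def flip_horizontal(image):
    """Mirror a spectrogram image along the time (column) axis."""
    return [row[::-1] for row in image]


def time_shift(image, shift):
    """Shift every row by `shift` frames, wrapping around at the edges.

    Assumption: the shift is circular; the abstract does not say how the
    frames that move past the edge are handled.
    """
    shifted = []
    for row in image:
        k = shift % len(row) if row else 0
        shifted.append(row[-k:] + row[:-k] if k else list(row))
    return shifted
```

A typical use would draw one segment per utterance with `random_segment`, convert it to a Mel-spectrogram image, and then add flipped or shifted copies of that image to the training set.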
  • M. Rohani, H. Farsi *, S. Mohamadzadeh
    The rapid advancement of computer vision algorithms demands efficient computational resource utilization for practical applications. This study proposes a novel framework that integrates multi-task learning (MTL) with MobileNetV3-Large networks and multi-head attention (MHA) mechanisms to simultaneously estimate facial attributes, including age, gender, race, and emotions. By employing MHA, the model enhances feature extraction and representation by focusing on multiple regions of the input image, thereby reducing computational complexity while significantly improving accuracy. The Receptive Field Enhanced Multi-Task Cascaded (RFEMTC) technique is utilized for effective preprocessing of the input data. Our methodology is rigorously evaluated on the UTKFace, FairFace, and RAF-DB datasets. We introduce a weighted loss function to balance task contributions, enhancing overall performance. Through refinement of the network architecture by analyzing branching points and optimizing the balance between shared and task-specific layers, our experimental results demonstrate significant improvements: a 7% reduction in parameters, a 3% increase in gender detection accuracy, a 5% improvement in race detection accuracy, and a 6% enhancement in emotion detection accuracy compared to single-task methods. Additionally, our proposed architecture reduces age estimation error by approximately one year on the UTKFace dataset and improves age estimation accuracy on the FairFace dataset by 5% compared to state-of-the-art approaches.
    Keywords: Facial Attribute Estimation, Convolutional Neural Network, Multi-Task Learning, Preprocessing, Multi-Head Attention
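The weighted loss mentioned in this abstract is commonly written as a weighted sum of per-task losses, L = Σ w_t · L_t. The sketch below shows only that combination step; the task names, the weight values, and the function name `weighted_multitask_loss` are assumptions for illustration, since the abstract does not give the actual weights or how they are chosen.

```python
def weighted_multitask_loss(task_losses, task_weights):
    """Combine per-task losses into one scalar as a weighted sum.

    task_losses and task_weights are dicts keyed by task name.
    """
    missing = set(task_losses) - set(task_weights)
    if missing:
        raise KeyError(f"no weight given for tasks: {sorted(missing)}")
    return sum(task_weights[name] * loss for name, loss in task_losses.items())


# Hypothetical values: a regression-style age loss is often on a larger scale
# than the classification losses, so it is given a smaller weight here.
example_losses = {"age": 4.0, "gender": 0.5, "race": 1.0, "emotion": 2.0}
example_weights = {"age": 0.25, "gender": 1.0, "race": 0.5, "emotion": 0.5}
example_total = weighted_multitask_loss(example_losses, example_weights)
```

Choosing the weights so that each weighted term has a similar magnitude keeps any single task from dominating the gradients of the shared layers.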
  • S. Fooladi, H. Farsi, S. Mohamadzadeh *
    Background and Objectives
    The increasing prevalence of skin cancer highlights the urgency for early intervention, emphasizing the need for advanced diagnostic tools. Computer-assisted diagnosis (CAD) offers a promising avenue to streamline skin cancer screening and alleviate associated costs.
    Methods
    This study endeavors to develop an automatic segmentation system employing deep neural networks, seamlessly integrating data manipulation into the learning process. Utilizing an encoder-decoder architecture rooted in U-Net and augmented by wavelet transform, our methodology facilitates the generation of high-resolution feature maps, thus bolstering the precision of the deep learning model.
    Results
    Performance evaluation metrics including sensitivity, accuracy, Dice coefficient, and Jaccard similarity confirm the superior efficacy of our model compared to conventional methodologies. The model achieves an accuracy of 96.89% for skin lesions on the PH2 database and 95.8% on the ISIC 2017 database, which compares favorably with the results of other studies. Additionally, this research shows significant improvements in three metrics: sensitivity, Dice, and Jaccard. For the PH2 database, the values are 96, 96.40, and 95.40, respectively. For the ISIC database, the values are 92.85, 96.32, and 95.24, respectively.
    Conclusion
    In image processing and analysis, numerous solutions have emerged to aid dermatologists in their diagnostic endeavors. The proposed algorithm was evaluated on the PH2 and ISIC 2017 datasets, and the results were compared to recent studies. The proposed algorithm demonstrated superior accuracy, sensitivity, Dice coefficient, and Jaccard similarity scores compared to other methods evaluated on the same database images.
    Keywords: U-Net, Segmentation, Skin Lesion, Deep Neural Networks, Medical Image
  • H. Farsi *, S. Noursoleimani, S. Mohamadzadeh, A. Barati
    Successful treatment of skin cancer depends on detecting lesions as early as possible, and segmentation of skin cancer lesions is one of the most important early steps. The classic U-Net, which is based on deep neural networks, is the most popular architecture for medical image segmentation, but it has certain shortcomings. In this work, we propose a lightweight model designed to minimize memory usage in the deeper network layers and to reduce training and testing time. We achieve this with Multi-Level Blocks, which use only 3x3 convolution operations. Additionally, we use multiple convolutions to transfer information from the encoding stage to the decoding stage, with the aim of minimizing the semantic gap between the two stages. We call this information transfer path the encoder-decoder path. Our method demonstrates outstanding performance in key metrics on the PH2 dataset and superior Accuracy and Jaccard Index on the ISIC-2017 dataset compared to the latest published methods. The Multi-Path U-Net method effectively recognizes and precisely segments complex features such as weak boundaries, shape and color irregularities, and multi-part lesions with diverse color intensities.
    Keywords: Skin Lesion Segmentation, U-Net, Convolutional Neural Networks, Medical Images
  • H. Farsi *, D. Ghermezi, A. Barati, S. Mohamadzadeh
    In recent decades, the advancement of deep learning algorithms and their effectiveness in saliency detection have garnered significant research attention. Among these methods, U-Net is widely used in computer vision and image processing. However, most previous deep learning-based saliency detection methods have focused on the accuracy of salient regions, often overlooking the quality of boundaries, especially fine boundaries. To address this gap, we developed a method to detect boundaries effectively. The method comprises two modules based on the U-Net structure: prediction and residual refinement. The refinement module improves the mask predicted by the prediction module. Additionally, a channel attention module is integrated into the refinement module to improve the refined saliency map. It helps the network obtain a more accurate estimation by focusing on the crucial and informative regions of the image, and it has a significant impact on the proposed method. The method is evaluated on five well-known saliency detection datasets and consistently outperforms the baseline method on all five.
    Keywords: Saliency Detection, Deep Learning, U-Net Network, Channel Attention Module
  • M. Rohani, H. Farsi *, S. Mohamadzadeh
    Facial feature recognition (FFR) has witnessed a remarkable surge in recent years, driven by its extensive applications in identity recognition, security, and intelligent imaging. The UTKFace dataset plays a pivotal role in advancing FFR by providing a rich dataset of facial images with accurate age, gender, and race labels. This paper proposes a novel multi-task learning (MTL) model that leverages the powerful Efficient-Net architecture and incorporates attention-based learning with two key innovations. First, we introduce an age-specific loss function that minimizes the impact of errors in less critical cases while focusing the learning process on accurate age estimation within sensitive age ranges. This innovation is trained using the UTKFace dataset and is specifically optimized to improve accuracy in age estimation across different age groups. Second, we present an enhanced attention mechanism that guides the model to prioritize features that contribute to more robust FFR. This mechanism is trained on the diverse and challenging images of UTKFace and is capable of identifying subtle and discriminative features in faces for more accurate gender, race, and age recognition. Furthermore, our proposed method achieves a 30% reduction in model parameters compared to the baseline network while maintaining accuracy. Extensive comparisons with existing state-of-the-art methods demonstrate the efficiency and effectiveness of our proposed approach. Using the UTKFace dataset as the evaluation benchmark, our model achieves a 0.62% improvement in gender recognition accuracy, a 2.35% improvement in race recognition accuracy, and a noteworthy 3.23-year reduction in mean absolute error for age estimation.
    Keywords: Age Estimation, Attention Based Learning, Convolutional Neural Network, Gender Recognition, Multi-Task Learning, Race Classification
  • A. Gheitasi, H. Farsi, S. Mohamadzadeh *
    Background and Objectives
    Freehand sketching is an easy-to-use but effective instrument for human-computer interaction. Sketches are highly abstract, which creates a domain gap between a sketch and the real image it represents. In addition to appearance information, it is believed that shape information is also very efficient in sketch recognition and retrieval.
    Methods
    In machine vision, understanding freehand sketches has grown more important with the widespread use of touchscreen devices. Most sketch recognition and retrieval methods rely on appearance information only. This article presents a hybrid network architecture, referred to as hybrid convolution, comprising two networks: S-Net (Sketch Network) and A-Net (Appearance Network). These subnetworks describe appearance and shape information. In addition, a Canonical Correlation Analysis (CCA) module is used to align the two domains, reducing the domain gap and improving sketch retrieval performance. Finally, sketch retrieval using the hybrid Convolutional Neural Network (CNN) and the CCA domain adaptation module is tested on several datasets, including Sketchy, TU-Berlin, and Flickr-15k. The experimental results demonstrate that the hybrid CNN and CCA module achieve high accuracy compared to more sophisticated methods.
    Results
    The proposed method has been evaluated in two fields: image classification and Sketch Based Image Retrieval (SBIR). The proposed hybrid convolution works better than other basic networks. It achieves a classification score of 84.44% on the TU-Berlin dataset and 82.76% on the Sketchy dataset. Additionally, in SBIR, the proposed method stands out among methods based on deep learning and outperforms non-deep methods by a significant margin.
    Conclusion
    This research presented the hybrid convolutional framework, which is based on deep learning for pattern recognition. Compared to the best available methods, hybrid network convolution has increased recognition and retrieval accuracy by around 5%. It is an efficient and thorough method that produced valid results in sketch-based image classification and retrieval on the TU-Berlin, Flickr-15k, and Sketchy datasets.
    Keywords: Sketch Based Image Retrieval (SBIR), Hybrid CNN, Domain Adaptation, Deep Learning
  • E. Ghasemi Bideskan, S.M. Razavi, S. Mohamadzadeh *, M. Taghippour
    Background and Objectives
    The recognition of facial expressions using metaheuristic algorithms is a research topic in the field of computer vision. This article presents an approach to identify facial expressions using an optimized filter developed by metaheuristic algorithms.
    Methods
    The entire process of feature extraction hinges on using a filter optimally configured by metaheuristic algorithms. Essentially, the purpose of utilizing this metaheuristic algorithm is to determine the optimal weights for feature extraction filters. Once the optimal weights for the filter have been determined by the metaheuristic algorithm, optimal filter sizes have also been determined. As an initial step, the k-nearest neighbor classifier is employed due to its simplicity and high accuracy. Following the initial stage, a final model is presented, which integrates results from both filterbank and Multilayer Perceptron neural networks.
    Results
    The method proposed in this article has been evaluated on the instances in the FER2013 database. The model achieved a recognition rate of 78%, which is superior to other algorithms and methods, while requiring less training time. In addition, the JAFFE database, which contains facial expressions of Japanese women, was used for validation. On this dataset, the proposed approach achieved a 94.88% accuracy rate, outperforming its competitors.
    Conclusion
    This article proposes a method for improving facial expression recognition by using an optimized filter, which is obtained through a metaheuristic algorithm based on the kidney algorithm (KA). In this approach, optimized filters were extracted using the kidney metaheuristic algorithm together with k-nearest neighbor and multilayer perceptron classifiers. Additionally, the optimal size and number of filters for facial expression recognition were determined in order to achieve the highest level of accuracy in the extraction process.
    Keywords: Optimal Filter, Kidney Algorithm, Nearest Neighbor Classification, Neural Network, Facial Expression Recognition
  • E. Sahragard, H. Farsi *, S. Mohamadzadeh

    Drone semantic segmentation is a challenging task in computer vision, mainly due to inherent complexities associated with aerial imagery. This paper presents a comprehensive methodology for drone semantic segmentation and evaluates its performance using the ICG dataset. The proposed method leverages hierarchical multi-scale feature extraction and efficient channel-based attention Atrous Spatial Pyramid Pooling (ASPP) to address the unique challenges encountered in this domain. In this study, the performance of the proposed method is compared to several state-of-the-art models. The findings of this research highlight the effectiveness of the proposed method in tackling the challenges of drone semantic segmentation. The outcomes demonstrate its superiority over the state-of-the-art models, showcasing its potential for accurate and efficient segmentation of aerial imagery. The results contribute to the advancement of drone-based applications, such as surveillance, object tracking, and environmental monitoring, where precise semantic segmentation is crucial. The obtained experimental results demonstrate that the proposed method outperforms these existing approaches regarding Dice, mIOU, and accuracy metrics. Specifically, the proposed method achieves an impressive performance with Dice, mIOU, and accuracy scores of 86.51%, 76.23%, and 91.74%, respectively.

    Keywords: Semantic Drone Segmentation, Hierarchical Multi-Scale Feature Extraction, Efficient Channel-Based Attention, Atrous Spatial Pyramid Pooling
  • S. Mohamadzadeh *, M. Ghayedi, S. Pasban, A. K. Shafiei
    Heart attack is one of the most serious causes of disease in the world's population and kills many people worldwide every year. Various factors are involved, such as high blood pressure, high cholesterol, abnormal pulse rate, and diabetes. Various methods have been proposed in this field; in this article, using sparse codes in the classification process achieves higher accuracy in predicting heart attacks. The proposed method consists of two parts: preprocessing and sparse code processing. Because it uses a sparse representation, the method is robust to noise and data scattering. Sparsity allows the signal to be represented in its most compact form, which improves computing speed and reduces storage requirements. To evaluate the proposed method, the Cleveland database is used, which includes 303 samples with 76 features each. Only 13 features are used in the proposed method. FISTA, AMP, DALM, and PALM classifiers are used for the classification process. The proposed method reaches its highest accuracy, 96.23%, with the PALM classifier; the accuracies for DALM, AMP, and FISTA are 95.08%, 94.11%, and 94.52%, respectively.
    Keywords: Heart attack, Classification, prediction, Machine Learning, Sparse representation
  • A. Sezavar, H. Farsi *, S. Mohamadzadeh
    Person re-identification (re-id) is one of the most critical and challenging topics in image processing and artificial intelligence. In general, person re-identification means that a person seen in the field of view of one camera can be found and tracked by other, non-overlapping cameras. Low-resolution frames, heavy occlusion in crowded scenes, and few samples for training supervised models make re-id challenging. This paper proposes a new model for person re-identification that copes with noisy frames and extracts robust features from each frame. To this end, a noise-aware system is implemented by training an auto-encoder on artificially damaged frames to overcome noise and occlusion. A model for person re-identification is then implemented based on deep convolutional neural networks. Experimental results on two real-world databases, CUHK01 and CUHK03, demonstrate that the proposed method performs better than state-of-the-art methods.
    Keywords: auto-encoder, Deep Learning, Image Hashing, person re-identification
  • M. Rohani, H. Farsi, S. Mohamadzadeh

    Facial feature recognition is an important subject in computer vision with numerous applications. The human face plays a significant role in social interaction and personology. Valuable information such as identity, age, gender, and emotions can be revealed via facial features. The purpose of this paper is to present a technique for detecting age, smile, and gender from facial images. A multi-task deep learning (MT-DL) framework was proposed that can simultaneously estimate three important features of the human face with remarkable accuracy. Additionally, the proposed approach aims to reduce the number of trainable network parameters while leveraging the combination of features from different layers to increase the overall accuracy. The conducted tests demonstrate that the proposed method outperforms recent advanced techniques in all three accuracy criteria. Moreover, it was demonstrated that multi-task learning (MTL) is capable of improving the accuracy by 1.55% in the smile task, 2.04% in the gender task, and 3.52% in the age task even with less available data, by utilizing tasks with more available data. Furthermore, the trainable parameters of the network in the MTL mode for estimating three tasks simultaneously increase only by about 40% compared to the single-task mode. The proposed method was evaluated on the IMDB-WIKI and GENKI-4K datasets and produced comparable accuracy to the state-of-the-art methods in terms of smile, age detection, and gender classification.

    Keywords: Age Detection, Convolutional Neural Networks, Gender Classification, Multi-task Learning, Smile Detection
  • S. Fooladi, H. Farsi, S. Mohamadzadeh

    Brain tumor segmentation is one of the most crucial tasks in medical image processing. Non-automatic segmentation is broadly used in clinical diagnosis and medication. However, this kind of segmentation lacks accuracy in medical images, especially for brain tumors, and provides a low level of reliability. The primary objective of this paper is to develop a methodology for brain tumor segmentation. A combination of a Convolutional Neural Network and the Fuzzy K-Means algorithm is presented to segment the lesion area of a brain tumor. The method contains three phases: image preprocessing to reduce computational complexity, attribute extraction and selection, and segmentation. First, the database images are pre-processed using adaptive filters and the wavelet transform in order to remove noise from the image and reduce the computational complexity. Then feature extraction is performed by the proposed deep neural network. Finally, the result is processed by the Fuzzy K-Means algorithm to segment the tumor region separately. The innovation of this article lies in implementing a deep neural network with optimal parameters, identifying relevant features, and removing irrelevant and repetitive features, with the aim of obtaining a subset of features that describes the problem well with minimal reduction in efficiency. This results in reduced feature sets, savings in data collection resources during operation, and overall data reduction that limits storage requirements. The proposed segmentation approach has been verified on the BRATS dataset and achieves an accuracy of 98.64%, a sensitivity of 100%, and a specificity of 99%.

    Keywords: Brain Tumor, Convolutional Neural Networks, Fuzzy K-Means, Segmentation
  • S.M. Notghimoghadam, H. Farsi *, S. Mohamadzadeh
    Background and Objectives

    Object detection has been a fundamental issue in computer vision. Research findings indicate that object detection aided by convolutional neural networks (CNNs) is still in its infancy despite having outpaced other methods.

    Methods

    This study proposes a straightforward, easily implementable, and high-precision object detection method that can detect objects with minimal error. Object detectors generally fall into one-stage and two-stage detectors. Unlike one-stage detectors, two-stage detectors are often more precise, despite performing at a lower speed. In this study, a one-stage detector is proposed, and the results indicate that its precision is sufficient. The proposed method uses a feature pyramid network (FPN) to detect objects on multiple scales. This network is combined with the ResNet-50 deep neural network.

    Results

    The proposed method is trained and tested on the Pascal VOC 2007 and MS COCO datasets. It yields a mean average precision (mAP) of 41.91 on Pascal VOC 2007 and 60.07% on MS COCO. The proposed method is also tested under additive noise: the test images of both datasets are corrupted with salt-and-pepper noise, and the mAP is obtained for noise levels of up to 50%. The investigations show that the proposed method provides acceptable results.

    Conclusion

    It can be concluded that using deep learning algorithms and CNNs and combining them with a feature pyramid network can significantly enhance object detection precision.

    Keywords: Object Recognition, Deep Learning, Convolutional Neural Networks, Object Classification
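The robustness test described in the Results corrupts test images with salt-and-pepper noise at increasing levels. A minimal standard-library sketch of that corruption is given below, under stated assumptions: a grayscale image stored as a list of rows with values in 0..255, the noise level read as the fraction of pixels replaced, and an even split between salt and pepper. The function name `add_salt_and_pepper` is hypothetical, and the paper's exact noise model may differ.

```python
import random


def add_salt_and_pepper(image, density, rng=random, low=0, high=255):
    """Return a copy of `image` with roughly `density` of its pixels replaced.

    Each selected pixel becomes `high` (salt) or `low` (pepper) with equal
    probability; all other pixels are copied unchanged.
    """
    if not 0.0 <= density <= 1.0:
        raise ValueError("density must be between 0 and 1")
    noisy = []
    for row in image:
        new_row = []
        for pixel in row:
            if rng.random() < density:
                new_row.append(high if rng.random() < 0.5 else low)
            else:
                new_row.append(pixel)
        noisy.append(new_row)
    return noisy
```

Sweeping `density` from 0 to 0.5 and re-running the detector on each corrupted test set would give a curve of mAP against noise level of the kind the abstract describes.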
  • Shadow Removal in Vehicle Detection Using ResUNet-a
    Z. Dorrani, H. Farsi, S. Mohamadzadeh

    In video analysis systems for traffic monitoring, vehicle shadows have a negative effect on performance. Shadow detection and removal are essential steps in accurate vehicle detection. In this paper, a new method is proposed for shadow detection using a novel convolutional neural network architecture. In the proposed method, the edges of the image are first extracted. Edge extraction reduces computation and accelerates the execution of the method. The background of the frame is then removed and the main features are extracted using the ResUNet-a architecture. This architecture consists of two parts, the encoder and the decoder, which detect the shadow at the decoder output and then remove it. Deep learning is used to detect shadows, which increases the accuracy of the analysis. The ResUNet-a architecture can learn complex, hierarchical, and appropriate features from the image for accurate feature detection and for discarding the irrelevant shadow, thereby outperforming conventional filters. The results show that the proposed method provides better performance on the NJDOT traffic video, highway-1, and highway-3 datasets than popular shadow removal methods. The method also improves evaluation criteria such as F-measure and runtime. The F-measure is 94% and 93% for highway-1 and highway-3, respectively.

    Keywords: Deep convolutional neural network, Deep learning, ResUNet-a, Shadow removal, Vehicle detection
  • A. Akbari, H. Farsi *, S. Mohamadzadeh
    Background and Objectives

    Video processing has been an important research concern over the last few years, and social group detection is one of the most important problems in crowd analysis. For human-like robots, detecting groups and the relationships between their members is important. A group of two or more people is considered to move together when its members move in the same direction and at the same speed.

    Methods

    In the proposed method, a deep neural network (DNN) is applied to detect social groups using parameters including Euclidean distance, proximity distance, motion causality, trajectory shape, and heat-maps. First, features are extracted for every pair of people in the video, and a feature matrix is built. Next, the DNN learns social groups from the feature matrix.

    Results

    The goal is to detect social groups of two or more individuals. The proposed method detects social groups using the DNN and the extracted features. Finally, the output of the proposed method is compared with that of other methods.

    Conclusion

    In recent years, the use of deep neural networks (DNNs) for learning and detection has increased. In this work, we used DNNs to detect social groups from extracted features. The evaluation results and the outputs on videos demonstrate the utility of DNNs with extracted features.  The author(s). This is an open access article distributed under the terms of the Creative Commons Attribution (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, as long as the original authors and source are cited. No permission is required from the authors or the publishers.

    Keywords: Social group detection, Deep neural network, Feature extraction, Video Processing
  • Z. Dorrani, H. Farsi *, S. Mohamadzadeh
    Searching and optimizing by using collective intelligence are known as highly efficient methods that can be used to solve complex engineering problems. Ant colony optimization algorithm (ACO) is based on collective intelligence inspired by ants' behavior in finding the best path in search of food. In this paper, the ACO algorithm is used for image edge detection. A fuzzy-based system is proposed to increase the dynamics and speed of the proposed method. This system controls the amount of pheromone and distance. Thus, instead of considering constant values for the parameters of the algorithm, variable values are used to make the search space more accurate and reasonable. The fuzzy ant colony optimization algorithm is applied on several images to illustrate the performance of the proposed algorithm. The obtained results show better quality in extracting edge pixels by the proposed method compared to several image edge detection methods. The improvement of the proposed method is shown quantitatively by the investigation of the time and entropy of conventional methods and previous works. Also, the robustness of the proposed method is demonstrated against additive noise.
    Keywords: Ant Colony Optimization Algorithm, Edge detection, Fuzzy System
  • A. Gheitasi, H. Farsi *, S. Mohamadzadeh
    Hand posture estimation attracts researchers because of its many applications. Hand posture recognition systems simulate hand postures by using mathematical algorithms. Convolutional neural networks have provided the best results in hand posture recognition so far. In this paper, we propose a new method to estimate the hand skeletal posture by using deep convolutional neural networks. To simplify the proposed method and make it more practical, depth information is ignored, so only simple color images of hands are used as inputs to the system. The proposed method is evaluated on two high-diversity datasets, Mixamo and RWTH, which include 43,986 and 1160 color images, respectively; 74% of these images are selected as the training set and the remaining 26% as the evaluation set. The experiments show that the proposed method provides better results in both hand posture recognition and sign language detection compared to state-of-the-art methods.
    Keywords: Deep convolutional neural network, Deep Learning, Hand Posture Recognition, Skeletal Estimation