Written Language Classification in Multilingual Documents

Abstract:
Optical character recognition (OCR) is an active area of pattern recognition. Every year, conference papers on the topic are presented in artificial intelligence, pattern recognition, image processing, machine vision, and related fields. However, because of the inherent complexity of the world's languages, there is still strong interest in recognizing text with better results. Researchers have presented many algorithms for converting text images and non-editable text into computer-editable text. Many articles assume that each written language has its own characteristics, so they can only handle documents written in a single language. In practice, many documents contain two or more languages, so document recognition systems need to identify several languages simultaneously. In this study, we selected several common languages and, based on physical characteristics extracted from them, present a written-language classification algorithm for multilingual documents. Within each class, the same features can then be extracted for character recognition. Farsi and Arabic are placed in Class 1; Chinese, Japanese, and Korean in Class 2; and English, Indonesian, and Spanish in Class 3. For each line of the document, the system must identify the class to which it belongs. The classifier is a decision tree with adaptive threshold levels. The test data are scanned documents. The recognition rate is 93.3 percent, which indicates the effectiveness of the proposed model.
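As a rough illustration of per-line classification with a threshold-based decision tree, the Python sketch below assigns each text line to one of the three classes. The abstract does not name the physical characteristics or say how the thresholds adapt, so everything specific here is an assumption: the two features (`baseline_connectivity`, `stroke_density`), the midpoint rule in `adaptive_threshold`, and the order of the tests are hypothetical and are not the authors' method.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class LineFeatures:
    # Both features are hypothetical stand-ins for the paper's
    # unspecified "physical characteristics" of a text line.
    baseline_connectivity: float  # share of the baseline covered by ink (cursive scripts score high)
    stroke_density: float         # share of ink pixels inside the line's bounding box


def adaptive_threshold(values: Sequence[float]) -> float:
    # Assumed adaptation rule: midpoint of the range observed in this document,
    # so the cut point moves with the scan instead of being a fixed constant.
    return (min(values) + max(values)) / 2


def classify_lines(lines: List[LineFeatures]) -> List[int]:
    """Return a class label (1, 2, or 3) for every line of one document."""
    if not lines:
        return []
    t_conn = adaptive_threshold([f.baseline_connectivity for f in lines])
    t_dens = adaptive_threshold([f.stroke_density for f in lines])

    labels = []
    for f in lines:
        if f.baseline_connectivity > t_conn:
            labels.append(1)  # Class 1: Farsi, Arabic
        elif f.stroke_density > t_dens:
            labels.append(2)  # Class 2: Chinese, Japanese, Korean
        else:
            labels.append(3)  # Class 3: English, Indonesian, Spanish
    return labels


if __name__ == "__main__":
    document = [
        LineFeatures(baseline_connectivity=0.9, stroke_density=0.3),
        LineFeatures(baseline_connectivity=0.2, stroke_density=0.8),
        LineFeatures(baseline_connectivity=0.1, stroke_density=0.2),
    ]
    print(classify_lines(document))
```

A midpoint threshold only makes sense when the document really mixes scripts; on a single-language page it would still split the lines, so a practical system would need a different adaptation rule or a fallback.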
Language:
Persian
Published:
Journal of Publishing, Volume:2 Issue: 5, 2013
Page:
21
magiran.com/p1363192  