Connected Component Based Word Spotting on Persian Handwritten Image Documents

Word spotting makes unindexed image documents searchable by locating one or more words in a document image, given a query word. The problem is challenging, mainly because of the large number of word classes with very small inter-class and substantial intra-class distances. In this paper, a segmentation-based word spotting method is presented for multi-writer Persian handwritten documents using attribute-based classification and label embedding. For this purpose, a hierarchical framework is proposed in which, at first, candidates are selected based on the sequence of connected components (CCs). The query word is then segmented into its constituent CCs, and document regions with a similar CC count are selected based on the distance between their CC count and that of the query word. As a result, the candidate regions are extracted. In the final phase, the query word is located only in the candidate regions of the document. A well-known Persian handwritten text dataset, namely FTH, is chosen as a benchmark for the presented method. The results show that the proposed method outperforms state-of-the-art methods, achieving 81.02 percent for unseen word class retrieval.
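
The candidate-selection step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it only counts 8-connected components in small binary images and keeps the regions whose CC count is close to that of the query. The function names, the list-of-lists image format, the choice of 8-connectivity, and the `tolerance` parameter are all assumptions made for illustration; the attribute-based classification and label-embedding stages are not shown.

```python
from collections import deque


def count_connected_components(image):
    """Count 8-connected foreground components in a binary image.

    `image` is a list of rows containing 0 (background) or 1 (ink).
    This representation is an assumption made for the sketch.
    """
    rows = len(image)
    cols = len(image[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if image[r][c] and not seen[r][c]:
                # New component found: flood-fill it with a breadth-first search.
                count += 1
                seen[r][c] = True
                queue = deque([(r, c)])
                while queue:
                    y, x = queue.popleft()
                    for dy in (-1, 0, 1):
                        for dx in (-1, 0, 1):
                            ny, nx = y + dy, x + dx
                            if (0 <= ny < rows and 0 <= nx < cols
                                    and image[ny][nx] and not seen[ny][nx]):
                                seen[ny][nx] = True
                                queue.append((ny, nx))
    return count


def select_candidates(query, regions, tolerance=0):
    """Return indices of regions whose CC count is within `tolerance`
    of the query's CC count (hypothetical selection rule)."""
    target = count_connected_components(query)
    return [
        index
        for index, region in enumerate(regions)
        if abs(count_connected_components(region) - target) <= tolerance
    ]


# Toy data: the query has two separate strokes.
query = [[1, 0, 1],
         [1, 0, 1]]
regions = [
    [[1, 1, 1], [0, 0, 0]],          # one component
    [[1, 0, 0, 1], [0, 0, 0, 1]],    # two components
    [[1, 0, 1, 0, 1]],               # three components
]
exact_matches = select_candidates(query, regions)
loose_matches = select_candidates(query, regions, tolerance=1)
```

With `tolerance=0` only the second region survives; loosening the tolerance keeps more regions at the cost of a larger search space for the final locating phase.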

Article Type:
Research/Original Article
International Journal of Nonlinear Analysis and Applications, Volume 10, Issue 2, 2019
Pages 11-21