Building Semantic Kernel for Persian Text Classification with a Small Amount of Training Data

Author(s):

Amir H. Jadidinejad, Venus Marza

Abstract:

The core idea of semantic kernels is to represent a text document by semantic features rather than by the terms that appear in it. In this article, documents are mapped into a new k-dimensional feature space by applying Singular Value Decomposition (SVD) to the term-document matrix and keeping the k singular vectors with the highest energy. The proposed semantic kernel sharply reduces the number of dimensions, which has two main consequences. First, the computational complexity of the classifier is greatly reduced. Second, the trained classifier is less sensitive to the individual input terms and can therefore classify documents effectively. Experiments on Persian documents indicate that the proposed semantic kernel clearly outperforms the well-known vector space (Bag-of-Words) kernel, especially when external semantic resources are not available and the amount of training data is small.
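The construction described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the toy term-document matrix, the choice k = 2, and the function names `fit_semantic_projection` and `semantic_kernel` are all assumptions made for the example, and the full paper may weight or normalize the projection differently.

```python
import numpy as np

def fit_semantic_projection(term_doc, k):
    """Return the top-k left singular vectors (terms x k) of a term-document matrix."""
    # Thin SVD: term_doc = U @ diag(s) @ Vt, with singular values in descending order.
    U, s, Vt = np.linalg.svd(term_doc, full_matrices=False)
    return U[:, :k]

def semantic_kernel(docs_a, docs_b, U_k):
    """Gram matrix between two document sets after projection into the k-dim space.

    Documents are stored as columns, matching the term-document layout.
    """
    za = U_k.T @ docs_a  # shape: k x n_a
    zb = U_k.T @ docs_b  # shape: k x n_b
    return za.T @ zb     # shape: n_a x n_b

# Hypothetical toy counts: 5 terms (rows) x 4 documents (columns).
A = np.array([
    [2, 1, 0, 0],
    [1, 2, 0, 0],
    [0, 0, 1, 2],
    [0, 0, 2, 1],
    [1, 0, 0, 1],
], dtype=float)

U_k = fit_semantic_projection(A, k=2)
K_semantic = semantic_kernel(A, A, U_k)  # kernel in the reduced semantic space
K_bow = A.T @ A                          # plain vector space (Bag-of-Words) kernel
```

Both `K_semantic` and `K_bow` are Gram matrices over the same four documents, so either could be handed to a kernel classifier that accepts a precomputed kernel, for example scikit-learn's `SVC(kernel="precomputed")`. The difference is that `K_semantic` has rank at most k, which is where the dimensionality reduction mentioned in the abstract shows up.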

Keywords:

Semantic Kernel, Vector Space Kernel, Support Vector Machine, Dimensionality Reduction, Text Classification

Language:

English

Published:

Journal of Advances in Computer Research, Volume 6, Issue 1, Winter 2015

Pages:

125 to 136

https://www.magiran.com/p1374082



Journal of Advances in Computer Research

Quarterly, in English

ISSN: 2345-606X

Publication of this journal has been discontinued.

License holder:

Islamic Azad University, Sari Branch

Managing director:

Dr. Homayoun Motameni

Editor-in-chief:

Dr. Ali Movaghar Rahimabadi

Journal phone: 011-33175359
