Actor Double Critic Architecture for Dialogue System

Article Type:
Research/Original Article (accredited ranking)
Abstract:
Background and Objectives
Most recent dialogue policy learning methods are based on reinforcement learning (RL). However, basic RL algorithms such as the deep Q-network (DQN) have drawbacks in environments with large state and action spaces, such as dialogue systems. Most policy-based methods are slow, because estimating the value of each action requires computing the sum of discounted rewards. In value-based RL methods, function approximation errors lead to overestimation in value estimates and, ultimately, to suboptimal policies. Some works try to resolve these problems by combining RL methods, but most were applied in game environments or focused only on combining DQN variants. This paper presents, for the first time in a dialogue system, a new method that combines actor-critic and double DQN, named Double Actor-Critic (DAC), which significantly improves the stability, speed, and performance of dialogue policy learning.
Methods
In the actor-critic, to overcome the slow learning of plain DQN, the critic unit approximates the value function and evaluates the quality of the policy used by the actor, so the actor can learn the policy faster. Moreover, to overcome DQN's overestimation issue, double DQN is employed. Finally, for a smoother update, a heuristic loss is introduced that chooses the minimum of the actor-critic and double DQN losses.
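The three ingredients above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names, the squared-TD-error loss forms, and the scalar per-transition formulation are assumptions made for clarity; the abstract only specifies that double DQN supplies the target and that the final heuristic takes the minimum of the two losses.

```python
import numpy as np

def double_dqn_target(q_online_next, q_target_next, reward, gamma):
    # Double DQN: the online network selects the next action, the target
    # network evaluates it. Decoupling selection from evaluation reduces
    # the overestimation bias of vanilla DQN.
    best_action = int(np.argmax(q_online_next))
    return reward + gamma * q_target_next[best_action]

def double_dqn_loss(q_value, target):
    # Squared TD error for the action actually taken.
    return (q_value - target) ** 2

def actor_critic_loss(log_prob, value, target):
    # Critic: squared TD error of the state value. Actor: policy gradient
    # weighted by the TD error as an advantage estimate. The critic's fast
    # value feedback is what speeds up policy learning relative to plain
    # Monte Carlo return estimation.
    advantage = target - value
    critic_loss = advantage ** 2
    actor_loss = -log_prob * advantage
    return actor_loss + critic_loss

def dac_loss(ac_loss, ddqn_loss):
    # Heuristic combination: take the smaller of the two losses
    # for a smoother update.
    return min(ac_loss, ddqn_loss)

# Example transition: reward 1.0, discount 0.9, two next-state actions.
target = double_dqn_target(
    q_online_next=np.array([1.0, 3.0]),
    q_target_next=np.array([0.5, 2.0]),
    reward=1.0, gamma=0.9,
)  # online net picks action 1; target net values it at 2.0
ddqn = double_dqn_loss(q_value=2.0, target=target)
ac = actor_critic_loss(log_prob=-0.5, value=2.0, target=target)
combined = dac_loss(ac, ddqn)
```

In practice `q_value`, `value`, and `log_prob` would come from neural networks and the losses would be minimized over mini-batches; the minimum-loss heuristic then damps whichever of the two signals is currently larger.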
Results
Experiments on a movie ticket booking task show that the proposed method learns more stably, without the performance drop that follows overestimation, and reaches the learning threshold in fewer episodes.
Conclusion
Unlike previous works, which mostly focused on combining DQN variants, this study combines DQN variants with actor-critic to benefit from both policy-based and value-based RL methods and to overcome their two main issues: slow learning and overestimation. Experimental results show that the proposed method, as a dialogue policy learner, can carry out a more accurate conversation with a user.
Language:
English
Published:
Journal of Electrical and Computer Engineering Innovations, Volume:11 Issue: 2, Summer-Autumn 2023
Pages:
363 to 372
magiran.com/p2585768  