Online Learning for Imbalanced Data Streams with Concept Drift by Belief Theory and Chaotic Function

Message:
Article Type:
Research/Original Article (دارای رتبه معتبر)
Abstract:

Continual learning from data streams is a pivotal aspect of machine learning, requiring the development of algorithms capable of adapting to incoming data. However, the ongoing evolution of data streams presents a formidable challenge as previously acquired knowledge may become outdated. This challenge, known as concept drift, demands timely detection for the effective adaptation of learning models. While various drift detectors have been proposed, they often assume a relatively balanced class distribution. In scenarios with imbalanced data streams, these detectors may exhibit bias toward majority classes, overlooking shifts in minority classes. Moreover, the imbalance among classes can change over time, with roles shifting between majority and minority classes, especially when relationships among classes become complex due to overlapping regions. In this paper, a novel classification method is introduced for imbalanced streaming data affected by concept drift. The proposed method continuously monitors arriving streams to detect and adapt to both imbalances and concept drift. Upon receiving a new block of data, the proposed method employs the k-means clustering approach to identify non-dense regions and performs oversampling for minority classes. Cluster centers are selected using the belief function to address overlapping issues between majority and minority classes. Utilizing a chaotic approach, the new sample is added based on its neighborhood and the size of that neighborhood. Subsequently, concept drift detection is conducted using three pre-defined thresholds that cover time intervals and classification errors. Finally, the label prediction process is done by ensemble learning and weighted majority voting. Experiments conducted on benchmark datasets from the UCI database evaluate the performance of the proposed method using Leave-One-Out (LOO) validation and comparisons with state-of-the-art methods. The results demonstrate the superiority of the proposed method across various evaluation criteria, highlighting its effectiveness in addressing imbalanced streaming data with concept drift.

Language:
Persian
Published:
Signal and Data Processing, Volume:20 Issue: 4, 2024
Pages:
23 to 34
https://www.magiran.com/p2710841  
سامانه نویسندگان
  • Corresponding Author (1)
    Javad Hamidzadeh
    Associate Professor Computer Engineering, Sajad University of Technology, Mashhad, Iran
    Hamidzadeh، Javad
اطلاعات نویسنده(گان) توسط ایشان ثبت و تکمیل شده‌است. برای مشاهده مشخصات و فهرست همه مطالب، صفحه رزومه را ببینید.
مقالات دیگری از این نویسنده (گان)