Big Data Analysis in Healthcare: Apache Hadoop, Apache Spark and Apache Flink

Author(s):

Elham Nazari , Mohammad Hasan Shahriari , Hamed Tabesh*

Article Type:

Review Article (no accredited rank)

Abstract:

Introduction

Health care data is growing rapidly. Analyzed correctly, it can improve the quality of care and reduce costs. Such data is characterized by high volume, variety, and a high rate of generation, among other features, which makes it impractical to analyze with ordinary hardware and software platforms. Choosing the right platform for managing this kind of data is therefore very important.
The purpose of this study is to introduce and compare the most popular and widely used platform for processing Big Data, Apache Hadoop MapReduce, with two platforms that have recently gained great prominence, Apache Spark and Apache Flink.

Material and Methods

This study is a survey based on subject searches of the ProQuest, PubMed, Google Scholar, ScienceDirect, Scopus, IranMedex, Irandoc, Magiran, ParsMedline and Scientific Information Database (SID) databases, as well as web reviews and specialized books, using related, standard keywords. In total, 80 articles related to the subject of the study were reviewed.

Results

The findings showed that the studied platforms differ in features such as data processing, support for different languages, processing speed, computational model, memory management, optimization, latency, fault tolerance, scalability, performance, compatibility, and security. Overall, Apache Hadoop offers simplicity, error detection, and cluster-based scalability management, but because it is built on batch processing it is slow for complex analyses and does not support stream processing. Apache Spark is a distributed computational platform that can process a Big Data set in memory with a very fast response time. Apache Flink allows users to store data in memory and load it multiple times, and provides a sophisticated fault tolerance mechanism that continuously retrieves the state of the data stream.
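
To make the difference between batch and stream processing concrete, here is a minimal sketch in plain Python. It does not use Hadoop, Spark, or Flink, and it is not taken from the study; the sample readings and the function names `batch_mean` and `streaming_mean` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

# Hypothetical sample data: (ward, heart_rate) readings, not from the study.
READINGS: List[Tuple[str, int]] = [
    ("icu", 88), ("er", 102), ("icu", 92), ("er", 110), ("icu", 90),
]


def batch_mean(records: Iterable[Tuple[str, int]]) -> Dict[str, float]:
    """Batch style: the whole data set is read before any result is produced."""
    # Map and shuffle: group the values by key.
    groups: Dict[str, List[int]] = defaultdict(list)
    for ward, rate in records:
        groups[ward].append(rate)
    # Reduce: collapse each group into a single value.
    return {ward: sum(values) / len(values) for ward, values in groups.items()}


def streaming_mean(
    records: Iterable[Tuple[str, int]],
) -> Iterator[Tuple[str, float]]:
    """Stream style: running state is updated and a result emitted per event."""
    state: Dict[str, Tuple[int, int]] = {}  # ward -> (running sum, count)
    for ward, rate in records:
        total, count = state.get(ward, (0, 0))
        state[ward] = (total + rate, count + 1)
        yield ward, state[ward][0] / state[ward][1]


if __name__ == "__main__":
    print(batch_mean(READINGS))
    for update in streaming_mean(READINGS):
        print(update)
```

The batch function returns one answer per key only after consuming all input, while the streaming function yields an updated answer after every record. The per-key running state it holds is the kind of state a stream processor would need to checkpoint in order to recover from a failure.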

Conclusion

The application of Big Data analysis and processing platforms varies according to need. In other words, the technologies are complementary: each is applicable in a particular field, and they cannot be considered in isolation from one another. Depending on the purpose and the expected outcome, either a platform must be selected for the analysis or custom tools must be designed on top of these platforms.

Keywords:

Big Data Analysis , Apache Hadoop , Apache Spark , Apache Flink , Healthcare

Language:

English

Published:

Frontiers in Health Informatics, Volume:8 Issue: 1, 2019

Page:

magiran.com/p2057138

Frontiers in Health Informatics

English-language annual in the field of medicine and paramedicine (health technology management)

eISSN: 2676-7104

Published as the Iranian Journal of Medical Informatics until 2018.

License holder, managing director and editor-in-chief:

Dr. Mostafa Langarizadeh

Journal phone: 021-44670740

Corresponding Author (3)

Tabesh, Hamed

Assistant Professor
