Document Type
Article
Publication Date
10-11-2019
Keywords
batch production systems, computer architecture, dataflow architectures, data stream architectures, distributed databases, distributed processing systems comparison, pipelines, real-time systems, taxonomy
Abstract
Big data processing systems are evolving to be more stream-oriented: each data record is processed as it arrives, on a continuous basis, by distributed, low-latency computational frameworks. As stream processing technology matures and more organizations invest in digital transformation, new applications of stream analytics will be identified and implemented across a wide spectrum of industries. One challenge in developing a streaming analytics infrastructure is selecting the right stream processing framework for each use case. To address this issue, in this paper we present a taxonomy, a comparative study of distributed data stream processing and analytics frameworks, and a critical review of representative open-source (Storm, Spark Streaming, Flink, Kafka Streams) and commercial (IBM Streams) distributed data stream processing frameworks. The paper also reports our ongoing work on a multilevel streaming analytics architecture that can serve as a guide for organizations and individuals planning to implement a real-time data stream processing and analytics framework.
Faculty
Sheridan Research
Journal
IEEE Access
Volume
7
First Page
154300
Last Page
154316
Version
Publisher's version
Peer Reviewed/Refereed Publication
yes
Copyright
© Haruna Isah, Tariq Abughofa, Sazia Mahfuz, Dharmitha Ajerla, Farhana Zulkernine & Shahzad Khan 2019
Terms of Use
Terms of Use for Works posted in SOURCE.
Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.
Original Publication Citation
Isah, H., Abughofa, T., Mahfuz, S., Ajerla, D., Zulkernine, F., and Khan, S. (2019). A Survey of Distributed Data Stream Processing Frameworks. IEEE Access, 7, 154300-154316. https://doi.org/10.1109/ACCESS.2019.2946884
SOURCE Citation
Isah, Haruna; Abughofa, Tariq; Mahfuz, Sazia; Ajerla, Dharmitha; Zulkernine, Farhana; and Khan, Shahzad, "A Survey of Distributed Data Stream Processing Frameworks" (2019). Publications and Scholarship. 7.
https://source.sheridancollege.ca/centres_publications/7
Included in
Computer and Systems Architecture Commons, Computer Sciences Commons, Electrical and Electronics Commons