One of today's major research trends in the field of information systems is the discovery of implicit knowledge hidden in datasets that are now produced at high speed, in large volumes, and in a wide variety of formats. Data with these features are called big data. Extracting, processing, and visualizing such huge amounts of data has become one of the main concerns of data science scholars.
The impact of big data on information analysis can be traced to four areas: data extraction and processing, data analysis, data storage, and data visualization. Various studies have proposed different categorizations of big data processing. For example, Hashim et al. divide big data processing into two categories: batch and real time. These two categories, which are now standard in any comprehensive big data solution, are also described in Abawajy's studies: batch processing refers to offline processing, whereas real-time processing is usually used to analyze streaming data without needing to store the data on disk. As data flows in from various sources, it is analyzed and processed in real time for immediate insight. Because today's world changes rapidly and survival in a competitive environment requires instant decision-making based on flows of data, streaming data analysis is becoming increasingly important.
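The distinction between the two categories can be illustrated with a minimal Python sketch. The function names, the hashtag-counting task, and the sample messages are assumptions made purely for illustration; they are not taken from the cited studies or from the system described in this article.

```python
from collections import Counter
from typing import Iterable, Iterator


def batch_count(stored_messages: list[str]) -> Counter:
    """Batch (offline) style: the full dataset is already stored and is
    processed in a single pass after collection has finished."""
    counts = Counter()
    for message in stored_messages:
        counts.update(w for w in message.split() if w.startswith("#"))
    return counts


def stream_count(messages: Iterable[str]) -> Iterator[Counter]:
    """Streaming style: a running result is updated as each message arrives
    and a snapshot is emitted immediately; the messages are not kept."""
    counts = Counter()
    for message in messages:
        counts.update(w for w in message.split() if w.startswith("#"))
        yield counts.copy()  # snapshot that a dashboard could render at once


# Hypothetical sample messages standing in for a stream of posts.
sample = ["learning #storm today", "#storm and #d3 dashboards", "plain message"]

batch_result = batch_count(sample)
snapshots = list(stream_count(iter(sample)))
```

Both functions arrive at the same final counts. The difference is that the streaming version makes an intermediate result available after every message, which is the property a real-time dashboard relies on.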
One of the most valuable sources of streaming data is the data generated by users of social networks such as Twitter. Social network data are a very rich source for analysis because they reflect the opinions of their users.
Given the above, and since previous studies such as Flash's have focused more on batch analysis (offline data), this study investigates a variety of tools and infrastructures for big streaming data and then designs a real-time dashboard based on streaming data from the Twitter social network.
This article addresses two research questions: 1) How can a real-time dashboard based on social network data be designed and implemented? 2) Which configurations are best suited for real-time dashboard analysis and visualization?
In other words, the purpose of this article is to provide a solution for extracting and visualizing streaming data from the Twitter social network without relying on a database, as an example of real-time big data analysis. In this research, we used Twitter streaming data as the input, Apache Storm as the processing platform, and D3.js as the visualization tool.
Finally, the designed dashboard was evaluated under various Apache Storm configurations using the Design of Experiments method and other statistical tests. The results showed that the dashboard operates in real time, with an average response time of 1 minute and 30 seconds.
Type of Study: Research
Subject:
Paper Received: 2018/01/4 | Accepted: 2019/10/5 | Published: 2020/04/20 | ePublished: 2020/04/20