Analyzing social media data and performance comparison with traditional database, data warehouse, and MapReduce approaches

Date
2020-01-01
Authors
Xu, Wei
Major Professor
Joseph Zambreno
Organizational Unit
Electrical and Computer Engineering

The Department of Electrical and Computer Engineering (ECpE) has two focus areas. The Electrical Engineering focus covers fields such as control systems, electromagnetics and non-destructive evaluation, microelectronics, and electric power and energy systems. The Computer Engineering focus covers fields such as software systems, embedded systems, networking, information security, and computer architecture.

History
The Department of Electrical Engineering was formed in 1909 when the Department of Physics and Electrical Engineering was divided. In 1985 it was renamed the Department of Electrical Engineering and Computer Engineering, and in 1995 it became the Department of Electrical and Computer Engineering.

Dates of Existence
1909-present

Historical Names

  • Department of Electrical Engineering (1909-1985)
  • Department of Electrical Engineering and Computer Engineering (1985-1995)

Abstract

Data warehousing, OLAP technology, and distributed analysis show great potential for improving business analysis, trend prediction, and decision making. With the assistance of data mining techniques, databases can also be a useful tool for analyzing societal trends using data gathered from social media networks. Because these networks contain huge amounts of text data, they can serve as a platform for testing text mining technologies and for discovering which trends or topics concern people most during a given time period. This project uses a data set of more than 2 million tweets generated from May to June 2019, with content and location data. After applying data cleaning techniques, we established a data cube and performed various analyses based on location. Our results show that Twitter users' preferences and usage frequency vary significantly by location. Ultimately, this project provides a case study in using database, data warehouse, and distributed analysis technology to analyze social media, and offers some insight into trending topics of interest. This work could be applied by those interested in gaining a better understanding of social media users.
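
As an illustration of the kind of location-based data cube the abstract describes, the following is a minimal Python sketch, not the project's actual implementation. The record fields (`text`, `location`, `date`), the ISO date format, the choice of hashtag mentions as the measure, and the helper names `clean` and `build_cube` are all assumptions made for this example; the abstract does not specify the schema or the cleaning steps used.

```python
import re
from collections import Counter
from itertools import product

ALL = "*"  # marker for a rolled-up (aggregated) dimension


def clean(tweet):
    """Return (location, month, hashtags), or None if the record is unusable.

    Assumes hypothetical fields 'text', 'location', and an ISO 'date'
    such as '2019-05-03'.
    """
    location = (tweet.get("location") or "").strip()
    text = tweet.get("text") or ""
    date = tweet.get("date") or ""
    if not location or len(date) < 7:
        return None  # drop tweets without a usable location or date
    text = re.sub(r"https?://\S+", "", text).lower()  # strip URLs, normalize case
    hashtags = sorted(set(re.findall(r"#(\w+)", text)))
    return location, date[:7], hashtags


def build_cube(tweets):
    """Count hashtag mentions per (location, month, hashtag) cell.

    Every cell is also rolled up along each dimension, so the cube
    answers queries at any level of aggregation.
    """
    cube = Counter()
    for tweet in tweets:
        cleaned = clean(tweet)
        if cleaned is None:
            continue
        location, month, hashtags = cleaned
        for tag in hashtags:
            # Update the base cell and all seven of its roll-ups.
            for cell in product((location, ALL), (month, ALL), (tag, ALL)):
                cube[cell] += 1
    return cube


# Toy records invented for illustration only.
tweets = [
    {"text": "Love the #weather today http://t.co/x", "location": "Iowa", "date": "2019-05-03"},
    {"text": "#Weather is wild #storm", "location": "Iowa", "date": "2019-06-10"},
    {"text": "#storm warning", "location": "Texas", "date": "2019-06-11"},
    {"text": "no location here #storm", "location": "", "date": "2019-06-12"},
]

cube = build_cube(tweets)
print(cube[("Iowa", ALL, "weather")])   # mentions of #weather in Iowa, all months
print(cube[(ALL, "2019-06", "storm")])  # mentions of #storm in June, all locations
```

Precomputing every roll-up trades memory for constant-time lookups; at the scale of millions of tweets, a data warehouse or a MapReduce job would typically compute these aggregates instead of an in-memory counter.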

Copyright
2020