Project topic the election of 2016 seems to be a very


Project Proposal

Objectives: To get more familiar with the Hadoop Ecosystem (Spark specifically) and Scala Programming.

Resources: Files of tweets about the election

Project Topic: The 2016 election appears to be a very controversial one. Discussion about both camps keeps increasing, and both camps tried their best to gain as many votes as possible. Voters have expressed their opinions online as well. The purpose of this project is to find the hashtags used in tweets about the 2016 election and to determine which ones were trending most strongly over a specific period.
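The core of this idea, extracting hashtags and ranking them per day, can be sketched in plain Scala before moving it onto Spark. This is only a minimal sketch under stated assumptions: the `Tweet` case class, its `day` and `text` fields, and the sample data are hypothetical and do not reflect the actual schema of the tweet files. In a Spark job, a function like `extractHashtags` could be passed to `flatMap` on an RDD and the counting done with `reduceByKey`; the version below uses only standard-library collections so the logic can be checked locally first.

```scala
object HashtagTrends {
  // Hypothetical minimal record; the real tweet JSON will have many more fields.
  final case class Tweet(day: String, text: String)

  // '#' followed by one or more word characters (ASCII letters, digits, underscore).
  private val HashtagPattern = "#(\\w+)".r

  /** Extracts hashtags from a tweet's text, lowercased so #Vote and #vote count together. */
  def extractHashtags(text: String): List[String] =
    HashtagPattern.findAllMatchIn(text).map(_.group(1).toLowerCase).toList

  /** Counts hashtag occurrences per day and keeps the n most frequent for each day. */
  def topHashtagsPerDay(tweets: Seq[Tweet], n: Int): Map[String, List[(String, Int)]] = {
    // One (day, tag) pair per hashtag occurrence.
    val pairs = tweets.flatMap(t => extractHashtags(t.text).map(tag => (t.day, tag)))
    pairs
      .groupBy { case (day, _) => day }
      .map { case (day, dayPairs) =>
        val counts = dayPairs
          .groupBy { case (_, tag) => tag }
          .map { case (tag, occurrences) => (tag, occurrences.size) }
        // Sort by descending count, then by tag name so ties are deterministic.
        val ranked = counts.toList.sortBy { case (tag, count) => (-count, tag) }
        day -> ranked.take(n)
      }
  }

  def main(args: Array[String]): Unit = {
    // Made-up sample tweets, for illustration only.
    val sample = Seq(
      Tweet("2016-11-07", "Go vote tomorrow! #Election2016 #Vote"),
      Tweet("2016-11-07", "Polls open at 7am #election2016"),
      Tweet("2016-11-08", "Long lines today #Vote #ElectionDay"),
      Tweet("2016-11-08", "Results coming in #ElectionDay")
    )
    topHashtagsPerDay(sample, 2).toList.sortBy(_._1).foreach { case (day, tags) =>
      println(s"$day: ${tags.mkString(", ")}")
    }
  }
}
```

Grouping by day is one possible reading of "a specific period"; a coarser or finer bucket (week, hour) would only change how the `day` key is derived.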

Feedback from instructor: xxx,

This is an OK start, but you will need to go far beyond just looking at hashtags in tweets, since that is comparable to a homework assignment, and the project should be 10 times as big.

The tweets are in: hadoop fs -ls /ss/json/*

Final Paper Requirements

  • 10-15 page paper (+ code)
  • Due on last day of class, as a hard copy
  • Should cover:
      • data source
      • data description and schema
      • data pre-processing required (parsing, filtering, etc.)
      • any bad data issues
      • your Spark algorithm
      • description of any other ecosystem or additional tools
      • output description
      • how did you verify that your output is correct?
      • performance/scale characteristics
      • what would you have done differently if you did this again?
      • conclusions
