Implement one executable Hadoop MapReduce job


Assignment Task: Big Data

You are required to submit your work via the dedicated Unihub assignment link by the specified deadline. This link will time out at the submission deadline, and work emailed after that point may not be accepted. You are therefore strongly advised to allow plenty of time to upload your work before the deadline.

You are required to solve the tasks illustrated below. Each task should be accompanied by:

- A short introduction where you describe the problem and your high-level solution.
- Your step-by-step process, supported by screenshots. Each screenshot needs to be accompanied by a short explanatory text.

Finally, if necessary, conclude each task with a brief summary of what you have done.

Your submission needs to be unique

When solving your tasks, you are required to name your files using your first name (e.g., if your name is Alice, include "Alice" in your Task 1 file name) so as to make your submission unique. Your explanatory text also needs to be unique.

Tasks:

Follow the lab instructions to install Apache Hadoop into a virtual server running on Linux Ubuntu Server. Once you have Apache Hadoop installed and running, execute the following tasks.

Task 1:

Implement one executable Hadoop MapReduce job that counts the total number of words having an even and an odd number of characters. As an example, if the input text is

Hello world

the output should be odd = 2 (and even = 0), because both Hello and world contain an odd number of characters. Whereas, if the input is

My name is Alice

the output should be even = 3, odd = 1.

The job needs to be executed by a mapper and a reducer. Both the mapper and the reducer need to be written in Python and tested on Linux Ubuntu before running them on Hadoop MapReduce.
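One possible shape for this job, written as Hadoop-Streaming-style functions so the logic can be tested locally on Ubuntu first. In a real submission the mapper and reducer would live in two separate scripts reading from stdin (file names such as mapper.py and reducer.py are illustrative, not mandated by the brief):

```python
#!/usr/bin/env python3
# Sketch of Task 1 as Hadoop-Streaming-style functions. In an actual
# Streaming job, mapper and reducer are separate scripts reading
# stdin line by line; the per-record logic is the same as below.
import string
from itertools import groupby
from operator import itemgetter

def mapper(lines):
    """Emit ('even', 1) or ('odd', 1) for every word in the input."""
    for line in lines:
        for word in line.split():
            word = word.strip(string.punctuation)  # "world," -> "world"
            if word:
                yield ("even" if len(word) % 2 == 0 else "odd", 1)

def reducer(sorted_pairs):
    """Sum the counts per key. Hadoop sorts mapper output by key
    before the reduce phase, so equal keys arrive consecutively."""
    for key, group in groupby(sorted_pairs, key=itemgetter(0)):
        yield key, sum(count for _, count in group)

if __name__ == "__main__":
    # Local shuffle simulation: sort the mapper output, then reduce.
    for key, total in reducer(sorted(mapper(["My name is Alice"]))):
        print(f"{key}\t{total}")  # prints: even 3, then odd 1
```

As separate scripts, the same logic can be tested in the shell with a pipeline such as `cat input.txt | ./mapper.py | sort | ./reducer.py` before submitting the two files to Hadoop Streaming.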

Task 2:

Implement one executable Hadoop MapReduce job that receives as input a .csv table with the structure 'StudentId, Module, Grade' and returns as output the minimum and maximum grade of each student, along with the total number of modules that student has passed.

Therefore, if your input is:

The job needs to be executed by a mapper and a reducer. Both the mapper and the reducer need to be written in Python and tested on Linux Ubuntu before running them on Hadoop MapReduce.
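A minimal sketch of the pair of scripts for this task, again as locally testable functions. Note the pass mark of 40 is an assumption, since the brief does not state the threshold:

```python
#!/usr/bin/env python3
# Sketch of Task 2: per-student minimum grade, maximum grade, and
# count of modules passed. PASS_MARK = 40 is an assumption -- the
# brief does not state the pass threshold; adjust as required.
import csv
from itertools import groupby
from operator import itemgetter

PASS_MARK = 40

def mapper(lines):
    """Emit (StudentId, Grade) for each 'StudentId,Module,Grade' row."""
    for row in csv.reader(lines):
        if not row or row[0].strip().lower() == "studentid":
            continue  # skip blank lines and an optional header row
        student, _module, grade = (field.strip() for field in row)
        yield student, int(grade)

def reducer(sorted_pairs):
    """Yield (StudentId, min, max, passed count); input sorted by key,
    as Hadoop's shuffle guarantees."""
    for student, group in groupby(sorted_pairs, key=itemgetter(0)):
        grades = [grade for _, grade in group]
        passed = sum(1 for grade in grades if grade >= PASS_MARK)
        yield student, min(grades), max(grades), passed
```

The same local pipeline test (mapper, then sort, then reducer) applies before running the job under Hadoop Streaming.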

Task 3:

Implement one executable Hadoop MapReduce job that receives as input two .csv tables having the structure:

User: UserId, Name, DOB

Follows: UserIdFollower, UserIdFollowing

The MapReduce job needs to perform the following SQL query:

Therefore, if the two original tables are:

The final table needs to be

The job needs to be executed by a mapper and a reducer. Both the mapper and the reducer need to be written in Python and tested on Linux Ubuntu before running them on Hadoop MapReduce.
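Since the SQL query itself is not reproduced above, the sketch below shows only the generic reduce-side join pattern on which a query over these two tables would be built: records from both tables are keyed by UserId and tagged with their table of origin, and the reducer matches them up. (Resolving both the follower and the followed user to names would need a second MapReduce pass; in a single Streaming mapper script, the two tables are typically told apart by field count or input file name.) The mapper here takes the two tables as separate arguments purely for local testing:

```python
#!/usr/bin/env python3
# Sketch of a reduce-side equi-join of Follows against User on UserId.
# The brief's actual SQL query is not shown above, so this is the
# generic join pattern, not a claim about the required output table.
import csv
from itertools import groupby
from operator import itemgetter

def mapper(user_lines, follows_lines):
    """Tag each record with its source table, keyed by UserId."""
    for user_id, name, _dob in csv.reader(user_lines):
        yield user_id.strip(), ("USER", name.strip())
    for follower, following in csv.reader(follows_lines):
        # Key by the follower so the reducer can attach the follower's name.
        yield follower.strip(), ("FOLLOWS", following.strip())

def reducer(sorted_pairs):
    """For each user, pair their name with each UserId they follow."""
    for _user_id, group in groupby(sorted_pairs, key=itemgetter(0)):
        name, followed_ids = None, []
        for _, (tag, value) in group:
            if tag == "USER":
                name = value
            else:
                followed_ids.append(value)
        for followed_id in followed_ids:
            yield name, followed_id
```

Adapt the reducer's output (and add a second pass if names are needed on both sides) once the exact query is fixed.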


