ITECH 2201 Cloud Computing - what is the value of


Part A

Exercise 1: Data Science

Read the article at https://datascience.berkeley.edu/about/what-is-data-science/ and answer the following:

What is Data Science?

According to IBM's estimate, what percentage of the data in the world today has been created in the past two years?

What is the value of petabyte storage?

For each course, both foundation and advanced, listed at https://datascience.berkeley.edu/academics/curriculum/, briefly state (in 2 to 3 lines) what it offers, based on the given course description as well as the video.

Exercise 2: Characteristics of Big Data

Read the following research paper from the IEEE Xplore Digital Library:
Ali-ud-din Khan, M.; Uddin, M.F.; Gupta, N., "Seven V's of Big Data understanding Big Data to extract value," 2014 Zone 1 Conference of the American Society for Engineering Education (ASEE Zone 1), pp. 1-5, 3-5 April 2014
and answer the following questions:

Summarise the motivation of the authors (in one paragraph).

What are the 7 V's mentioned in the paper? Briefly describe each V in one paragraph.

Explore the authors' future work by using reference [4] in the research paper. Summarise your understanding of how Big Data can improve the healthcare sector in 300 words.

Exercise 3: Big Data Platform

In order to build a big data platform one has to acquire, organize and analyse the big data. Go through the following links and answer the questions that follow the links:
- https://www.infochimps.com/infochimps-cloud/how-it-works/
- https://www.youtube.com/watch?v=TfuhuA_uaho
- https://www.youtube.com/watch?v=IC6jVRO2Hq4
- https://www.youtube.com/watch?v=2yf_jrBhz5w
Please note: You are encouraged to watch all the videos in that series from Oracle.
How can enterprises acquire big data, and how can it be used?

How can big data be organized and handled?

What analyses can be done using big data?

Part B

Part B answers should be based on well-cited articles/videos - name the references used in your answer. For more information, read the guidelines given in Assignment 1.

Exercise 4: Big Data Products

Google is a master at creating data products. Below are a few examples from Google. Describe these products and explain how large-scale data is used effectively in them.
a. Google's PageRank

b. Google's Spell Checker

c. Google's Flu Trends

d. Google's Trends

Like Google, Facebook and LinkedIn also use large-scale data effectively. How?

Exercise 5: Big Data Tools

Briefly explain why a traditional relational database (RDBMS) cannot be used effectively to store big data.

What is a NoSQL database?

Name and briefly describe at least 5 NoSQL databases.

What is MapReduce and how does it work?

Briefly describe some notable MapReduce products (at least 5).

Amazon's S3 service lets users store large chunks of data online. List 5 features of Amazon's S3 service.

Getting concise, valuable information from a sea of data can be challenging. We need statistical analysis tools to deal with Big Data. Name and describe some (at least 3) statistical analysis tools.

Exercise 6: Big Data Application

Name 3 industries that should use Big Data - justify your claim in 250 words for each industry using proper references.
