Out of the three pre-requisite data science skills database


Problem I (Answer each piece in 75-150 words with references, but do not quote)

What is data mining? In your answer, address the following:

- Is it another fad?

- Of the three prerequisite data science skills (database management, statistics, and machine learning), which one(s) are most important to master?

- Explain how the evolution of database technology led to data mining.

- Describe the steps involved in data mining when viewed as a process of knowledge discovery.

Problem II

Robust data loading poses a challenge in database systems because the input data are often dirty. In many cases, an input record may have several missing values and some records could be contaminated (i.e., with some data values out of range or of a different data type than expected).

Work out a step-by-step data cleaning and loading procedure so that the erroneous data will be marked and contaminated data will not be mistakenly inserted into the database during data loading.
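To make the terms in the problem concrete, here is a minimal Python sketch of one way a record could be checked for missing, wrongly typed, and out-of-range values before loading. It is an illustration only, not a complete cleaning procedure. The field names, types, and ranges in SCHEMA are hypothetical, and the type check is deliberately strict (an integer supplied for a float field is flagged).

```python
# Hypothetical schema (an assumption for illustration):
# field name -> (expected type, minimum, maximum).
SCHEMA = {
    "age": (int, 0, 120),
    "income": (float, 0.0, 1e9),
}

def check_record(record):
    """Return {field: problem} for every field that fails a check."""
    problems = {}
    for field, (expected_type, low, high) in SCHEMA.items():
        value = record.get(field)
        if value is None or value == "":
            problems[field] = "missing"
        # bool is a subclass of int in Python, so reject it explicitly.
        elif isinstance(value, bool) or not isinstance(value, expected_type):
            problems[field] = "wrong_type"
        elif not (low <= value <= high):
            problems[field] = "out_of_range"
    return problems

def partition(records):
    """Split records into clean ones and rejected ones marked with their problems."""
    clean, rejected = [], []
    for rec in records:
        problems = check_record(rec)
        if problems:
            rejected.append((rec, problems))
        else:
            clean.append(rec)
    return clean, rejected
```

In this sketch only the records in the clean list would be passed to the loader; the rejected list keeps each erroneous record together with the reason it was marked, so nothing is silently dropped.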

Problem III (Answer in 75-100 words with references, but do not quote)

Outline the major steps of decision tree classification.

Problem IV (Answer each piece in 75-100 words with references, but do not quote)

a. Compare the advantages and disadvantages of eager classification (e.g., decision tree, Bayesian, neural network) versus lazy classification (e.g., k-nearest neighbor, case-based reasoning).

b. Create a hypothetical example for one of the classifiers discussed in part a.

Problem V (Answer in 75-100 words with references, but do not quote)

Association rule mining often generates a large number of rules. Name at least one effective method that can be used to reduce the number of rules generated while still preserving most of the interesting rules.

Problem VI

You are a consultant working for the company "Data Mining R Us." Your client is a major luxury automobile manufacturer, Lexcedes. They have come up with a brand-new model called the "Chimera," and they want to target the car at young, filthy-rich individuals.

Besides having their own company databases, Lexcedes purchased a large collection of databases containing historic information about people, their attributes, and what they buy. They want to use data mining to help sell their new model.

Describe in detail a comprehensive step-by-step data mining procedure you would follow if you were given this task. Make sure that your answer reflects the situation stated above (in other words, do not give a generic answer). State your assumptions.
