Compute the Gini index for the Customer ID attribute


Homework: Introduction To Data Mining

A. The following attributes are measured for members of a herd of Asian elephants: weight, height, tusk length, trunk length, and ear area. Based on these measurements, which measure of similarity or dissimilarity (from the 1st PDF) would you use to compare or group these elephants? Justify your answer and explain any special circumstances.
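As one way to explore Question A, the sketch below compares elephants with Euclidean distance after z-score standardization. This is only one candidate measure, not the expected answer. The point of standardizing is that the five attributes are on very different scales, so an unscaled distance would be dominated by the largest-valued attribute. The function names and the `herd` measurements are hypothetical and chosen purely for illustration.

```python
import math

def standardize(rows):
    """Z-score each column so attributes on different scales contribute comparably."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c))
            for c, m in zip(cols, means)]
    # A constant column carries no information, so map it to 0.0.
    return [[(v - m) / s if s > 0 else 0.0
             for v, m, s in zip(row, means, stds)]
            for row in rows]

def euclidean(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical, made-up measurements in the order:
# weight, height, tusk length, trunk length, ear area
herd = [
    [4000.0, 270.0, 150.0, 180.0, 5000.0],
    [3000.0, 240.0, 100.0, 160.0, 4200.0],
    [4100.0, 275.0, 155.0, 182.0, 5100.0],
]

z = standardize(herd)
d01 = euclidean(z[0], z[1])  # elephant 0 vs. elephant 1
d02 = euclidean(z[0], z[2])  # elephant 0 vs. elephant 2
print(d01, d02)
```

With these made-up numbers, elephants 0 and 2 end up closer to each other than elephants 0 and 1. If the attributes are strongly correlated (larger elephants tend to be larger on every measurement), a correlation-aware measure such as Mahalanobis distance may be worth discussing in the justification.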

B. Consider the training examples shown in the 2nd PDF for a binary classification problem.

a. Compute the Gini index for the overall collection of training examples.
b. Compute the Gini index for the Customer ID attribute.
c. Compute the Gini index for the Gender attribute.
d. Compute the Gini index for the Car Type attribute using multiway split.
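The Gini index of a node with class proportions p_i is 1 - sum(p_i^2), and the Gini index of a split is the average of the child-node values weighted by child size. The sketch below shows both calculations on a small made-up data set. It is not the table from the 2nd PDF, and the names `gini`, `gini_split`, and the toy columns are hypothetical.

```python
from collections import Counter, defaultdict

def gini(labels):
    """Gini index of one node: 1 - sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def gini_split(values, labels):
    """Weighted Gini index of a multiway split on an attribute."""
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        groups[value].append(label)
    n = len(labels)
    return sum(len(g) / n * gini(g) for g in groups.values())

# Hypothetical toy data, NOT the training examples from the 2nd PDF.
labels = ["C0", "C0", "C1", "C1"]
customer_id = [1, 2, 3, 4]          # every value is unique
gender = ["M", "M", "F", "M"]

overall = gini(labels)                    # two balanced classes
by_id = gini_split(customer_id, labels)   # one record per partition
by_gender = gini_split(gender, labels)
print(overall, by_id, by_gender)
```

When every Customer ID value is unique, each partition holds a single record and is therefore pure, so the split's Gini index is 0. That low value reflects the attribute's uniqueness and does not make it a useful predictor for new customers.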

C. Consider the data set shown in the 3rd PDF.

a. Estimate the conditional probabilities for P(A|+), P(B|+), P(C|+), P(A|-), P(B|-), and P(C|-).

b. Use the estimates of the conditional probabilities from the previous question to predict the class label for a test sample (A = 0, B = 1, C = 0) using the naive Bayes approach.

c. Estimate the conditional probabilities using the m-estimate approach, with p = 1/2 and m = 4.
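The three parts of Question C can be sketched together. The plain estimate of P(x|y) is n_c / n, where n is the number of records in class y and n_c is how many of them have attribute value x. The m-estimate replaces this with (n_c + m*p) / (n + m). Naive Bayes then multiplies the class prior by the conditional probabilities of each attribute value. The records below are made up for illustration and are not the data set from the 3rd PDF. The function names and dictionary layout are hypothetical.

```python
def cond_prob(records, attr, value, cls, m=0.0, p=0.0):
    """Estimate P(attr = value | class = cls).

    m = 0 gives the plain fraction n_c / n.
    m > 0 gives the m-estimate (n_c + m * p) / (n + m).
    Assumes cls occurs at least once in records when m = 0.
    """
    in_class = [r for r in records if r["class"] == cls]
    n = len(in_class)
    n_c = sum(1 for r in in_class if r[attr] == value)
    return (n_c + m * p) / (n + m)

def naive_bayes_predict(records, sample, m=0.0, p=0.0):
    """Return (predicted class, unnormalized score per class)."""
    scores = {}
    for cls in sorted({r["class"] for r in records}):
        # Class prior: fraction of records carrying this label.
        score = sum(1 for r in records if r["class"] == cls) / len(records)
        for attr, value in sample.items():
            score *= cond_prob(records, attr, value, cls, m, p)
        scores[cls] = score
    return max(scores, key=scores.get), scores

# Hypothetical toy records, NOT the data set from the 3rd PDF.
records = [
    {"A": 0, "B": 1, "C": 0, "class": "+"},
    {"A": 0, "B": 1, "C": 1, "class": "+"},
    {"A": 1, "B": 0, "C": 0, "class": "-"},
    {"A": 1, "B": 1, "C": 1, "class": "-"},
]
sample = {"A": 0, "B": 1, "C": 0}

label, scores = naive_bayes_predict(records, sample)
smoothed = cond_prob(records, "A", 0, "-", m=4, p=0.5)
print(label, scores, smoothed)
```

In this toy data the plain estimate of P(A = 0 | -) is 0, which forces the whole score for the negative class to 0. With p = 1/2 and m = 4, the m-estimate of the same probability is (0 + 2) / (2 + 4) = 1/3, so a single unseen attribute value no longer rules out a class.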

Format your homework according to the following formatting requirements:

(1) The answer should be typed, double spaced, using Times New Roman font (size 12), with one-inch margins on all sides.

(2) The response should also include a cover page containing the title of the homework, the student's name, the course title, and the date. The cover page is not included in the required page length.

(3) Also include a reference page. Citations and references should follow APA format. The reference page is not included in the required page length.

Attachment: Questions-Data.rar
