Draw the box-plots for age and %fat; interpret the distribution of the data


Problem 1:  This problem is an example of data preprocessing needed in a data mining process.  

Suppose that a hospital tested the age and body fat data for 18 randomly selected adults with the following results:

Age     23     23     27     27     39     41     47     49     50
%fat    9.5    26.5   7.8    17.8   31.4   25.9   27.4   27.2   31.2

Age     52     54     54     56     57     58     58     60     61
%fat    34.6   42.5   28.8   33.4   30.2   34.1   32.9   41.2   35.7

a. Draw the box-plots for age and %fat. Interpret the distribution of the data.

b. Normalize the two attributes based on z-score normalization.

c. Regardless of the original ranges of the variables, normalization techniques transform the data into new ranges so that variables can be compared and used on the same scale. What are the value ranges of the following normalization methods? Explain your answer.

i. Min-max normalization

ii. Z-score normalization

iii. Normalization by decimal scaling.

d. Draw a scatter-plot based on the two variables and interpret the relationship between the two variables.

e. Calculate the correlation coefficient. Are these two attributes positively or negatively correlated? Compute the covariance matrix.
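
As a starting point for parts a, b, c and e, the following is a minimal Python sketch that uses only the standard library. It is not a reference solution. The function names are illustrative, and two conventions are assumed rather than given by the problem: quartiles use linear interpolation (the "inclusive" method), and z-scores use the population standard deviation, while the covariance and correlation use the sample (n - 1) form. A different textbook convention will shift the numbers slightly.

```python
import statistics as st

# Data copied from the table in Problem 1.
age = [23, 23, 27, 27, 39, 41, 47, 49, 50, 52, 54, 54, 56, 57, 58, 58, 60, 61]
fat = [9.5, 26.5, 7.8, 17.8, 31.4, 25.9, 27.4, 27.2, 31.2,
       34.6, 42.5, 28.8, 33.4, 30.2, 34.1, 32.9, 41.2, 35.7]


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max), the values a box plot displays."""
    # Assumption: quartiles by linear interpolation ("inclusive" method).
    q1, med, q3 = st.quantiles(values, n=4, method="inclusive")
    return min(values), q1, med, q3, max(values)


def z_scores(values):
    """z = (v - mean) / std. Assumption: population standard deviation."""
    mu = st.fmean(values)
    sigma = st.pstdev(values)
    return [(v - mu) / sigma for v in values]


def min_max(values, new_min=0.0, new_max=1.0):
    """Min-max normalization: maps the data onto [new_min, new_max]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) * (new_max - new_min) + new_min for v in values]


def decimal_scaling(values):
    """Divide by 10**j, with j the smallest integer giving max(|v'|) < 1."""
    j = 0
    while max(abs(v) for v in values) / 10 ** j >= 1:
        j += 1
    return [v / 10 ** j for v in values]


def covariance(xs, ys):
    """Sample covariance (divides by n - 1)."""
    mx, my = st.fmean(xs), st.fmean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def pearson_r(xs, ys):
    """Pearson correlation coefficient: cov(x, y) / (s_x * s_y)."""
    return covariance(xs, ys) / (st.stdev(xs) * st.stdev(ys))


age_z = z_scores(age)
fat_z = z_scores(fat)
r = pearson_r(age, fat)
cov_matrix = [[covariance(age, age), covariance(age, fat)],
              [covariance(fat, age), covariance(fat, fat)]]

print("age summary :", five_number_summary(age))
print("%fat summary:", five_number_summary(fat))
print("correlation :", round(r, 4))
print("covariance matrix:", cov_matrix)
```

The sign of `r` answers whether the attributes are positively or negatively correlated. The ranges asked for in part c can be read off the definitions: min-max maps onto the chosen interval, decimal scaling keeps every value strictly between -1 and 1, and z-scores have no fixed bounds.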

Problem 2:  This problem is an example of data preprocessing needed in a data mining process.  

Suppose a group of 12 sales price records has been sorted as follows:

5, 10, 11, 13, 15, 35, 50, 55, 72, 92, 204, 215

Partition them into bins by each of the following methods, smooth the data, and interpret the results:

a. equal-depth partitioning with 3 values per bin

b. equal-width partitioning with 3 bins
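
The two partitioning schemes can be sketched in Python as below. This is an illustrative sketch, not a prescribed solution. The helper names are made up for this example, and two choices are assumptions: the maximum value is placed in the last equal-width bin, and ties in boundary smoothing go to the lower boundary. Smoothing by bin means and by bin boundaries are shown because the problem does not say which smoothing method to use.

```python
prices = [5, 10, 11, 13, 15, 35, 50, 55, 72, 92, 204, 215]  # already sorted


def equal_depth_bins(sorted_values, depth):
    """Split into consecutive bins holding `depth` values each."""
    return [sorted_values[i:i + depth]
            for i in range(0, len(sorted_values), depth)]


def equal_width_bins(sorted_values, k):
    """Split the range [min, max] into k intervals of equal width."""
    lo, hi = sorted_values[0], sorted_values[-1]
    width = (hi - lo) / k
    bins = [[] for _ in range(k)]
    for v in sorted_values:
        # Assumption: the maximum value belongs to the last bin.
        idx = min(int((v - lo) / width), k - 1)
        bins[idx].append(v)
    return bins


def smooth_by_means(bins):
    """Replace every value in a bin with the bin mean (rounded to 2 places)."""
    return [[round(sum(b) / len(b), 2)] * len(b) if b else [] for b in bins]


def smooth_by_boundaries(bins):
    """Replace every value with the closer of the bin's min and max."""
    smoothed = []
    for b in bins:
        if not b:
            smoothed.append([])
            continue
        lo, hi = min(b), max(b)
        # Assumption: ties go to the lower boundary.
        smoothed.append([lo if v - lo <= hi - v else hi for v in b])
    return smoothed


depth_bins = equal_depth_bins(prices, 3)
width_bins = equal_width_bins(prices, 3)

print("equal-depth:", depth_bins)
print("  by means     :", smooth_by_means(depth_bins))
print("  by boundaries:", smooth_by_boundaries(depth_bins))
print("equal-width:", width_bins)
print("  by means     :", smooth_by_means(width_bins))
print("  by boundaries:", smooth_by_boundaries(width_bins))
```

Comparing how many values fall into each bin under the two schemes is a useful starting point for the interpretation, since the two largest prices stretch the equal-width intervals.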

Problem 3 a) Figure 1 illustrates the plots for some data with respect to two variables: balance and employment status. If you have to select one of these two variables to classify the data into two classes (circle class and plus class), which one would you select? Is there any approach/criterion that you can use to support your selection? Explain your answer.

Figure 1: Data Plots for Problem 3.a.
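
One criterion often used for this kind of selection is information gain, the reduction in class entropy after splitting on a variable. The sketch below shows the calculation only. The class counts are hypothetical and are NOT read from Figure 1, and `split_a` and `split_b` are made-up names standing in for two candidate variables. Other criteria, such as the Gini index, would be computed along the same lines.

```python
from math import log2


def entropy(counts):
    """Shannon entropy (in bits) of a class-count list, e.g. [circles, pluses]."""
    total = sum(counts)
    return -sum((c / total) * log2(c / total) for c in counts if c > 0)


def information_gain(parent_counts, child_counts):
    """Entropy of the parent minus the size-weighted entropy of the children."""
    total = sum(parent_counts)
    weighted = sum(sum(child) / total * entropy(child) for child in child_counts)
    return entropy(parent_counts) - weighted


# HYPOTHETICAL counts for illustration only: [circle, plus] per group.
parent = [10, 10]
split_a = [[10, 0], [0, 10]]  # each group contains a single class
split_b = [[5, 5], [5, 5]]    # each group is as mixed as the parent

gain_a = information_gain(parent, split_a)
gain_b = information_gain(parent, split_b)

print("gain for split_a:", gain_a)
print("gain for split_b:", gain_b)
```

Under this criterion, the variable whose split yields the larger gain is the one to select, because it leaves the resulting groups purer with respect to the two classes.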

b) For the data in Figure 2, with three variables and two classes, which variable would you choose to classify the data? Show all the steps of your calculations and interpret your answer.

Figure 2: Data for Problem 3.b
