What is the percentage of tracts that have a median value above $30,000?


Problem

In R, perform the following steps on the Boston housing dataset. To identify high-priced tracts, we would like to classify new tracts as either having a median value above $30,000 (CAT.MEDV = 1) or not (CAT.MEDV = 0), based on information such as crime rate, pollution, and number of rooms.
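As a minimal sketch of how the target could be encoded, the snippet below assumes the `Boston` data frame from the MASS package stands in for the course dataset (an assumption; the course's own file may already contain CAT.MEDV). In that data frame `medv` is recorded in thousands of dollars, so the $30,000 cutoff becomes `medv > 30`.

```r
# Assumption: MASS::Boston stands in for the course dataset; medv is in $1000s.
library(MASS)

boston <- Boston

# CAT.MEDV = 1 when the tract's median home value exceeds $30,000, otherwise 0.
boston$CAT.MEDV <- factor(ifelse(boston$medv > 30, 1, 0), levels = c(0, 1))

# Class counts and class percentages across all tracts.
table(boston$CAT.MEDV)
round(100 * prop.table(table(boston$CAT.MEDV)), 1)
```

The second table answers the overall-percentage question directly; the same two lines can be rerun on any subset of the data.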

i. Step I: Collecting data (Discuss the business problem and how the data can support the analysis.)

ii. Step II: Exploring and preparing data (Use all records in the dataset and all predictors. Convert categorical variables to factors as needed. Create a training partition (60%) and a testing partition (40%) using randomized partitioning. Set a seed so the results can be reproduced. What percentage of tracts have a median value above $30,000? What is this percentage in the training and testing partitions, respectively?)

iii. Step III: Training a model on the data (Build a classification tree model with C5.0(), fitting the model on the training data. What is the tree size? Plot the tree.)

iv. Step IV: Evaluating model performance (Evaluate the model against the test data with a confusion matrix. Report accuracy, sensitivity, and specificity.)

v. Step V: Improving model performance (Build a classification tree model with rpart(). What is the tree size? Plot the tree. Compare this model's performance with that of the original model and identify which one better serves the business goal of this case study.) All code should be written in R.
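Steps II through V could be sketched roughly as follows. This is one possible outline, not a reference solution, and it rests on several assumptions: the `Boston` data frame from MASS stands in for the course file, `medv` is dropped because CAT.MEDV is derived from it, the seed value is arbitrary, and `confusionMatrix()` from the caret package is used for the evaluation. The packages C50, rpart, and caret must be installed.

```r
library(MASS)    # Boston data (assumed stand-in for the course dataset)
library(C50)     # C5.0()
library(rpart)   # rpart()
library(caret)   # confusionMatrix()

# Step II: prepare the data.
boston <- Boston
boston$CAT.MEDV <- factor(ifelse(boston$medv > 30, 1, 0), levels = c(0, 1))
boston$chas <- factor(boston$chas)   # binary Charles River indicator
boston$medv <- NULL                  # assumption: drop medv, it defines the target

set.seed(1)                          # arbitrary seed, only for reproducibility
train_idx <- sample(nrow(boston), size = round(0.6 * nrow(boston)))
train <- boston[train_idx, ]
test  <- boston[-train_idx, ]

# Percentage of high-value tracts overall and in each partition.
pct_high <- function(d) 100 * mean(d$CAT.MEDV == 1)
round(c(all = pct_high(boston), train = pct_high(train), test = pct_high(test)), 1)

# Step III: C5.0 tree fitted on the training partition.
c50_fit <- C5.0(CAT.MEDV ~ ., data = train)
c50_fit$size                         # tree size (number of leaves)
plot(c50_fit)

# Step IV: confusion matrix on the test partition, class "1" as positive.
c50_pred <- predict(c50_fit, newdata = test)
c50_cm <- confusionMatrix(c50_pred, test$CAT.MEDV, positive = "1")
c50_cm

# Step V: rpart tree on the same partitions.
rpart_fit <- rpart(CAT.MEDV ~ ., data = train, method = "class")
sum(rpart_fit$frame$var == "<leaf>") # number of terminal nodes
plot(rpart_fit, margin = 0.1)
text(rpart_fit, use.n = TRUE)

rpart_pred <- predict(rpart_fit, newdata = test, type = "class")
rpart_cm <- confusionMatrix(rpart_pred, test$CAT.MEDV, positive = "1")

# Side-by-side comparison of the two models on the test partition.
perf <- function(cm) c(cm$overall["Accuracy"],
                       cm$byClass[c("Sensitivity", "Specificity")])
round(rbind(C5.0 = perf(c50_cm), rpart = perf(rpart_cm)), 3)
```

Because the business goal is to identify high-priced tracts, sensitivity for class 1 is probably the most relevant figure in the comparison table, although accuracy and specificity still matter if false alarms are costly. The exact numbers depend on the seed and on the dataset version used.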
