Identify clusters and describe centroid and business meaning


Problem

The use of cluster analysis in predictive analytics can be described as follows:

Using existing data, we can identify clusters (groups) in the data. Each cluster may be described in data terms (using cluster centroids, etc.), and each cluster can be explained in terms of its business meaning.

Then, when new data arrives, it can be tested to identify which cluster is the closest, which suggests that the new data belongs to that cluster.
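
The "closest cluster" test can be sketched in Python as follows. This is a minimal illustration, assuming Euclidean distance as the measure of closeness; the function name closest_cluster and the centroid values are hypothetical and are not part of the assignment.

```python
import math

def closest_cluster(record, centroids):
    """Return the index of the centroid nearest to record (Euclidean distance)."""
    distances = [math.dist(record, c) for c in centroids]
    return distances.index(min(distances))

# Hypothetical centroids for two input variables (illustrative values only).
centroids = [(1.0, 1.0), (5.0, 5.0), (9.0, 1.0)]

# A new record that was not used to build the clusters.
new_record = (4.2, 4.8)
assigned = closest_cluster(new_record, centroids)
print(f"Record {new_record} is closest to cluster {assigned}")
```

If the input variables are on very different scales, it is usually sensible to standardize them before computing distances, otherwise the variable with the largest range dominates the result.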

Directions:

A. Pick any dataset relevant to your major that you would like to analyze. Avoid using a dataset that is the same as, or similar to, the one you use for your final project.

B. Randomly divide it into two chunks containing 80% and 20% of the records.

C. Select the input variables (at least two) that you will use for cluster analysis. Provide reasoning for the selection.

D. Use SPSS or another tool to apply an appropriate cluster analysis method to cluster the larger part of the dataset.

E. Identify clusters and describe their centroids and business meaning.

F. If the clusters are poorly identified by the analysis, or their business meaning is hard to describe, change your variable selection and return to step C.

G. For at least 5 records from the remaining smaller part of the dataset, identify the closest cluster centroid. That will be a prediction of which cluster those records belong to. Note that these records have not been used in cluster identification; therefore, this prediction qualifies as an example of predictive analytics.

H. Submit a Word report describing each step and the result of this process. Include relevant scripts and outputs produced by the tool you use.
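
Steps B, D, E and G can be sketched end to end in Python as shown here. This is only an illustrative outline under stated assumptions, not a required method: it uses a plain k-means (Lloyd's algorithm) written with the standard library instead of SPSS, the data is synthetic and made up for the example, and all function names (split_records, k_means, nearest, mean_point) are hypothetical.

```python
import math
import random

def split_records(records, train_fraction=0.8, seed=0):
    """Step B: shuffle a copy of the records and split it 80/20."""
    rng = random.Random(seed)
    shuffled = records[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

def mean_point(points):
    """Component-wise mean of a non-empty list of points."""
    return tuple(sum(col) / len(points) for col in zip(*points))

def nearest(point, centroids):
    """Index of the centroid closest to point (Euclidean distance)."""
    return min(range(len(centroids)),
               key=lambda i: math.dist(point, centroids[i]))

def k_means(points, k, iterations=50, seed=0):
    """Step D: plain k-means with randomly chosen initial centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # Keep the old centroid if a cluster ends up empty.
        updated = [mean_point(g) if g else centroids[i]
                   for i, g in enumerate(groups)]
        if updated == centroids:
            break  # assignments no longer change
        centroids = updated
    return centroids

# Synthetic stand-in data: two input variables, two made-up groups.
rng = random.Random(42)
records = [(rng.gauss(2, 0.5), rng.gauss(2, 0.5)) for _ in range(50)]
records += [(rng.gauss(8, 0.5), rng.gauss(8, 0.5)) for _ in range(50)]

train, holdout = split_records(records)      # Step B
centroids = k_means(train, k=2)              # Steps D and E
# Step G: predict the cluster of 5 records that were not used for clustering.
predictions = [(r, nearest(r, centroids)) for r in holdout[:5]]

for i, c in enumerate(centroids):
    print(f"Cluster {i} centroid: ({c[0]:.2f}, {c[1]:.2f})")
for record, label in predictions:
    print(f"Holdout record ({record[0]:.2f}, {record[1]:.2f}) -> cluster {label}")
```

With a real dataset, the number of clusters k and the choice of variables would come from steps C and F, and the printed centroids are what you would interpret in business terms for step E.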
